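There are two raw pointer types, which can be converted into each other:
- *mut T: a mutable raw pointer; the pointee can be read and written;
- *const T: an immutable raw pointer; the pointee can be read but not modified through it;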
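Raw pointers are mainly used for FFI (for example, pointer types in C function signatures must be declared and passed as raw pointers), for talking to hardware, and for performance-critical optimizations.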
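What makes raw pointers unsafe:
- They are allowed to ignore the borrowing rules: an immutable and a mutable pointer to the same memory address can exist at the same time, and so can several mutable pointers to the same address;
- They are not guaranteed to point to valid memory;
- They are allowed to be null;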
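Creating raw pointers from references:
- Creating a raw pointer is legal in safe code, but a raw pointer can only be dereferenced inside an unsafe block;
- When a reference is converted to a raw pointer with the as operator (an explicit cast; the implicit conversion at a typed binding is a type coercion), the object must still be live, and the same memory must not be accessed through the reference while the raw pointer is in use. In other words, accesses through the reference and through the raw pointer must not be interleaved;
- *const T and *mut T can be converted into each other;
- A raw pointer does not own the value it points to;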
let mut num = 5;
// Raw pointers to valid memory: an immutable and a mutable raw pointer can be created side by side
let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;
// A *const T can be converted to a *mut T
let r2 = r1 as *mut i32;
let r2 = r1.cast_mut();
// A raw pointer that may be invalid
let address = 0x012345usize;
let r = address as *const i32;
let my_num: i32 = 10;
// A reference is implicitly coerced to a raw pointer
let my_num_ptr: *const i32 = &my_num;
let mut my_speed: i32 = 88;
let my_speed_ptr: *mut i32 = &mut my_speed;
let mut x = 10;
// The as operator casts a borrow to a raw pointer
let ptr_x = &mut x as *mut i32;
let y = Box::new(20);
let ptr_y = &*y as *const i32;
unsafe {
*ptr_x += *ptr_y;
}
assert_eq!(x, 30);
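// This compiles, but writing through a pointer derived from a shared reference is undefined behavior
fn very_trustworthy(shared: &i32) {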
unsafe {
let mutable = shared as *const i32 as *mut i32;
*mutable = 20;
}
}
// A raw pointer can only be dereferenced inside an unsafe block
unsafe {
println!("r1 is: {}", *r1);
println!("r2 is: {}", *r2);
}
To get a raw pointer to a boxed value, dereference the Box and then borrow the result; this does not take ownership of the value. Box::into_raw(), by contrast, consumes the Box and returns a raw pointer:
let my_num: Box<i32> = Box::new(10);
let my_num_ptr: *const i32 = &*my_num; // &*my_num has type &i32, which coerces to *const i32
let mut my_speed: Box<i32> = Box::new(88);
let my_speed_ptr: *mut i32 = &mut *my_speed;
let my_speed: Box<i32> = Box::new(88);
let my_speed: *mut i32 = Box::into_raw(my_speed); // Ownership moves out of the Box into the raw pointer.
unsafe {
drop(Box::from_raw(my_speed));
}
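The macros std::ptr::addr_of!() and std::ptr::addr_of_mut!() create a raw pointer to a place expression without creating an intermediate reference:
packed struct: by default the fields of a struct are aligned by inserting padding. The packed attribute turns this field padding off, so a field of the struct may end up unaligned. A reference to an unaligned field cannot be created, but std::ptr::addr_of!() and std::ptr::addr_of_mut!() create an unaligned *const T and *mut T to it, respectively.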
#[derive(Debug, Default, Copy, Clone)]
#[repr(C, packed)]
struct S {
aligned: u8,
unaligned: u32,
}
let s = S::default();
let p = std::ptr::addr_of!(s.unaligned);
// Manage the memory behind a raw pointer with libc's malloc and free
#[allow(unused_extern_crates)]
extern crate libc;
use std::mem;
unsafe {
let my_num: *mut i32 = libc::malloc(mem::size_of::<i32>()) as *mut i32;
if my_num.is_null() {
panic!("failed to allocate memory");
}
libc::free(my_num as *mut libc::c_void);
}
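Other ways to create raw pointers:
- Many types provide as_ptr()/as_mut_ptr() methods that return a raw pointer to their contents;
- Owning pointer types such as Box/Rc/Arc provide into_raw()/from_raw() to turn the owner into a raw pointer and to rebuild the owner from a raw pointer;
- A raw pointer can also be created from an integer, but this is very unsafe;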
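The first two items in the list above can be sketched as follows. This is a minimal illustration, not a recommended pattern: the helper names double_in_place and rc_round_trip are made up for this example, while Vec::as_mut_ptr, Rc::into_raw, and Rc::from_raw are the standard library calls being demonstrated.

```rust
use std::rc::Rc;

/// Hypothetical helper: doubles every element in place through the
/// pointer returned by `Vec::as_mut_ptr`.
fn double_in_place(values: &mut Vec<i32>) {
    let len = values.len();
    let base: *mut i32 = values.as_mut_ptr();
    for i in 0..len {
        // SAFETY: `i < len`, so `base.add(i)` stays inside the vector's
        // initialized buffer, and `values` is not touched while `base` is used.
        unsafe { *base.add(i) *= 2 };
    }
}

/// Hypothetical helper: sends an `Rc<String>` through a raw pointer and back.
/// Returns the strong count after the round trip and a copy of the text.
fn rc_round_trip(text: &str) -> (usize, String) {
    let rc = Rc::new(String::from(text));
    // `into_raw` consumes the Rc without decrementing the strong count.
    let raw: *const String = Rc::into_raw(rc);
    // SAFETY: `raw` came from `Rc::into_raw` and is converted back exactly once.
    let rc = unsafe { Rc::from_raw(raw) };
    (Rc::strong_count(&rc), (*rc).clone())
}
```

Calling double_in_place on vec![1, 2, 3] leaves [2, 4, 6] in the vector, and rc_round_trip("hi") returns a strong count of 1 together with the original text, since into_raw and from_raw cancel each other out.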
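Storing an object through a dereferenced raw pointer, as in *ptr = data, calls drop on the old value that ptr points to. Writing the new object with the pointer's write() method or with std::ptr::write() does not drop the old value.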
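The difference between assignment and write() can be observed with a type that counts its own drops. The sketch below uses a made-up helper type Noisy and a made-up function drops_after_assign_and_write; only the pointer operations themselves come from the standard library.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical helper type: bumps a shared counter every time it is dropped.
struct Noisy(Rc<Cell<usize>>);

impl Drop for Noisy {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the number of drops seen right after `*ptr = new_value`
/// and right after `ptr.write(new_value)`.
fn drops_after_assign_and_write() -> (usize, usize) {
    let assign_drops = Rc::new(Cell::new(0));
    let mut slot = Noisy(assign_drops.clone());
    let ptr: *mut Noisy = &mut slot;
    // SAFETY: `ptr` points to a live, initialized `Noisy`.
    unsafe { *ptr = Noisy(assign_drops.clone()) };
    // Assignment dropped the old value.
    let after_assign = assign_drops.get();

    let write_drops = Rc::new(Cell::new(0));
    let mut slot2 = Noisy(write_drops.clone());
    let ptr2: *mut Noisy = &mut slot2;
    // SAFETY: `ptr2` is valid and aligned. The old value is overwritten
    // without being dropped, so it is leaked on purpose here.
    unsafe { ptr2.write(Noisy(write_drops.clone())) };
    let after_write = write_drops.get();

    (after_assign, after_write)
}
```

The function returns (1, 0): one drop after the assignment, none after write(). That is why write() is the right tool for uninitialized memory, and a leak risk when the destination already holds a value that owns resources.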
A raw pointer may be unaligned or null, but when it is dereferenced it must be non-null and aligned. Aligned means that the address held by the pointer, for a *const T, is a multiple of std::mem::align_of::<T>().
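The alignment rule can be checked by hand, and read_unaligned() offers a way around it when the address cannot be aligned. The following is a small sketch under the assumption that the data is a plain byte buffer; is_aligned_for and read_u32_at are names invented for this example.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: true if the address is a multiple of `align_of::<T>()`.
fn is_aligned_for<T>(ptr: *const T) -> bool {
    (ptr as usize) % align_of::<T>() == 0
}

/// Hypothetical helper: reads a native-endian `u32` starting at `offset`,
/// regardless of whether that address is aligned for `u32`.
fn read_u32_at(bytes: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(size_of::<u32>())?;
    if end > bytes.len() {
        return None;
    }
    // SAFETY: the four bytes at `offset..end` are inside `bytes`, and
    // `read_unaligned` places no alignment requirement on the pointer.
    let value = unsafe { bytes.as_ptr().add(offset).cast::<u32>().read_unaligned() };
    Some(value)
}
```

Dereferencing the same pointer with * instead of read_unaligned() would be undefined behavior whenever is_aligned_for returns false for it.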
Raw pointers can be null:
- Creating a null pointer:
  - std::ptr::null::<T>() creates a *const T;
  - std::ptr::null_mut::<T>() creates a *mut T;
- Checking for null: use is_null(), or as_ref()/as_mut(), which return None for a null pointer;
fn option_to_raw<T>(opt: Option<&T>) -> *const T {
match opt {
None => std::ptr::null(),
Some(r) => r as *const T
}
}
assert!(!option_to_raw(Some(&("pea", "pod"))).is_null());
assert_eq!(option_to_raw::<i32>(None), std::ptr::null());
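Some restrictions on raw pointers:
- They must be dereferenced explicitly: (*raw).field or (*raw).method(…);
- Raw pointers do not implement Deref;
- Comparison operators such as == or < compare the pointer addresses, not the pointees;
- Raw pointers do not implement Display, but they implement Debug and Pointer;
- Arithmetic operators are not supported on raw pointers, but library functions can do the arithmetic;
- Raw pointers implement neither Send nor Sync, so they cannot be moved or shared across threads or used in a spawned async task;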
let trucks = vec!["garbage truck", "dump truck", "moonstruck"];
let first: *const &str = &trucks[0];
let last: *const &str = &trucks[2];
assert_eq!(unsafe { last.offset_from(first) }, 2);
assert_eq!(unsafe { first.offset_from(last) }, -2);
// The as operator can cast a reference to a raw pointer (but not the other way round); some conversions need more than one cast
&vec![42_u8] as *const String; // error: invalid conversion
&vec![42_u8] as *const Vec<u8> as *const String; // permitted
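A few of the restrictions listed earlier (explicit dereference, address comparison, Pointer formatting) can be seen in a short sketch. The Point type and the three helper functions are hypothetical names used only for illustration.

```rust
/// Hypothetical example type.
struct Point {
    x: i32,
    y: i32,
}

/// Compares two pointers to equal values stored at different addresses.
/// Returns (same address?, same contents?).
fn address_vs_contents() -> (bool, bool) {
    let data = [7, 7];
    let first: *const i32 = &data[0];
    let second: *const i32 = &data[1];
    // `==` on raw pointers compares addresses only.
    let same_address = first == second;
    // SAFETY: both pointers refer to live elements of `data`.
    let same_contents = unsafe { *first == *second };
    (same_address, same_contents)
}

/// Field access needs an explicit dereference: `raw.x` does not compile.
fn sum_via_raw(point: &Point) -> i32 {
    let raw: *const Point = point;
    // SAFETY: `raw` was just derived from a live reference.
    unsafe { (*raw).x + (*raw).y }
}

/// Raw pointers implement `Pointer` (and `Debug`), but not `Display`.
fn format_pointer<T>(ptr: *const T) -> String {
    format!("{:p}", ptr)
}
```

address_vs_contents() returns (false, true): the two array elements hold equal values but live at different addresses.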
Rust arrays, slices, and vectors are contiguous blocks of memory in which every element occupies a fixed size (std::mem::size_of::<T>()):
fn offset<T>(ptr: *const T, count: isize) -> *const T where T: Sized
{
let bytes_per_element = std::mem::size_of::<T>() as isize;
let byte_offset = count * bytes_per_element;
// ptr as isize yields the numeric value of the pointer; do the arithmetic on it, then cast it back to *const T
(ptr as isize).checked_add(byte_offset).unwrap() as *const T
}
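The fixed element stride described above is what the standard pointer methods rely on. As a rough sketch (byte_distance and sum_with_add are names invented here), wrapping_add() and add() move by whole elements, so the byte distance between element 0 and element index is index * size_of::<T>():

```rust
/// Hypothetical helper: distance in bytes between element 0 and element
/// `index` of a slice, computed from the raw addresses.
fn byte_distance<T>(values: &[T], index: usize) -> usize {
    let base: *const T = values.as_ptr();
    // `wrapping_add` is safe to call and moves by `index` whole elements.
    let element = base.wrapping_add(index);
    element as usize - base as usize
}

/// Hypothetical helper: sums a slice by walking it with `add`.
fn sum_with_add(values: &[i32]) -> i32 {
    let base = values.as_ptr();
    let mut total = 0;
    for i in 0..values.len() {
        // SAFETY: `i < values.len()`, so the pointer stays inside the slice.
        total += unsafe { *base.add(i) };
    }
    total
}
```

For a slice of u32, byte_distance(values, 3) is 12 bytes, which matches the count * bytes_per_element computation in the hand-written offset function.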
1 Raw pointer methods
The raw pointer types *const T and *mut T are primitive types in Rust, and the standard library defines methods on them (the std::ptr module also provides free functions that operate on raw pointers):
- Check for null : pub fn is_null(self) -> bool
- Cast to a pointer to U : pub const fn cast<U>(self) -> *const U
- Pointer offset by count objects : pub const unsafe fn offset(self, count: isize) -> *const T
- Distance from another pointer, in objects : pub const unsafe fn offset_from(self, origin: *const T) -> isize
- Pointer advanced by count objects : pub const unsafe fn add(self, count: usize) -> *const T
- Pointer moved back by count objects : pub const unsafe fn sub(self, count: usize) -> *const T
- Read the pointee without moving it out of self : pub const unsafe fn read(self) -> T
- Copy count * size_of::<T>() bytes; src and dest may overlap : pub const unsafe fn copy_to(self, dest: *mut T, count: usize)
- Write a T without reading or dropping the old value : pub unsafe fn write(self, val: T)
- Replace the T, returning the old value without dropping it : pub unsafe fn replace(self, src: T) -> T
- Swap the values at two addresses : pub unsafe fn swap(self, with: *mut T)
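A few of the methods summarized above, used on ordinary slices and strings. This is only a sketch; swap_ends, take_and_reset, and read_copy are hypothetical helper names, and safe code would normally reach for slice::swap or std::mem::take instead.

```rust
/// Hypothetical helper: swaps the first and last element through raw pointers.
fn swap_ends(values: &mut [i32]) {
    let len = values.len();
    if len < 2 {
        return;
    }
    let first: *mut i32 = values.as_mut_ptr();
    // SAFETY: `len >= 2`, so both `first` and `first.add(len - 1)` are
    // in bounds and point to initialized elements.
    unsafe {
        let last = first.add(len - 1);
        first.swap(last);
    }
}

/// Hypothetical helper: `replace` stores a new value and hands back the
/// old one; neither value is dropped by the call itself.
fn take_and_reset(slot: &mut String) -> String {
    let ptr: *mut String = slot;
    // SAFETY: `ptr` comes from a live `&mut String`.
    unsafe { ptr.replace(String::new()) }
}

/// Hypothetical helper: `read` copies the value out and leaves the source untouched.
fn read_copy(value: &i32) -> i32 {
    let ptr: *const i32 = value;
    // SAFETY: `ptr` comes from a live reference, and `i32` is `Copy`.
    unsafe { ptr.read() }
}
```

After swap_ends, a slice [1, 2, 3] becomes [3, 2, 1]. take_and_reset returns the previous string and leaves an empty one behind.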
Methods implemented by *const T:
impl<T> *const T where T: ?Sized
pub fn is_null(self) -> bool
let s: &str = "Follow the rabbit";
let ptr: *const u8 = s.as_ptr();
assert!(!ptr.is_null());
pub const fn cast<U>(self) -> *const U // Casts *const T to *const U
pub const fn cast_mut(self) -> *mut T // Casts *const T to *mut T
pub fn with_metadata_of<U>(self, meta: *const U) -> *const U where U: ?Sized
pub fn addr(self) -> usize // Returns the address held by the *const T, which can be used for arithmetic
pub fn expose_addr(self) -> usize // Renamed to expose_provenance in newer Rust releases
pub fn with_addr(self, addr: usize) -> *const T // Creates a raw pointer with the given addr
pub fn map_addr(self, f: impl FnOnce(usize) -> usize) -> *const T // Maps the pointer's own address to a new address with f
pub fn to_raw_parts(self) -> (*const (), <T as Pointee>::Metadata) // Returns the data pointer and the Metadata
pub unsafe fn as_ref<'a>(self) -> Option<&'a T> // Returns None if the pointer is null
pub unsafe fn as_uninit_ref<'a>(self) -> Option<&'a MaybeUninit<T>>
let ptr: *const u8 = &10u8 as *const u8;
unsafe {
if let Some(val_back) = ptr.as_ref() {
println!("We got back the value: {val_back}!");
}
}
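// Computes the address count objects away. Both the starting pointer and the resulting pointer
// (count * size_of::<T>() bytes away) must lie within the same allocated object; otherwise it is UB.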
pub const unsafe fn offset(self, count: isize) -> *const T
let s: &str = "123";
let ptr: *const u8 = s.as_ptr();
unsafe {
println!("{}", *ptr.offset(1) as char);
println!("{}", *ptr.offset(2) as char);
}
// Like offset(), but count is in bytes
pub const unsafe fn byte_offset(self, count: isize) -> *const T
// Unlike offset, wrapping_offset does not require the start and end addresses to lie within the allocated object.
pub const fn wrapping_offset(self, count: isize) -> *const T
pub const fn wrapping_byte_offset(self, count: isize) -> *const T
let data = [1u8, 2, 3, 4, 5];
let mut ptr: *const u8 = data.as_ptr();
let step = 2;
let end_rounded_up = ptr.wrapping_offset(6);
// Prints "1, 3, 5, "
while ptr != end_rounded_up {
unsafe {
print!("{}, ", *ptr);
}
ptr = ptr.wrapping_offset(step);
}
pub fn mask(self, mask: usize) -> *const T
pub const unsafe fn offset_from(self, origin: *const T) -> isize // Returns the number of elements between self and origin
pub const unsafe fn byte_offset_from<U>(self, origin: *const U) -> isize where U: ?Sized, // Returns the distance in bytes
let a = [0; 5];
let ptr1: *const i32 = &a[1];
let ptr2: *const i32 = &a[3];
unsafe {
assert_eq!(ptr2.offset_from(ptr1), 2);
assert_eq!(ptr1.offset_from(ptr2), -2);
assert_eq!(ptr1.offset(2), ptr2);
assert_eq!(ptr2.offset(-2), ptr1);
}
// How add/sub/sub_ptr() relate:
// ptr.sub_ptr(origin) == count
// origin.add(count) == ptr
// ptr.sub(count) == origin
pub unsafe fn sub_ptr(self, origin: *const T) -> usize // Returns the number of elements between the two pointers
pub fn guaranteed_eq(self, other: *const T) -> Option<bool>
pub fn guaranteed_ne(self, other: *const T) -> Option<bool>
// Returns the address count elements further on
pub const unsafe fn add(self, count: usize) -> *const T
pub const unsafe fn byte_add(self, count: usize) -> *const T
let s: &str = "123";
let ptr: *const u8 = s.as_ptr();
unsafe {
println!("{}", *ptr.add(1) as char);
println!("{}", *ptr.add(2) as char);
}
// Calculates the offset from a pointer (convenience for .offset((count as
// isize).wrapping_neg())). count is in units of T; e.g., a count of 3 represents a pointer offset
// of 3 * size_of::<T>() bytes.
pub const unsafe fn sub(self, count: usize) -> *const T
pub const unsafe fn byte_sub(self, count: usize) -> *const T
pub const fn wrapping_add(self, count: usize) -> *const T
pub const fn wrapping_byte_add(self, count: usize) -> *const T
pub const fn wrapping_sub(self, count: usize) -> *const T
pub const fn wrapping_byte_sub(self, count: usize) -> *const T
// Reads the value from self without moving it. This leaves the memory in self unchanged. See
// ptr::read for safety concerns and examples.
pub const unsafe fn read(self) -> T
pub unsafe fn read_volatile(self) -> T
pub const unsafe fn read_unaligned(self) -> T
// Copies count * size_of<T> bytes from self to dest. The source and destination may
// overlap. NOTE: this has the same argument order as ptr::copy. See ptr::copy for safety concerns
// and examples.
pub const unsafe fn copy_to(self, dest: *mut T, count: usize)
pub const unsafe fn copy_to_nonoverlapping(self, dest: *mut T, count: usize)
pub fn align_offset(self, align: usize) -> usize
pub fn is_aligned(self) -> bool
pub fn is_aligned_to(self, align: usize) -> bool
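The two copy methods in the listing differ in whether the ranges may overlap. A minimal sketch, with shift_right_by_one and clone_bytes as made-up helper names:

```rust
/// Hypothetical helper: shifts the contents one slot to the right inside
/// the same buffer. Source and destination overlap, which `copy_to`
/// permits (it behaves like memmove).
fn shift_right_by_one(values: &mut [u8]) {
    let len = values.len();
    if len < 2 {
        return;
    }
    let base: *mut u8 = values.as_mut_ptr();
    // SAFETY: the source range [0, len - 1) and the destination range
    // [1, len) both lie inside `values`.
    unsafe { (base as *const u8).copy_to(base.add(1), len - 1) };
}

/// Hypothetical helper: copies a slice into a fresh vector. The two
/// buffers are separate allocations, so the non-overlapping variant applies.
fn clone_bytes(src: &[u8]) -> Vec<u8> {
    let mut dst = vec![0u8; src.len()];
    // SAFETY: both buffers hold `src.len()` bytes and do not overlap.
    unsafe { src.as_ptr().copy_to_nonoverlapping(dst.as_mut_ptr(), src.len()) };
    dst
}
```

shift_right_by_one turns [1, 2, 3, 4] into [1, 1, 2, 3]. Using copy_to_nonoverlapping for that overlapping case would be undefined behavior.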
// *mut T builds on the methods of *const T and adds write-related methods
impl<T> *mut T where T: ?Sized,
// ...
pub unsafe fn as_mut<'a>(self) -> Option<&'a mut T>
pub unsafe fn as_uninit_mut<'a>(self) -> Option<&'a mut MaybeUninit<T>>
// ...
// Executes the destructor (if any) of the pointed-to value. See ptr::drop_in_place for safety
// concerns and examples.
pub unsafe fn drop_in_place(self)
// Overwrites a memory location with the given value without reading or dropping the old value. See
// ptr::write for safety concerns and examples.
pub unsafe fn write(self, val: T)
pub unsafe fn write_bytes(self, val: u8, count: usize)
pub unsafe fn write_volatile(self, val: T)
pub unsafe fn write_unaligned(self, val: T)
pub unsafe fn replace(self, src: T) -> T
pub unsafe fn swap(self, with: *mut T)
pub fn align_offset(self, align: usize) -> usize
pub fn is_aligned(self) -> bool
pub fn is_aligned_to(self, align: usize) -> bool
Methods implemented by *mut T (including the counterparts of the *const T methods):
impl<T> *mut T where T: ?Sized,
pub fn is_null(self) -> bool
pub const fn cast<U>(self) -> *mut U
pub const fn cast_const(self) -> *const T
pub unsafe fn as_ref<'a>(self) -> Option<&'a T>
pub unsafe fn as_mut<'a>(self) -> Option<&'a mut T>
// Reads the value from self without moving it. This leaves the memory in self unchanged.
pub const unsafe fn read(self) -> T
// Copies count * size_of<T> bytes from self to dest. The source and destination may overlap.
pub unsafe fn copy_to(self, dest: *mut T, count: usize)
// Copies count * size_of<T> bytes from src to self. The source and destination may overlap.
pub unsafe fn copy_from(self, src: *const T, count: usize)
// Executes the destructor (if any) of the pointed-to value.
pub unsafe fn drop_in_place(self)
// Overwrites a memory location with the given value without reading or dropping the old value.
pub unsafe fn write(self, val: T)
// Replaces the value at self with src, returning the old value, without dropping either.
pub unsafe fn replace(self, src: T) -> T
// Swaps the values at two mutable locations of the same type, without deinitializing either. They
// may overlap, unlike mem::swap which is otherwise equivalent.
pub unsafe fn swap(self, with: *mut T)
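As an illustration of the write-side methods, the sketch below initializes uninitialized memory with write(), runs the destructor explicitly with drop_in_place(), and fills a buffer prefix with copy_from(). The function names init_then_drop and overwrite_prefix are hypothetical, and std::mem::MaybeUninit is used here only as a convenient source of uninitialized storage.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: initializes a `String` in place, measures it, and
/// destroys it again, all through a `*mut String`.
fn init_then_drop() -> usize {
    let mut slot = MaybeUninit::<String>::uninit();
    let ptr: *mut String = slot.as_mut_ptr();
    // SAFETY: `ptr` is valid and aligned. `write` neither reads nor drops
    // the uninitialized old contents. After the write the slot holds a
    // live `String`, which is dropped exactly once via `drop_in_place`.
    unsafe {
        ptr.write(String::from("raw"));
        let text: &String = &*ptr;
        let len = text.len();
        // `MaybeUninit` never drops its contents, so do it by hand.
        ptr.drop_in_place();
        len
    }
}

/// Hypothetical helper: overwrites the start of `dst` with `src`.
/// Returns false if `src` does not fit.
fn overwrite_prefix(dst: &mut [u8], src: &[u8]) -> bool {
    if src.len() > dst.len() {
        return false;
    }
    // SAFETY: `src.len()` bytes are readable from `src` and writable in `dst`.
    unsafe { dst.as_mut_ptr().copy_from(src.as_ptr(), src.len()) };
    true
}
```

init_then_drop() returns 3, the length of the string that lived in the slot between write() and drop_in_place().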