Rust: A Personal Reference Manual

1 comment
#

https://doc.rust-lang.org/reference/comments.html

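Regular comments (not included in cargo doc output):

  1. // : line comment, runs to the end of the line;
  2. /* */ : block comment.

Doc comments (rendered by cargo doc):

  1. INNER LINE DOC: //!
  2. INNER BLOCK DOC: /*! */
  3. OUTER LINE DOC: ///
  4. OUTER BLOCK DOC: /** */

INNER doc comments document the enclosing item (the module or crate they appear in), while OUTER doc comments document the item that immediately follows them.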

//! A doc comment that applies to the implicit anonymous module of this crate

pub mod outer_module {

    //!  - Inner line doc
    //!! - Still an inner line doc (but with a bang at the beginning)

    /*!  - Inner block doc */
    /*!! - Still an inner block doc (but with a bang at the beginning) */

    //   - Only a comment
    ///  - Outer line doc (exactly 3 slashes)
    //// - Only a comment

    /*   - Only a comment */
    /**  - Outer block doc (exactly) 2 asterisks */
    /*** - Only a comment */

    pub mod inner_module {}

    pub mod nested_comments {
        /* In Rust /* we can /* nest comments */ */ */

        // All three types of block comments can contain or be nested inside any other type:

        /*   /* */  /** */  /*! */  */
        /*!  /* */  /** */  /*! */  */
        /**  /* */  /** */  /*! */  */
        pub mod dummy_item {}
    }

    pub mod degenerate_cases {
        // empty inner line doc
        //!

        // empty inner block doc
        /*!*/

        // empty line comment
        //

        // empty outer line doc
        ///

        // empty block comment
        /**/

        pub mod dummy_item {}

        // empty 2-asterisk block isn't a doc block, it is a block comment
        /***/

    }

    /* An outer doc comment requires an item that will receive the doc;
       without one it is a compile error. Here it documents `test` below. */

    /// Where is my item?
    #[warn(dead_code)]
    pub fn test(){}
}

2 type
#

  1. Primitive types:

    • Boolean — bool
    • Numeric — integer and float
    • Textual — char and str
    • Never — ! — a type with no values
  2. Sequence types:

    • Tuple
    • Array
    • Slice
  3. User-defined types:

    • Struct
    • Enum
    • Union
  4. Function types:

    • Functions
    • Closures
  5. Pointer types:

    • References
    • Raw pointers
    • Function pointers
  6. Trait types:

    • Trait objects
    • Impl trait

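A type alias is a plain substitution: it does not introduce a new type, so the alias can be used exactly like the original type. Its main benefit is readability.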

type Thunk = Box<dyn Fn() + Send + 'static>;
let f: Thunk = Box::new(|| println!("hi"));
fn takes_long_type(f: Thunk) {
    // --snip--
}
fn returns_long_type() -> Thunk {
    // --snip--
}

type Result<T> = std::result::Result<T, std::io::Error>;

type Meters = u32;
let x: u32 = 5;
let y: Meters = 5;
println!("x + y = {}", x + y);  // Meters is an alias of u32, so it supports every u32 operation.
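
The snippets above are fragments. As a minimal runnable sketch of the same idea, the following uses a hypothetical alias `WordCount` (not from the original notes) to show that an alias and its underlying type are interchangeable:

```rust
use std::collections::HashMap;

// Hypothetical alias, introduced only for this illustration.
type WordCount = HashMap<String, usize>;

// The alias can appear anywhere the full type could, including return types.
fn count_words(text: &str) -> WordCount {
    let mut counts = WordCount::new();
    for word in text.split_whitespace() {
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let counts = count_words("a b a");
    // A WordCount *is* a HashMap<String, usize>; no conversion is needed.
    let as_map: &HashMap<String, usize> = &counts;
    println!("{:?}", as_map.get("a"));
}
```

Because no new type is created, the alias gives no extra type safety; it only shortens the spelling.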

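Types whose values are stored inline (on the stack when bound to a local variable):

  1. Primitive values;
  2. array/struct/tuple/enum/union

Types that own heap-allocated data:

  1. Strings: String
  2. Collections: Vec/HashMap/HashSet
  3. Slices: a slice borrows its data from wherever the owner stores it, often a heap-backed Vec or String
  4. Smart pointers: Box/Rc/Arc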

3 variable
#

Rust declares variables with the let keyword. Bindings are immutable by default; add the mut keyword to make a binding mutable.

fn main() {
    let _immutable_binding = 1;
    let mut mutable_binding = 1;

    // Ok
    println!("Before mutation: {}", mutable_binding);
    mutable_binding += 1;
    println!("After mutation: {}", mutable_binding);

    // Error! Cannot assign a new value to an immutable variable
    _immutable_binding += 1;
}

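Rust is a statically and strongly typed language: every variable has a definite type. In most cases the type does not have to be written out, because the compiler infers it from the initializer and from how the variable is used later:

let var: type = expression;  // type given explicitly
let var = expression; // type inferred from the expression or from later uses of var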

fn main() {
    // Because of the annotation, the compiler knows that `elem` has type u8.
    let elem = 5u8;

    // Create an empty vector (a growable array).
    let mut vec = Vec::new();

    // At this point the compiler doesn't know the exact type of `vec`, it just knows that it's a
    // vector of something (`Vec<_>`).

    // Insert `elem` in the vector.
    vec.push(elem);

    // Aha! Now the compiler knows that `vec` is a vector of `u8`s (`Vec<u8>`)
    println!("{:?}", vec);
}

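A variable must be declared and initialized before it is used. Declaring first and initializing later is allowed (though not recommended):

fn main() {
    // Declare a variable binding without initializing it (even without `mut`, it may be assigned exactly once).
    let a_binding;

    {
        let x = 2;
        // Initialize the binding
        a_binding = x * x; // first initialization; the binding is usable from here on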
    }
    println!("a binding: {}", a_binding);

    let another_binding;
    // Error! Use of uninitialized binding
    println!("another binding: {}", another_binding);

    another_binding = 1;
    println!("another binding: {}", another_binding);
}

// Another example
let name; // declare first
if user.has_nickname() { // initialize in each branch of a non-trivial condition
    name = user.nickname();
} else {
    name = generate_unique_name();
    user.register(&name);
}

// A variable must be initialized before it is used:
let x: u32;
let y = x + 1;

// error[E0381]: used binding `x` isn't initialized
//  --> src/main.rs:3:9
//   |
// 2 | let x: u32;
//   |     - binding declared here but left uninitialized
// 3 | let y = x + 1;
//   |         ^ `x` used here but it isn't initialized
//   |
// help: consider assigning a value
//   |
// 2 | let x: u32 = 0;
//   |            +++

A Rust block is an expression and can produce a value, so it can be used to initialize a variable from multi-step logic:

let display_name = match post.author() {
    Some(author) => author.name(),
    None => {
        let network_info = post.get_network_metadata()?;
        let ip = network_info.client_address();
        ip.to_string()  // the final expression has no semicolon, so it is the value of the block
    }
};

let msg = {
    // let-declaration: semicolon is always required
    let dandelion_control = puffball.open();
    // expression + semicolon: method is called, return value dropped
    dandelion_control.release_all_seeds(launch_codes);
    // expression with no semicolon: method is called, return value stored in `msg`
    dandelion_control.get_status()
};

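However, when an if expression is used as a statement rather than as a value (in particular, when it has no else branch), its block must evaluate to ():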

// ok
let suggested_pet = if with_wings { Pet::Buzzard } else { Pet::Hyena };

// error
if preferences.changed() {
    page.compute_size()  // oops, missing semicolon
}
// error[E0308]: mismatched types
//    |
// 22 |         page.compute_size()  // oops, missing semicolon
//    |         ^^^^^^^^^^^^^^^^^^^- help: try adding a semicolon: `;`
//    |         |
//    |         expected (), found tuple
//    |
//    = note: expected unit type `()`
//                   found tuple `(u32, u32)`

By default the compiler warns about every variable that is never used. Prefix the name with _ to signal that a variable is intentionally unused:

fn main() {
    let an_integer = 1u32;
    let a_boolean = true;
    let unit = ();

    println!("A boolean: {:?}", a_boolean);
    println!("Meet the unit value: {:?}", unit);

    // The compiler warns about unused variable bindings; these warnings can
    // be silenced by prefixing the variable name with an underscore
    let _unused_variable = 3u32;
}

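A variable binding has a scope, which by default is the enclosing block:

  • A variable can be shadowed. Shadowing does not drop the earlier value, and the new binding may have a different mutability and a different type.
  • When an immutable binding shadows a mutable binding of the same name, the value is frozen: it cannot be modified while the immutable binding is in scope.
fn main() {
    let x = 5;
    let x = x + 1; // shadows the first x
    {
        let x = x * 2; // the third x shadows the second x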
        println!("The value of x in the inner scope is: {}", x);
    }
    println!("The value of x is: {}", x);
}

fn main() {
    let shadowed_binding = 1;

    {
        println!("before being shadowed: {}", shadowed_binding);
        // This binding *shadows* the outer one
        let shadowed_binding = "abc"; // shadows the earlier binding of the same name (whether or not it is in the same block); the type may differ
        println!("shadowed in inner block: {}", shadowed_binding);
    }
    println!("outside inner block: {}", shadowed_binding);

    // This binding *shadows* the previous binding
    let shadowed_binding = 2;
    println!("shadowed in outer block: {}", shadowed_binding);
}

// freezing
fn main() {
    let mut _mutable_integer = 7i32;
    {
        // Shadowing by immutable `_mutable_integer`
        let _mutable_integer = _mutable_integer;
        // Error! `_mutable_integer` is frozen in this scope
        _mutable_integer = 50;
        // `_mutable_integer` goes out of scope
    }
    // Ok! `_mutable_integer` is not frozen in this scope
    _mutable_integer = 3;
}

// Shadowing with the same name saves one variable name
for line in file.lines() {
    let line = line?;
    // ...
}
// The same loop without shadowing
for line_result in file.lines() {
    let line = line_result?;
    // ...
}

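Rust performs no implicit conversion between primitive types, but the as keyword performs an explicit conversion (cast). For integer types the rules are similar to C. as can be used for the following conversions:

  1. between primitive types;
  2. to trait object types;
  3. between raw pointer types;
  4. any conversion that is also allowed as a type coercion.
let circle = Box::new(circle) as Box<dyn Circle>; // OK: the concrete type coerces (unsized coercion) to dyn Circle
let nonsense = circle.radius() * circle.area();

let a = ptr as *const [u8]; // ptr stands for some value of type *const [u16]; raw slice pointers can be cast to one another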

#![allow(overflowing_literals)]

fn main() {
    let decimal = 65.4321_f32;

    // Error! No implicit conversion
    let integer: u8 = decimal;

    // Explicit conversion
    let integer = decimal as u8;
    let character = integer as char;

    // Error! There are limitations in conversion rules.  A float cannot be directly converted to a
    // char.
    let character = decimal as char;

    println!("Casting: {} -> {} -> {}", decimal, integer, character);

    // when casting any value to an unsigned type, T,
    // T::MAX + 1 is added or subtracted until the value
    // fits into the new type

    // 1000 already fits in a u16
    println!("1000 as a u16 is: {}", 1000 as u16);

    // 1000 - 256 - 256 - 256 = 232
    // Under the hood, the first 8 least significant bits (LSB) are kept,
    // while the rest towards the most significant bit (MSB) get truncated.
    println!("1000 as a u8 is : {}", 1000 as u8);
    // -1 + 256 = 255
    println!("  -1 as a u8 is : {}", (-1i8) as u8);

    // For positive numbers, this is the same as the modulus
    println!("1000 mod 256 is : {}", 1000 % 256);

    // When casting to a signed type, the (bitwise) result is the same as
    // first casting to the corresponding unsigned type. If the most significant
    // bit of that value is 1, then the value is negative.

    // Unless it already fits, of course.
    println!(" 128 as a i16 is: {}", 128 as i16);

    // In boundary case 128 value in 8-bit two's complement representation is -128
    println!(" 128 as a i8 is : {}", 128 as i8);

    // repeating the example above
    // 1000 as u8 -> 232
    println!("1000 as a u8 is : {}", 1000 as u8);
    // and the value of 232 in 8-bit two's complement representation is -24
    println!(" 232 as a i8 is : {}", 232 as i8);

    // Since Rust 1.45, the `as` keyword performs a *saturating cast*
    // when casting from float to int. If the floating point value exceeds
    // the upper bound or is less than the lower bound, the returned value
    // will be equal to the bound crossed.

    // 300.0 as u8 is 255
    println!(" 300.0 as u8 is : {}", 300.0_f32 as u8);
    // -100.0 as u8 is 0
    println!("-100.0 as u8 is : {}", -100.0_f32 as u8);
    // nan as u8 is 0
    println!("   nan as u8 is : {}", f32::NAN as u8);

    // This behavior incurs a small runtime cost and can be avoided with unsafe methods, however the
    // results might overflow and return **unsound values**. Use these methods wisely:
    unsafe {
        // 300.0 as u8 is 44
        println!(" 300.0 as u8 is : {}", 300.0_f32.to_int_unchecked::<u8>());
        // -100.0 as u8 is 156
        println!("-100.0 as u8 is : {}", (-100.0_f32).to_int_unchecked::<u8>());
        // nan as u8 is 0
        println!("   nan as u8 is : {}", f32::NAN.to_int_unchecked::<u8>());
    }
}

For user-defined types such as struct and enum, conversions are expressed through the From/Into/TryFrom/TryInto/AsRef/AsMut traits.
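
As a rough sketch of what such trait-based conversions can look like, the example below implements From for an infallible conversion and TryFrom for a fallible one. The types `Celsius`, `Fahrenheit` and `Percent` are hypothetical and exist only for this illustration:

```rust
use std::convert::TryFrom;

// Hypothetical types, for illustration only.
#[derive(Debug, PartialEq)]
struct Celsius(f64);

#[derive(Debug, PartialEq)]
struct Fahrenheit(f64);

// Infallible conversion: implementing From also provides Into for free.
impl From<Celsius> for Fahrenheit {
    fn from(c: Celsius) -> Self {
        Fahrenheit(c.0 * 9.0 / 5.0 + 32.0)
    }
}

#[derive(Debug, PartialEq)]
struct Percent(u8);

// Fallible conversion: values outside 0..=100 are rejected.
impl TryFrom<i32> for Percent {
    type Error = String;

    fn try_from(v: i32) -> Result<Self, Self::Error> {
        if (0..=100).contains(&v) {
            Ok(Percent(v as u8))
        } else {
            Err(format!("{} is out of range 0..=100", v))
        }
    }
}

fn main() {
    let f: Fahrenheit = Celsius(100.0).into(); // Into comes from the From impl
    println!("{:?}", f);
    println!("{:?}", Percent::try_from(42));
    println!("{:?}", Percent::try_from(250));
}
```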

4 scalar
#

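The scalar types are:

signed integers
i8, i16, i32, i64, i128, isize (pointer size); the default is i32;
unsigned integers
u8, u16, u32, u64, u128, usize (pointer size)
floating point
f32, f64; the default is f64;
char
a Unicode scalar value such as 'a', 'α' and '∞' (always 4 bytes);
bool
true/false, 1 byte;
the unit type ()
its only value is the empty tuple ();
Never
!, a type with no values

When no type is specified, numeric literals default to i32 and f64. A literal can carry a type suffix, such as 23u8 or 12.3f64, and underscores may separate digits and the suffix, so 2_3_u8 is the same as 23u8. Integer literals can use the 0b/0o/0x prefixes (the prefix letter must be lowercase).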

fn main() {
    let remainder = 43.0 % 5.0; // floating-point remainder (truncating division), here 3.0

    let logical: bool = true;
    let a_float: f64 = 1.0;
    let an_integer   = 5i32;
    let default_float   = 3.0; // `f64`
    let default_integer = 7;   // `i32`

    // Type inference: i64 is inferred from the later assignment.
    let mut inferred_type = 12;
    inferred_type = 4294967296i64;

    let mut mutable = 12;
    mutable = 21;

    // Error! The type of a variable can't be changed.
    //mutable = true;

    // Variables can be overwritten with shadowing.
    let mutable = true;
}

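Integer overflow: in debug builds Rust checks arithmetic for overflow and panics. In release builds the check is off by default and the value wraps around (two's complement).

fn main() {
    let x: u8 = 255;
    // wrapping_add wraps explicitly and never panics
    let y: u8 = x.wrapping_add(1);
    println!("y: {}", y);
    // prints: y: 0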
}
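
Besides wrapping_add, the integer types offer other explicit overflow-handling families. The sketch below wraps three of them in small helper functions (the helper names are made up for this example) so the different results are easy to compare:

```rust
// Each helper wraps one family of explicit-overflow methods on u8.

// checked_*: returns None on overflow.
fn add_checked(a: u8, b: u8) -> Option<u8> {
    a.checked_add(b)
}

// saturating_*: clamps to the numeric bounds.
fn add_saturating(a: u8, b: u8) -> u8 {
    a.saturating_add(b)
}

// overflowing_*: returns the wrapped value plus an overflow flag.
fn add_overflowing(a: u8, b: u8) -> (u8, bool) {
    a.overflowing_add(b)
}

fn main() {
    println!("{:?}", add_checked(255, 1));
    println!("{:?}", add_saturating(255, 1));
    println!("{:?}", add_overflowing(255, 1));
}
```

These methods behave the same in debug and release builds, which makes the intended overflow policy explicit in the code.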

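Rust does not convert between primitive types implicitly; use an as expression to convert explicitly. as binds more tightly than the binary arithmetic operators (but less tightly than unary operators). When a float is cast to an integer, the fractional part is truncated (no rounding).

fn main() {
    let decimal = 97.123_f32;
    let integer: u8 = decimal as u8;
    // let c1: char = decimal as char; // Error! A float cannot be cast directly to char
    let c2 = integer as char; // 'a'

    let integer: u32 = 5;
    let float: f64 = 3.0;
    let int_to_float = integer as f64; // 5.0
    // Float to integer: the fractional part is truncated
    let float_to_int = float as u32; // 3
}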

More complex conversions use the From/Into/TryFrom/TryInto/AsRef/AsMut traits. from() is for conversions that cannot fail. try_from() returns a Result and yields an error when the conversion fails (for example, when the value does not fit in the target type). Note that the standard library provides no TryFrom between floats and integers; use as for those.

use std::convert::TryInto;

fn main() {
    let big: i32 = 300;

    // Checked conversion with `try_into`
    let integer: u8 = big.try_into().unwrap_or_default(); // falls back to the default 0 on error

    // Infallible conversion with `from`
    let integer_from = u16::from(42u8); // a u8 always fits in a u16
    let string_from = String::from("just for test");

    println!("Safe casting: {} -> {}", big, integer);
    println!("From casting: {}", integer_from);
}

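Unit type: () is both a type and the only value of that type. It is mainly used as a function's return type, indicating that the function returns no data:

fn main() {
    println!("{:p}, {:p}", &(), &()); // () is zero-sized; the two addresses usually print the same, but that is not guaranteed

    print_message();

    // Use the unit type and unit value explicitly
    let my_unit: () = ();

    // A function parameter can take a unit value
    take_unit(());

    // Generic types can use the unit type too, commonly in an Ok that carries no value.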
    let result: Result<(), &str> = Ok(());
    match result {
        Ok(_) => println!("Operation was successful."),
        Err(e) => println!("Error occurred: {}", e),
    }
}

fn print_message() {
    println!("Hello, world!");
    // This function implicitly returns the unit type `()`
}

fn take_unit(_unit: ()) {
    println!("This function takes a unit type.");
}

5 textual
#

Textual types: char/str/String/OsStr/OsString/CStr/CString

char is a Unicode scalar value stored in a fixed 4 bytes. A char can be cast with as to u32 (its code point) or to u8 (truncating); in the other direction only u8 can be cast to char with as. To build a char from a u32 code point, use std::char::from_u32(), which returns an Option<char>:

fn main() {
    let emoji: char = '😂';
    let chinese_character: char = '中';

    // Iterate over the chars of a &str
    let word = "Rust语言";
    for ch in word.chars() {
        println!("{}", ch);
    }

    // Convert a char to its Unicode code point
    let unicode_codepoint = '🦀' as u32;
    println!("The Unicode code point of '🦀' is: U+{:X}", unicode_codepoint);
    let character_from_codepoint = std::char::from_u32(unicode_codepoint).unwrap_or_default();
    println!("The character from code point U+{:X} is: '{}'", unicode_codepoint, character_from_codepoint);
}
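
To make the conversion rules concrete, here is a small sketch of which char conversions are infallible and which can fail. The helper names `code_point_to_char` and `utf8_len` are invented for this example:

```rust
// u32 -> char can fail, so it goes through char::from_u32.
fn code_point_to_char(cp: u32) -> Option<char> {
    char::from_u32(cp)
}

// Number of bytes a char occupies once encoded as UTF-8 inside a str.
fn utf8_len(c: char) -> usize {
    c.len_utf8()
}

fn main() {
    // u8 -> char is always valid, so `as` is allowed.
    assert_eq!(97u8 as char, 'a');

    // char -> u32 yields the scalar value; char -> u8 keeps only the low byte.
    let c = '中';
    assert_eq!(c as u32, 0x4E2D);
    assert_eq!(c as u8, 0x2D);

    // Surrogate code points are not valid chars.
    assert_eq!(code_point_to_char(0x4E2D), Some('中'));
    assert_eq!(code_point_to_char(0xD800), None);

    // A char value is always 4 bytes, but its UTF-8 encoding is 1 to 4 bytes.
    assert_eq!(std::mem::size_of::<char>(), 4);
    assert_eq!(utf8_len('a'), 1);
    assert_eq!(utf8_len('中'), 3);
    assert_eq!(utf8_len('😂'), 4);
}
```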

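str is a primitive type: a contiguous region of [u8] memory holding the UTF-8 encoding of a string. Its size is unknown at compile time, so it usually cannot be the type of a variable directly; it is used behind a reference (&str) or a smart pointer such as Box<str>:

  • &str is a fat pointer: the address of the memory region plus its length in bytes (not in characters).
use std::slice;
use std::str;

let story = "Once upon a time...";
let ptr = story.as_ptr(); // *const u8 raw pointer to the memory region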
let len = story.len();
assert_eq!(19, len);
// We can re-build a str out of ptr and len. This is all unsafe because we are responsible for
// making sure the two components are valid:
let s = unsafe {
    // First, we build a &[u8]...
    let slice = slice::from_raw_parts(ptr, len);
    // ... and then convert that slice into a string slice
    str::from_utf8(slice)
};
assert_eq!(s, Ok(story));


// Store a str behind smart pointers
use std::borrow::Cow;
use std::rc::Rc;

let boxed: Box<str> = Box::from("hello");
assert_eq!(Cow::from("eggplant"), Cow::Borrowed("eggplant"));
let shared: Rc<str> = Rc::from("statue");

// Create a Vec<u8> from a &str
assert_eq!(Vec::from("123"), vec![b'1', b'2', b'3']);

The type of a Rust string literal is &'static str:

fn main() {
    let hello_world = "Hello, world!";
    // is equivalent to
    let hello_world: &'static str = "Hello, world!";

    //let s: str = "hello, world"; // Error: str cannot be used directly as a variable type
    let s: &str = "hello, world"; // OK

    // Allocate the string on the heap; s owns it
    let s: Box<str> = "hello, world".into();
    greetings(&s); // Box<str> implements Deref<Target = str>, so &Box<str> coerces to &str

    // Using &'static str avoids a lifetime parameter on the struct
    struct Anime { name: &'static str, bechdel_pass: bool }
    let aria = Anime { name: "Aria: The Animation", bechdel_pass: true };

    // &str does not coerce to &[u8]; use as_bytes() to get a &[u8]
    let bytes = "bors".as_bytes();
    assert_eq!(b"bors", bytes);
}

fn greetings(s: &str) {
    println!("{}",s);
}

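Other literal forms:

  1. char is always a 4-byte Unicode scalar value;
  2. b'x': byte literal, the ASCII value of x as a u8, e.g. 104 == b'h';
  3. b"xyz": byte string literal of type &[u8; N], e.g. &[b'x', b'y', b'z'];
  4. r###"\a\b\c"###: raw string; escapes are not processed. The number of # characters after r is up to you, but must match on both sides;
  5. br##"\a\b\c\t\n"##: raw byte string of type &[u8; 10]; escapes are not processed. The prefix must be br, not rb;
  6. c"hello": C string literal, NUL-terminated;
  7. cr#"hello"#: raw C string literal, NUL-terminated, escapes not processed.

A byte string is a borrowed u8 array, &[u8; N], and can be used wherever a &[u8] is expected: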

let method = b"GET";
assert_eq!(method, &[b'G', b'E', b'T']);
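
The following sketch checks the types and lengths of a few of these literal forms. The helper `is_get` is a made-up function that takes a `&[u8]`, to show a `&[u8; N]` byte string being passed where a slice is expected:

```rust
// Accepts any byte slice; a &[u8; N] byte string literal coerces to &[u8].
fn is_get(method: &[u8]) -> bool {
    method == b"GET"
}

// A raw string keeps the backslash as an ordinary character.
fn raw_example() -> &'static str {
    r"a\nb"
}

fn main() {
    // Byte literal: a u8 holding the ASCII value.
    let h: u8 = b'h';
    assert_eq!(h, 104);

    // Byte string literal: a reference to a fixed-size byte array.
    let bytes: &[u8; 3] = b"xyz";
    assert_eq!(bytes, &[b'x', b'y', b'z']);
    assert!(is_get(b"GET"));

    // Raw string: 4 characters (a, \, n, b), not 3.
    assert_eq!(raw_example().len(), 4);

    // With # delimiters, a raw string may contain double quotes.
    let quoted = r#"say "hi""#;
    assert_eq!(quoted.len(), 8);

    // Raw byte string: the backslashes are kept as bytes.
    let raw_bytes: &[u8; 4] = br"\n\t";
    assert_eq!(raw_bytes, &[b'\\', b'n', b'\\', b't']);
}
```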

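A string literal may span multiple lines and contain escapes (such as \x23 or \u{211D}). If a line ends with \, the newline and the leading whitespace of the next line are removed. When formatted with a width, strings are left-aligned by default:

  • Escapes: \x7F (two hex digits, at most 7F in a str), \n, \r, \t, \\, \0, \', \", \u{0}, \u{00}, …, \u{000000}. There are no binary or octal escapes.
let s1 = String::from("hello,");
println!("#{:20.20}#", s1); // strings are left-aligned by default (numbers are right-aligned); prints: #hello,              #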

println!("{}", "a\t
      b  \
       c d
      ef
      ");

Indexing a String or &str with a range yields a str slice, so &s[i..j] is a &str. The range i..j must fall on valid character boundaries, otherwise the operation panics; get() is the non-panicking version:

  • s[i] is not allowed: String/&str are UTF-8 encoded, so a single byte (&u8) may be meaningless on its own.
let s = String::from("hello world");
let hello = &s[0..5]; // type &str
println!("{}", hello);

let s = "hello";
// println!("The first letter of s is {}", s[0]); // Error: s[0] is not supported

// Use as_bytes() to view a String/&str as &[u8], then index a single u8:
let s = "hello";
assert_eq!(s.as_bytes()[0], 104);
assert_eq!(s.as_bytes()[0], b'h');

let s = "💖💖💖💖💖";
assert_eq!(s.as_bytes()[0], 240);
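
One way to slice by characters without risking a panic is to look up the byte offset with char_indices() first. The helper `first_chars` below is a hypothetical function written for this sketch, not a standard library method:

```rust
// Returns the first `n` chars of `s` without ever splitting a code point.
fn first_chars(s: &str, n: usize) -> &str {
    match s.char_indices().nth(n) {
        // byte_idx is the byte offset where the (n+1)-th char starts
        Some((byte_idx, _)) => &s[..byte_idx],
        // the string has n chars or fewer
        None => s,
    }
}

fn main() {
    let s = "中文abc";
    assert_eq!(s.len(), 9); // 3 + 3 + 1 + 1 + 1 bytes

    // Byte offset 1 is inside the first char, so it is not a boundary.
    assert!(!s.is_char_boundary(1));
    assert_eq!(s.get(0..1), None); // get() returns None instead of panicking
    assert_eq!(s.get(0..3), Some("中"));

    assert_eq!(first_chars(s, 3), "中文a");
}
```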

Memory layout:

  1. String: a pointer to heap memory, the length (in bytes), and the capacity (in bytes).
  2. &str: a pointer to the slice's memory and the slice's length (in bytes).
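
The layout described above can be observed with size_of. This is only a sketch: the exact size of String is an implementation detail of the current standard library rather than a documented guarantee. The helper `byte_and_char_len` is invented for the example:

```rust
use std::mem::size_of;

// Returns (length in bytes, number of chars) for a string slice.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    let word = size_of::<usize>();

    // &str is a fat pointer: data pointer + byte length.
    assert_eq!(size_of::<&str>(), 2 * word);
    // String holds pointer, length and capacity (true for current std implementations).
    assert_eq!(size_of::<String>(), 3 * word);

    let mut s = String::with_capacity(16);
    s.push_str("h\u{e9}llo"); // U+00E9 takes 2 bytes in UTF-8
    assert_eq!(byte_and_char_len(&s), (6, 5));
    assert!(s.capacity() >= 16); // capacity counts bytes and may exceed len
}
```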

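str and String are always valid UTF-8. Operating-system file names and paths, however, are not required to be UTF-8, so Rust provides the std::ffi::OsStr/OsString types:

  • OsStr is an unsized type, normally used behind & or Box; it is the borrowed form, analogous to str;
  • OsString is a sized type, the owned form of OsStr; it can grow and be modified, analogous to String;
  • OsString implements Deref<Target = OsStr>, so every method defined on &OsStr is available on &OsString.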
use std::ffi::OsStr;
let os_str = OsStr::new("foo");

OsStr methods:

  • pub fn as_encoded_bytes(&self) -> &[u8]
  • pub fn into_os_string(self: Box<OsStr>) -> OsString
  • pub fn make_ascii_lowercase(&mut self)
  • pub fn to_os_string(&self) -> OsString
  • pub fn to_str(&self) -> Option<&str>
  • pub fn to_string_lossy(&self) -> Cow<'_, str>
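
A short sketch of how these methods are typically combined. It brings in std::path::Path (not mentioned above) because paths are the most common source of OsStr values; the helper `extension_utf8` is hypothetical:

```rust
use std::ffi::{OsStr, OsString};
use std::path::Path;

// Returns the file extension as an owned String if it is valid UTF-8.
fn extension_utf8(path: &Path) -> Option<String> {
    path.extension()             // Option<&OsStr>
        .and_then(OsStr::to_str) // None if the extension is not valid UTF-8
        .map(str::to_owned)
}

fn main() {
    let name = OsStr::new("report.txt");

    // to_str() succeeds only when the content is valid UTF-8.
    assert_eq!(name.to_str(), Some("report.txt"));
    // to_string_lossy() always succeeds, replacing invalid sequences with U+FFFD.
    assert_eq!(name.to_string_lossy(), "report.txt");

    // OsString is the owned, growable counterpart.
    let mut owned: OsString = name.to_os_string();
    owned.push(".bak");
    assert_eq!(owned, OsString::from("report.txt.bak"));

    assert_eq!(extension_utf8(Path::new("a/b/report.txt")), Some("txt".to_string()));
}
```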

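Neither OsStr nor OsString is NUL-terminated. The types std::ffi::CStr and std::ffi::CString represent C-style NUL-terminated strings.

CStr also has literal forms:

  • c"hello": NUL-terminated C string literal.
  • cr#"hello"#: NUL-terminated raw C string literal (escapes not processed).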
use std::ffi::CString;
use std::os::raw::c_char;

fn main() {
    let s = String::from("Hello, world!");
    let cs = CString::new(s).unwrap();

    let p = cs.as_ptr() as *const c_char;
    println!("Address: {:?}", p);
}
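
As a complementary sketch, the example below shows the owned/borrowed pair CString/CStr and the interior-NUL failure case. It deliberately avoids the c"..." literal so that it also compiles on toolchains that predate that syntax; `to_c_string` is a made-up helper:

```rust
use std::ffi::{CStr, CString};

// Builds a CString; fails if the input contains an interior NUL byte.
fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

fn main() {
    let owned = to_c_string("hello").expect("no interior NUL");
    // The terminating NUL is stored but not part of the logical content.
    assert_eq!(owned.as_bytes(), b"hello");
    assert_eq!(owned.as_bytes_with_nul(), b"hello\0");

    // An interior NUL would truncate the string on the C side, so it is rejected.
    assert!(to_c_string("he\0llo").is_none());

    // CStr is the borrowed form; here it is built from bytes that end in NUL.
    let borrowed: &CStr = CStr::from_bytes_with_nul(b"hello\0").expect("valid C string");
    assert_eq!(borrowed.to_str(), Ok("hello"));
    assert_eq!(borrowed, owned.as_c_str());
}
```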

5.1 str methods
#

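  1. len(): returns the length in bytes.
  2. is_empty()
  3. is_char_boundary()
  4. as_bytes() -> &[u8]
  5. as_ptr()/as_mut_ptr(): return raw pointers, *const u8 and *mut u8
  6. get()/get_mut(): return a substring without panicking;
  7. chars()/bytes(): return iterators over chars and bytes;
  8. split_whitespace(): returns an iterator over the substrings separated by whitespace; consecutive whitespace counts as one separator;
  9. lines(): returns an iterator over lines, without the line endings;
  10. contains()/starts_with()/ends_with(): test against a pattern; several pattern types are supported;
  11. find()/rfind(): return the byte index of the match;
  12. matches()/rmatches(): return an iterator over the substrings that match the pattern;
  13. trim_XX()/strip_XX(): remove whitespace, or a prefix/suffix;
  14. parse::<T>(): converts the string to type T; T must implement the FromStr trait;
  15. replace(): replaces every match of the pattern with the given string;
  16. to_string(): converts a &str to a String; into_string() converts a Box<str> to a String;
// Returns the length of the string in bytes (not in characters)
pub const fn len(&self) -> usize
let len = "foo".len();
assert_eq!(3, len); // length in bytes
assert_eq!("ƒoo".chars().count(), 3); // number of chars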

pub const fn is_empty(&self) -> bool
pub fn is_char_boundary(&self, index: usize) -> bool

// Finds the closest x not exceeding index where is_char_boundary(x) is true.
pub fn floor_char_boundary(&self, index: usize) -> usize
#![feature(round_char_boundary)]
let s = "❤️🧡💛💚💙💜";
assert_eq!(s.len(), 26);
assert!(!s.is_char_boundary(13));
let closest = s.floor_char_boundary(13);
assert_eq!(closest, 10);
assert_eq!(&s[..closest], "❤️🧡");

pub fn ceil_char_boundary(&self, index: usize) -> usize

// View as a byte slice
pub const fn as_bytes(&self) -> &[u8]
let bytes = "bors".as_bytes();
assert_eq!(b"bors", bytes);

pub unsafe fn as_bytes_mut(&mut self) -> &mut [u8]
let mut s = String::from("Hello");
let bytes = unsafe { s.as_bytes_mut() };
assert_eq!(b"Hello", bytes);

pub const fn as_ptr(&self) -> *const u8
pub fn as_mut_ptr(&mut self) -> *mut u8
let s = "Hello";
let ptr = s.as_ptr();

// Returns a substring (&str) without panicking; returns None if the range is out of bounds or not on a char boundary
pub fn get<I>(&self, i: I) -> Option<&<I as SliceIndex<str>>::Output> where I: SliceIndex<str>,
pub fn get_mut<I>( &mut self, i: I) -> Option<&mut <I as SliceIndex<str>>::Output> where I: SliceIndex<str>,
let v = String::from("🗻∈🌏");
assert_eq!(Some("🗻"), v.get(0..4));
// indices not on UTF-8 sequence boundaries
assert!(v.get(1..).is_none());
assert!(v.get(..8).is_none());
// out of bounds
assert!(v.get(..42).is_none());

// Returns a substring without any checks; the caller must guarantee that the index range is valid.
pub unsafe fn get_unchecked<I>(&self, i: I) -> &<I as SliceIndex<str>>::Output where I: SliceIndex<str>,
pub unsafe fn get_unchecked_mut<I>( &mut self, i: I ) -> &mut <I as SliceIndex<str>>::Output where I: SliceIndex<str>,

// Split the string at a byte index
pub fn split_at(&self, mid: usize) -> (&str, &str)
pub fn split_at_mut(&mut self, mid: usize) -> (&mut str, &mut str)
pub fn split_at_checked(&self, mid: usize) -> Option<(&str, &str)>
pub fn split_at_mut_checked( &mut self, mid: usize ) -> Option<(&mut str, &mut str)>

// Return iterators over the chars or bytes of the string
pub fn chars(&self) -> Chars<'_> 
pub fn char_indices(&self) -> CharIndices<'_>
pub fn bytes(&self) -> Bytes<'_>

// Returns an iterator over the substrings separated by whitespace
pub fn split_whitespace(&self) -> SplitWhitespace<'_>
pub fn split_ascii_whitespace(&self) -> SplitAsciiWhitespace<'_>
let mut iter = " Mary   had\ta\u{2009}little  \n\t lamb".split_whitespace();
assert_eq!(Some("Mary"), iter.next());
assert_eq!(Some("had"), iter.next());
assert_eq!(Some("a"), iter.next());
assert_eq!(Some("little"), iter.next()); // consecutive whitespace counts as one separator
assert_eq!(Some("lamb"), iter.next());
assert_eq!(None, iter.next());
assert_eq!("".split_whitespace().next(), None);
assert_eq!("   ".split_whitespace().next(), None);

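// Returns an iterator over lines; an empty line yields an empty string, and line endings are not included
pub fn lines(&self) -> Lines<'_>
pub fn lines_any(&self) -> LinesAny<'_> // deprecated; use lines()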
let text = "foo\nbar\n\r\nbaz";
let mut lines = text.lines();
assert_eq!(Some("foo"), lines.next());
assert_eq!(Some("bar"), lines.next());
assert_eq!(Some(""), lines.next());
assert_eq!(Some("baz"), lines.next());
assert_eq!(None, lines.next());

pub fn encode_utf16(&self) -> EncodeUtf16<'_> 

// Whether the string contains the pattern
pub fn contains<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>,

// Whether the string starts or ends with the pattern
pub fn starts_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>,
pub fn ends_with<'a, P>(&'a self, pat: P) -> bool where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,

// find returns the byte index of the first match, or None if there is none
pub fn find<'a, P>(&'a self, pat: P) -> Option<usize> where P: Pattern<'a>,
pub fn rfind<'a, P>(&'a self, pat: P) -> Option<usize> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,

// Split the string into an iterator of substrings (&str)
pub fn split<'a, P>(&'a self, pat: P) -> Split<'a, P> where P: Pattern<'a>,
pub fn split_inclusive<'a, P>(&'a self, pat: P) -> SplitInclusive<'a, P> where P: Pattern<'a>,
pub fn rsplit<'a, P>(&'a self, pat: P) -> RSplit<'a, P> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,
pub fn split_terminator<'a, P>(&'a self, pat: P) -> SplitTerminator<'a, P> where P: Pattern<'a>,
pub fn rsplit_terminator<'a, P>(&'a self, pat: P) -> RSplitTerminator<'a, P> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,
pub fn splitn<'a, P>(&'a self, n: usize, pat: P) -> SplitN<'a, P> where    P: Pattern<'a>,
pub fn rsplitn<'a, P>(&'a self, n: usize, pat: P) -> RSplitN<'a, P> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,
pub fn split_once<'a, P>(&'a self, delimiter: P) -> Option<(&'a str, &'a str)> where P: Pattern<'a>,
pub fn rsplit_once<'a, P>(&'a self, delimiter: P) -> Option<(&'a str, &'a str)> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,

let v: Vec<&str> = "Mary had a little lamb".split(' ').collect();
assert_eq!(v, ["Mary", "had", "a", "little", "lamb"]);
let v: Vec<&str> = "".split('X').collect();
assert_eq!(v, [""]);
let v: Vec<&str> = "lionXXtigerXleopard".split('X').collect();
assert_eq!(v, ["lion", "", "tiger", "leopard"]);

// Return an iterator over the substrings that match the pattern
pub fn matches<'a, P>(&'a self, pat: P) -> Matches<'a, P> where    P: Pattern<'a>,
pub fn rmatches<'a, P>(&'a self, pat: P) -> RMatches<'a, P> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,
pub fn match_indices<'a, P>(&'a self, pat: P) -> MatchIndices<'a, P> where P: Pattern<'a>,
pub fn rmatch_indices<'a, P>(&'a self, pat: P) -> RMatchIndices<'a, P> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,

let v: Vec<&str> = "abcXXXabcYYYabc".matches("abc").collect();
assert_eq!(v, ["abc", "abc", "abc"]);
let v: Vec<&str> = "1abc2abc3".matches(char::is_numeric).collect();
assert_eq!(v, ["1", "2", "3"]);

// Remove whitespace, or every repeated match of the pattern, from the start and/or end
pub fn trim(&self) -> &str
pub fn trim_start(&self) -> &str
pub fn trim_end(&self) -> &str
pub fn trim_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: DoubleEndedSearcher<'a>,
pub fn trim_start_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>,
pub fn trim_end_matches<'a, P>(&'a self, pat: P) -> &'a str where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,

// Remove a prefix or suffix at most once (unlike trim_start_matches, which removes it repeatedly)
pub fn strip_prefix<'a, P>(&'a self, prefix: P) -> Option<&'a str> where P: Pattern<'a>,
pub fn strip_suffix<'a, P>(&'a self, suffix: P) -> Option<&'a str> where P: Pattern<'a>, <P as Pattern<'a>>::Searcher: ReverseSearcher<'a>,

// Parse the string into another type F, which must implement the FromStr trait. Rust's primitive types all implement it.
pub fn parse<F>(&self) -> Result<F, <F as FromStr>::Err> where F: FromStr,

pub const fn is_ascii(&self) -> bool
pub const fn as_ascii(&self) -> Option<&[AsciiChar]>
pub fn eq_ignore_ascii_case(&self, other: &str) -> bool
pub fn make_ascii_uppercase(&mut self)
pub fn make_ascii_lowercase(&mut self)
pub const fn trim_ascii_start(&self) -> &str
pub const fn trim_ascii_end(&self) -> &str
pub const fn trim_ascii(&self) -> &str
pub fn escape_debug(&self) -> EscapeDebug<'_>
pub fn escape_default(&self) -> EscapeDefault<'_>
pub fn escape_unicode(&self) -> EscapeUnicode<'_>


impl str
pub fn into_boxed_bytes(self: Box<str>) -> Box<[u8]>
// Replace
pub fn replace<'a, P>(&'a self, from: P, to: &str) -> String where P: Pattern<'a>,
pub fn replacen<'a, P>(&'a self, pat: P, to: &str, count: usize) -> String where P: Pattern<'a>,
pub fn to_lowercase(&self) -> String
pub fn to_uppercase(&self) -> String
// Box<str> to String
pub fn into_string(self: Box<str>) -> String
pub fn repeat(&self, n: usize) -> String
pub fn to_ascii_uppercase(&self) -> String
pub fn to_ascii_lowercase(&self) -> String

Pattern parameter types accepted by the find/matches/trim methods:

Pattern type              Match condition
&str                      is substring
char                      is contained in string
&[char]                   any char in slice is contained in string
F: FnMut(char) -> bool    F returns true for a char in string
&&str                     is substring
&String                   is substring
// &str
assert_eq!("abaaa".find("ba"), Some(1));
assert_eq!("abaaa".find("bac"), None);

// char
assert_eq!("abaaa".find('a'), Some(0));
assert_eq!("abaaa".find('b'), Some(1));
assert_eq!("abaaa".find('c'), None);

// &[char; N]
assert_eq!("ab".find(&['b', 'a']), Some(0));
assert_eq!("abaaa".find(&['a', 'z']), Some(0));
assert_eq!("abaaa".find(&['c', 'd']), None);

// &[char]
assert_eq!("ab".find(&['b', 'a'][..]), Some(0));
assert_eq!("abaaa".find(&['a', 'z'][..]), Some(0));
assert_eq!("abaaa".find(&['c', 'd'][..]), None);

// FnMut(char) -> bool
assert_eq!("abcdef_z".find(|ch| ch > 'd' && ch < 'y'), Some(4));
assert_eq!("abcddd_z".find(|ch| ch > 'd' && ch < 'y'), None);

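FromStr trait: builds a value of some type from a &str. Rust's basic types, such as integers, floats, bool, char, String, PathBuf, IpAddr, SocketAddr, Ipv4Addr and Ipv6Addr, all implement it.

  • It is called implicitly by the generic method str::parse::<T>().
  • When calling parse(), the target type usually has to be specified; otherwise the compiler cannot tell which FromStr implementation to use and reports an error: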
pub trait FromStr: Sized {
    type Err;

    // Required method
    fn from_str(s: &str) -> Result<Self, Self::Err>;
}

pub fn parse<F>(&self) -> Result<F, <F as FromStr>::Err> where F: FromStr,

let four: u32 = "4".parse().unwrap();
assert_eq!(4, four);

let four = "4".parse::<u32>();
assert_eq!(Ok(4), four);

// Error
let nope = "j".parse::<u32>();
assert!(nope.is_err());
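
Implementing FromStr for your own type makes it usable with parse() as well. The following is a minimal sketch; the `Point` type and its "x,y" text format are assumptions made up for this example:

```rust
use std::str::FromStr;

// Hypothetical type parsed from text of the form "x,y".
#[derive(Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl FromStr for Point {
    // A plain String keeps the sketch short; real code often uses a dedicated error type.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let (x, y) = s
            .split_once(',')
            .ok_or_else(|| format!("missing ',' in {:?}", s))?;
        // Reuse the FromStr impl of i32 for each component.
        let x = x.trim().parse::<i32>().map_err(|e| e.to_string())?;
        let y = y.trim().parse::<i32>().map_err(|e| e.to_string())?;
        Ok(Point { x, y })
    }
}

fn main() {
    // parse() dispatches to Point::from_str through the type annotation.
    let p: Point = "1, 2".parse().unwrap();
    println!("{:?}", p);
    println!("{:?}", "1;2".parse::<Point>());
}
```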

5.2 String
#

Converting between &str and String:

  1. String -> &str: s.as_str() or &s;
  2. &str -> String: String::from("Sunface") or "Sunface".to_string()

Creating a String:

  1. &str.to_string()
  2. &str.to_owned()
  3. format!()
  4. .concat() and .join() on an array/slice/Vec
  5. String::from()/String::from_utf8()
let error_message = "too many pets".to_string();
assert_eq!(format!("{}°{:02}′{:02}′′N", 24, 5, 23), "24°05′23′′N".to_string());

let bits = vec!["veni", "vidi", "vici"];
assert_eq!(bits.concat(), "venividivici");
assert_eq!(bits.join(", "), "veni, vidi, vici");

let hello = String::from("Hello, world!");
let mut hello = String::from("Hello, ");
// push a char
hello.push('w');
// push a string slice
hello.push_str("orld!");

// some bytes, in a vector
let sparkle_heart = vec![240, 159, 146, 150];
// We know these bytes are valid, so we'll use `unwrap()`.
let sparkle_heart = String::from_utf8(sparkle_heart).unwrap();
assert_eq!("💖", sparkle_heart);

let s = "hello";
let third_character = s.chars().nth(2); // chars() yields char values (each a 4-byte Unicode scalar value)
assert_eq!(third_character, Some('l'));

let noodles = "noodles".to_string();
let oodles = &noodles[1..]; // slicing a String yields a &str

String implements Deref<Target = str>, so:

  1. a String can use every method defined on str;
  2. a &String can be passed wherever a &str is expected.

String supports + and += with a &str on the right-hand side. &str + &str and &str + String are not supported:

let mut ss = String::from("abcd");
ss += " def"; // OK: String += &str

// " def" + ss; // Error: `+` cannot be used to concatenate a `&str` with a `String`
" def".to_owned() + &ss;   // OK

let s1 = String::from("hello,");
let s2 = String::from("world!");
let s3 = s1 + &s2;   // s1 is moved here; write s1.clone() + &s2 to keep using s1
assert_eq!(s3, "hello,world!");
//println!("{}", s1); // Error: s1 was moved by the + above and can no longer be used
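
When neither operand should be moved, format! or push_str are common alternatives to +. A small sketch, with helper names invented for this example:

```rust
// format! only borrows its arguments, so both inputs stay usable afterwards.
fn full_name(first: &str, last: &str) -> String {
    format!("{} {}", first, last)
}

// push_str appends in place, reusing one growing buffer.
fn join_with_push(parts: &[&str]) -> String {
    let mut out = String::new();
    for p in parts {
        out.push_str(p);
    }
    out
}

fn main() {
    let first = String::from("Ada");
    let last = String::from("Lovelace");

    // &String coerces to &str at the call site.
    let name = full_name(&first, &last);
    assert_eq!(name, "Ada Lovelace");

    // Unlike `first + &last`, neither String was moved.
    assert_eq!(first.len() + last.len(), 11);

    assert_eq!(join_with_push(&["a", "b", "c"]), "abc");
}
```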

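A String is backed by a Vec<u8>, so the String value itself consists of three parts, which can be read with as_ptr()/len()/capacity():

  1. the address of a contiguous heap buffer;
  2. the length in bytes;
  3. the capacity in bytes.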
use std::mem;

let story = String::from("Once upon a time...");

// Prevent automatically dropping the String's data
let mut story = mem::ManuallyDrop::new(story);

let ptr = story.as_mut_ptr();
let len = story.len();
let capacity = story.capacity();
assert_eq!(19, len);

// We can re-build a String out of ptr, len, and capacity. This is all unsafe because we are
// responsible for making sure the components are valid:
let s = unsafe { String::from_raw_parts(ptr, len, capacity) } ;

assert_eq!(String::from("Once upon a time..."), s);

5.3 String methods
#

  1. new()/with_capacity()
  2. len()/capacity()/is_empty()
  3. from_utf8_XX()
  4. into_raw_parts()/from_raw_parts()
  5. into_bytes()/into_boxed_str()/as_bytes()/as_str()/as_mut_str()
  6. push()/push_str()
  7. reserve()/shrink_to()/truncate()/clear()
  8. pop()/remove()/retain()/insert()/insert_str()/drain()
impl String

// Empty String
pub const fn new() -> String
// Empty String with the given initial capacity
pub fn with_capacity(capacity: usize) -> String

// Create a String from a Vec<u8>
pub fn from_utf8(vec: Vec<u8>) -> Result<String, FromUtf8Error>
pub unsafe fn from_utf8_unchecked(bytes: Vec<u8>) -> String

// Create a String from a &[u8] or &[u16]
pub fn from_utf8_lossy(v: &[u8]) -> Cow<'_, str>
pub fn from_utf16(v: &[u16]) -> Result<String, FromUtf16Error>
pub fn from_utf16_lossy(v: &[u16]) -> String
pub fn from_utf16le(v: &[u8]) -> Result<String, FromUtf16Error>
pub fn from_utf16le_lossy(v: &[u8]) -> String
pub fn from_utf16be(v: &[u8]) -> Result<String, FromUtf16Error>
pub fn from_utf16be_lossy(v: &[u8]) -> String

// Raw pointer interop
pub fn into_raw_parts(self) -> (*mut u8, usize, usize)
pub unsafe fn from_raw_parts(buf: *mut u8, length: usize, capacity: usize ) -> String

// Convert to Vec<u8>, Box<str>, &[u8], &str
pub fn into_bytes(self) -> Vec<u8>
pub unsafe fn as_mut_vec(&mut self) -> &mut Vec<u8>
pub fn into_boxed_str(self) -> Box<str>
pub fn as_bytes(&self) -> &[u8]
pub fn as_str(&self) -> &str
pub fn as_mut_str(&mut self) -> &mut str

// Append, or insert at a given position, a char or &str
pub fn push(&mut self, ch: char)
pub fn push_str(&mut self, string: &str)
pub fn extend_from_within<R>(&mut self, src: R) where R: RangeBounds<usize>,
pub fn insert(&mut self, idx: usize, ch: char)
pub fn insert_str(&mut self, idx: usize, string: &str)

// Capacity and length
pub fn capacity(&self) -> usize
pub fn len(&self) -> usize
pub fn is_empty(&self) -> bool

// Adjust capacity (reserve/shrink) or length (truncate/clear)
pub fn reserve(&mut self, additional: usize)
pub fn reserve_exact(&mut self, additional: usize)
pub fn try_reserve(&mut self, additional: usize) -> Result<(), TryReserveError>
pub fn try_reserve_exact( &mut self,    additional: usize) -> Result<(), TryReserveError>
pub fn shrink_to_fit(&mut self)
pub fn shrink_to(&mut self, min_capacity: usize)
pub fn truncate(&mut self, new_len: usize)
pub fn clear(&mut self)

// Remove a char (pop/remove) or every match of a pattern (remove_matches)
pub fn pop(&mut self) -> Option<char>
pub fn remove(&mut self, idx: usize) -> char
pub fn remove_matches<P, 'a>(&'a mut self, pat: P) where P: for<'x> Pattern<'x>,

// Keep only the chars for which f returns true
pub fn retain<F>(&mut self, f: F) where F: FnMut(char) -> bool,
let mut s = String::from("f_o_ob_ar");
s.retain(|c| c != '_');
assert_eq!(s, "foobar");

pub fn split_off(&mut self, at: usize) -> String

// Remove the given range and return an iterator over the removed chars
pub fn drain<R>(&mut self, range: R) -> Drain<'_> where R: RangeBounds<usize>,
let mut s = String::from("α is alpha, β is beta");
let beta_offset = s.find('β').unwrap_or(s.len());
// Remove the range up until the β from the string
let t: String = s.drain(..beta_offset).collect();
assert_eq!(t, "α is alpha, ");
assert_eq!(s, "β is beta");
// A full range clears the string, like `clear()` does
s.drain(..);
assert_eq!(s, "");

pub fn replace_range<R>(&mut self, range: R, replace_with: &str) where R: RangeBounds<usize>,
pub fn leak<'a>(self) -> &'a mut str
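Several of the editing methods listed above take byte offsets rather than char counts, which is easy to get wrong. The following is a minimal sketch of insert/insert_str/replace_range/split_off/pop/remove/truncate; the function name edit is a hypothetical helper made up for this illustration.

```rust
// Hypothetical helper for illustration: applies a few in-place edits and
// returns the edited string together with the tail that was split off.
fn edit(mut s: String) -> (String, String) {
    s.insert(0, '>');               // ">hello world"
    s.insert_str(1, " ");           // "> hello world"
    s.replace_range(2..7, "HELLO"); // "> HELLO world"
    let tail = s.split_off(7);      // s = "> HELLO", tail = " world"
    (s, tail)
}

fn main() {
    let (head, tail) = edit(String::from("hello world"));
    assert_eq!(head, "> HELLO");
    assert_eq!(tail, " world");

    let mut s = String::from("héllo");
    assert_eq!(s.pop(), Some('o')); // removes and returns the last char
    assert_eq!(s.remove(1), 'é');   // the index is a byte offset on a char boundary
    assert_eq!(s, "hll");
    s.truncate(1);                  // keeps the first byte only
    assert_eq!(s, "h");
}
```

All index arguments must fall on a char boundary; an offset inside a multi-byte char such as 'é' makes these methods panic.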

When pushing or inserting, a String grows its capacity automatically:

let mut s = String::new();
println!("{}", s.capacity());
for _ in 0..5 {
    s.push_str("hello");
    println!("{}", s.capacity());
}

// If enough capacity is allocated up front, later pushes that fit do not need to grow it
let mut s = String::with_capacity(25);
println!("{}", s.capacity());
for _ in 0..5 {
    s.push_str("hello");
    println!("{}", s.capacity());
}

5.4 [u8] methods
#

The as_bytes() method of String and &str returns a &[u8].

b"xxx" has type &[u8; N] and is automatically coerced (unsized coercion) to &[u8]:

impl [u8]

// Check whether every element of the [u8] is ASCII
pub const fn is_ascii(&self) -> bool

pub const fn as_ascii(&self) -> Option<&[AsciiChar]>
pub const unsafe fn as_ascii_unchecked(&self) -> &[AsciiChar]
pub fn eq_ignore_ascii_case(&self, other: &[u8]) -> bool
pub fn make_ascii_uppercase(&mut self)
pub fn make_ascii_lowercase(&mut self)
pub fn escape_ascii(&self) -> EscapeAscii<'_>
let s = b"0\t\r\n'\"\\\x9d";
let escaped = s.escape_ascii().to_string();
assert_eq!(escaped, "0\\t\\r\\n\\'\\\"\\\\\\x9d");

pub const fn trim_ascii_start(&self) -> &[u8]
#![feature(byte_slice_trim_ascii)]
assert_eq!(b" \t hello world\n".trim_ascii_start(), b"hello world\n");
assert_eq!(b"  ".trim_ascii_start(), b"");
assert_eq!(b"".trim_ascii_start(), b"");

pub const fn trim_ascii_end(&self) -> &[u8]
pub const fn trim_ascii(&self) -> &[u8]

as_ascii() on a [u8] returns Option<&[AsciiChar]>; the [AsciiChar] type provides:

impl [AsciiChar]
pub const fn as_str(&self) -> &str
pub const fn as_bytes(&self) -> &[u8]
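As a rough illustration of the byte-literal coercion and the ASCII helpers on [u8], here is a sketch that uses only methods available on stable Rust. The function name is_get_request is a hypothetical helper introduced for this example.

```rust
// Hypothetical helper: case-insensitive check of an ASCII keyword at the start of a line.
fn is_get_request(line: &[u8]) -> bool {
    line.len() >= 3 && line[..3].eq_ignore_ascii_case(b"get")
}

fn main() {
    // b"..." has type &[u8; N]; it coerces to &[u8] when passed to the helper.
    assert!(is_get_request(b"GET /index.html"));
    assert!(is_get_request(b"get /"));
    assert!(!is_get_request(b"PUT /"));

    let mut bytes = *b"hello"; // copy the literal into a [u8; 5]
    assert!(bytes.is_ascii());
    bytes.make_ascii_uppercase();
    assert_eq!(&bytes, b"HELLO");

    // as_bytes() exposes the UTF-8 bytes of a &str or String.
    assert!(!"héllo".as_bytes().is_ascii());
}
```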

6 array
#

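An array is a fixed-length sequence of elements of one type, stored in one contiguous block of memory (on the stack when it is a local variable). Its type is written [T; N], where N must be a compile-time constant: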

fn init_arr(n: i32) {
    let arr = [1; n]; // error: n is not a compile-time constant
}

Creating an array:

// An array of 5 i32 values
let numbers: [i32; 5] = [1, 2, 3, 4, 5];

// An array of 5 zeros. In the repeat expression [Value; N], Value must implement Copy (or be a const item)
let zeroes: [i32; 5] = [0; 5];

let mut values: [i32; 3] = [10, 20, 30];
values[1] = 25;
println!("values: {:?}", values);
println!("The array has {} elements.", values.len());

Array and collection indices start at 0 and must be < len(); an out-of-range index panics. get(i) returns an Option<&T> instead, which lets the caller check whether the element exists.
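A small sketch of the difference between indexing and get(): the function name get_or is a hypothetical helper for this illustration, built on the standard get(), Option::copied() and unwrap_or().

```rust
// Hypothetical helper: returns the element at `i`, or `fallback` when `i` is out of range.
fn get_or(values: &[i32], i: usize, fallback: i32) -> i32 {
    values.get(i).copied().unwrap_or(fallback)
}

fn main() {
    let values = [10, 20, 30];

    assert_eq!(values.get(1), Some(&20)); // in range: Some(&T)
    assert_eq!(values.get(3), None);      // out of range: None instead of a panic

    assert_eq!(get_or(&values, 0, -1), 10);
    assert_eq!(get_or(&values, 9, -1), -1);
}
```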

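Slicing an array with a[start..end] yields the dynamically sized slice type [T], so it is normally used behind a pointer as &[T] or Box<[T]>:

  • &a[start..end] copies no elements: the slice owns no data and only borrows from the array or other collection.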
let arr = [1, 2, 3, 4, 5];

// A slice covering the whole array
let slice_whole = &arr[..];
// A slice covering part of the array
let slice_part = &arr[1..4];

let a = [1, 2, 3, 4, 5];
// a[1..3] has type [i32]; &a[1..3] has type &[i32]
let slice = &a[1..3];
// A &[i32] can be compared directly with a &[i32; 2]
assert_eq!(slice, &[2, 3]);

// A function that takes a slice as its argument
fn sum(slice: &[i32]) -> i32 {
    let mut total = 0;
    for i in slice {
        total += i;
    }
    total
}
fn main() {
    let arr = [1, 2, 3, 4, 5];
    let result = sum(&arr[1..4]); // sum only part of the array
    println!("The sum of the part of the array is: {}", result);
}

Converting between [T] and arrays: [T] is dynamically sized and cannot be coerced back to an array, but slice.try_into().unwrap() or <ArrayType>::try_from(slice).unwrap() converts a slice into an array of the same length:

let bytes: [u8; 3] = [1, 0, 2];
// &bytes[0..2] is a slice
// <[u8; 2]>::try_from(&bytes[0..2]) builds an array from that slice
assert_eq!(1, u16::from_le_bytes(<[u8; 2]>::try_from(&bytes[0..2]).unwrap()));

// bytes[1..3] is a slice, converted into an array
assert_eq!(512, u16::from_le_bytes(bytes[1..3].try_into().unwrap()));

let mut bytes: [u8; 3] = [1, 0, 2];
let bytes_head: [u8; 2] = <[u8; 2]>::try_from(&mut bytes[0..2]).unwrap();
assert_eq!(1, u16::from_le_bytes(bytes_head));
let bytes_tail: [u8; 2] = (&mut bytes[1..3]).try_into().unwrap();
assert_eq!(512, u16::from_le_bytes(bytes_tail));
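The conversions above call unwrap(), which panics when the lengths differ. A sketch of handling the mismatch instead follows; the function name read_u32_le is a hypothetical helper for this illustration, and it relies only on the standard TryFrom<&[T]> for [T; N] conversion.

```rust
// Hypothetical helper: decodes the first four bytes as a little-endian u32.
fn read_u32_le(bytes: &[u8]) -> Option<u32> {
    let head = bytes.get(..4)?;                 // None if fewer than 4 bytes
    let array: [u8; 4] = head.try_into().ok()?; // &[u8] -> [u8; 4]; lengths match here
    Some(u32::from_le_bytes(array))
}

fn main() {
    assert_eq!(read_u32_le(&[1, 0, 0, 0, 9]), Some(1));
    assert_eq!(read_u32_le(&[1, 0]), None);

    // A length mismatch is reported as Err rather than a panic.
    let short: &[u8] = &[1, 2, 3];
    assert!(<[u8; 4]>::try_from(short).is_err());
}
```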

Arrays support for-in iteration, which yields the elements T by value:

  • a slice expression &a[m..n] yields a slice reference &[T], which is also iterable but yields &T;
fn main() {
    let mut numbers: [i32; 5] = [1, 2, 3, 4, 5];
    for number in numbers { // alternatives: numbers.iter() / numbers.iter_mut() / numbers.into_iter()
        println!("number: {}", number);
    }
}

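Rust does not allow moving a single element out of an array/Vec/HashMap/HashSet by indexing (moving all elements out, for example by destructuring, is fine). If the element type is not Copy, indexing and then assigning the result therefore fails to compile:

  • Moving individual fields out of a struct/tuple/union is allowed.
  • One fix is std::mem::replace(), which swaps in another value of the same type and returns the old one.
  • Slice patterns can match both arrays of fixed size and slices of dynamic size.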
fn move_away(_: String) { /* Do interesting things. */ }
// Moving all elements out at once is OK
let [john, roa] = ["John".to_string(), "Roa".to_string()];
move_away(john);
move_away(roa);

// Problematic code:
struct Buffer<T> { buf: Vec<T> }
impl<T> Buffer<T> {
    fn replace_index(&mut self, i: usize, v: T) -> T {
        // error: cannot move out of dereference of `&mut`-pointer
        let t = self.buf[i]; // fails to compile
        self.buf[i] = v;
        t
    }
}

// std::mem::replace writes a new value through a &mut and returns the previous value
use std::mem;
impl<T> Buffer<T> {
    fn replace_index(&mut self, i: usize, v: T) -> T {
        mem::replace(&mut self.buf[i], v)
    }
}
let mut buffer = Buffer { buf: vec![0, 1] };
assert_eq!(buffer.buf[0], 0);
assert_eq!(buffer.replace_index(0, 2), 0);
assert_eq!(buffer.buf[0], 2);

Arrays do not implement Display, but an array implements Debug when its element type does:

println!("numbers: {:?}", numbers);
println!("zeroes: {:?}", zeroes);

If the element type implements one of the following traits, the array implements it too:

  • Copy, Clone
  • Debug (arrays do not implement Display)
  • IntoIterator (implemented for [T; N], &[T; N] and &mut [T; N])
  • PartialEq, PartialOrd, Eq, Ord
  • Hash
  • AsRef, AsMut
  • Borrow, BorrowMut
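A few of the trait implementations in this list can be exercised directly. This is only a sketch; the function name distinct_colors is a hypothetical helper that relies on [u8; 3] implementing Hash and Eq.

```rust
use std::collections::HashSet;

// Hypothetical helper: counts distinct RGB triples; relies on [u8; 3]: Hash + Eq.
fn distinct_colors(pixels: &[[u8; 3]]) -> usize {
    pixels.iter().collect::<HashSet<_>>().len()
}

fn main() {
    let a = [1, 2, 3];
    let b = a;                       // Copy: `a` stays usable
    assert_eq!(a, b);                // PartialEq
    assert!([1, 2, 3] < [1, 2, 4]);  // PartialOrd: lexicographic comparison
    assert_eq!(format!("{:?}", a), "[1, 2, 3]"); // Debug

    let names = [String::from("x"), String::from("y")];
    let cloned = names.clone();      // Clone only: String is not Copy
    assert_eq!(names, cloned);

    assert_eq!(distinct_colors(&[[0, 0, 0], [255, 0, 0], [0, 0, 0]]), 2);
}
```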

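An array [T; N] can be coerced to the slice type [T]:

  • &[T; N] is implicitly converted to &[T], so arrays can call slice methods;
  • arrays do not implement the Deref trait, so this conversion is an unsized coercion rather than a Deref coercion;
// Type on the left, initializer expression on the right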
let mut array: [i32; 3] = [0; 3];

// coercing an array to a slice
let str_slice: &[&str] = &["one", "two", "three"];

// numbers has type &[i32; 3]; it is coerced to &[i32] when passed to a parameter of type &[i32]
let numbers = &[0, 1, 2];
print_type_of(&numbers);

// [i32; 3] can be coerced to [i32], so a &[i32; 3] can be assigned to a &[i32]
let numbers: &[i32] = &[0, 1, 2];
print_type_of(&numbers);

// numbers has no leading & here, but it is already a &[i32], so each iterated n is a &i32.
for n in numbers {
    print_type_of(&n);  // n is a &i32
}
fn print_type_of<T>(v: &T) -> String {
    format!("{}", std::any::type_name_of_val(v))
}

// i32: indexing a slice reference yields the element itself; moving it out by value requires Copy, otherwise it is a compile error.
print_type_of(&numbers[0]);

An array type [T; N] can be coerced to [T], and likewise Box<[T; N]> can be coerced to Box<[T]>:

// A heap-allocated array, coerced to a slice
let boxed_array: Box<[i32]> = Box::new([1, 2, 3]);

7 slice
#

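A slice is a view of a contiguous region of memory, written [T]; its size is not known at compile time. As the type of a variable, parameter or return value, it is normally used through the fixed-size pointer types &[T] or Box<[T]>:

  • although the size is unknown at compile time, .len() returns the number of elements at run time;
  • &[T] is a fat pointer with a fixed size of 2 usize: a pointer to the memory region plus the element count;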
let pointer_size = std::mem::size_of::<&u8>();
assert_eq!(2 * pointer_size, std::mem::size_of::<&[u8]>());
assert_eq!(2 * pointer_size, std::mem::size_of::<*const [u8]>());
assert_eq!(2 * pointer_size, std::mem::size_of::<Box<[u8]>>());
assert_eq!(2 * pointer_size, std::mem::size_of::<std::rc::Rc<[u8]>>());

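Creating a slice &[T]:

  • range-indexing an array or Vec yields [T], for example &v[0..2], &v[1..], &v[..] (range-indexing a String or &str yields str);
  • Vec<T> implements Deref<Target=[T]>, so &Vec<T> is implicitly converted to &[T]: a &Vec<T> can be passed where a &[T] is expected, and a Vec can call the methods of slice [T];
    • &vec has type &Vec<i32>, while &vec[n..m] has type &[i32];
  • an array [T; N] can be coerced to [T], so &[T; N] is implicitly converted to &[T] and arrays can also call the methods of slice [T];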
// slicing a Vec
let vec = vec![1, 2, 3];
let int_slice = &vec[..];   // a full-range slice is a &[i32]; a plain &vec would be a &Vec<i32>
let int_slice: &[i32] = &vec; // Vec<T> implements Deref<Target=[T]>, so &Vec<i32> is converted to &[i32]

// coercing an array to a slice
let str_slice: &[&str] = &["one", "two", "three"];

let mut x = [1, 2, 3];
let x = &mut x[..]; // Take a full slice of `x`.
x[1] = 7;
assert_eq!(x, &[1, 7, 3]);

// [i32; 3] can be coerced to the unsized [i32], so a &[i32; 3] can be assigned to a &[i32]
let numbers: &[i32] = &[0, 1, 2];
print_type_of(&numbers); // &[i32]: the array reference was converted to a slice reference
for n in numbers {
    print_type_of(&n);  // &i32: iterating a slice reference yields references to the elements
}
print_type_of(&numbers[0]); // i32: indexing a slice reference yields the element itself

fn read_slice(slice: &[usize]) {
    // ...
}
let v = vec![0, 1];
read_slice(&v); // Deref coercion

let u: &[usize] = &v; // Deref coercion
// or like this:
let u: &[_] = &v;

// More examples
#![allow(unused)]
fn print_type_of<T>(_: &T) {
    println!("{}", std::any::type_name::<T>())
}
fn main() {
    let x = [1_u32, 2, 3]; // [u32; 3]: array
    let x2 = &x; // &[u32; 3]: reference to an array
    let x3 = &x[..]; // &[u32]: slice reference
    let x4 = &x[1..]; // &[u32]: slice reference

    let y = vec![1_u32, 2, 3]; // Vec<u32>: vector
    let y2 = &y; // &Vec<u32>: reference to a vector
    let y3 = &y[..]; // &[u32]: slice reference
    // u32: indexing a slice reference yields the element itself
    y3[1];
    print_type_of(&y3[1]); // u32


    let numbers = &[0, 1, 2];
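    print_type_of(&numbers); // &[i32; 3]: reference to an array

    let numbers: &[i32] = &[0, 1, 2];
    print_type_of(&numbers); // &[i32]: the array reference was converted to a slice reference
    for n in numbers {
        print_type_of(&n);  // &i32: iterating a slice reference yields references to the elements
    }
    print_type_of(&numbers[0]); // i32: indexing a slice reference yields the element itself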
}

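slice.to_vec() clones the contents of the slice into a new Vec.

s[i] denotes the element itself (a place expression), not a reference to it, so x[i] can be used as the left-hand side of an assignment: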

let mut x = [1, 2, 3];
let x = &mut x[..]; // Take a full slice of `x`.
x[1] = 7; // x[1] is an i32 place reached through a &mut [i32], so it can be assigned.

Iterating over a &[T] with for-in yields &T elements:

let numbers: &[i32] = &[0, 1, 2];  // &[0, 1, 2] has type &[i32; 3] and is converted to &[i32]
for n in numbers { // n is a &i32
    println!("{n} is a number!");
}

let scores: &mut [i32] = &mut [7, 8, 9];
for score in scores { // score is a &mut i32
    *score += 1;
}

Indexing an array or slice past its length panics. The safe alternative is .get(), which returns an Option. Its argument is any SliceIndex<[T]>; usize as well as Range<usize>/RangeFull/RangeFrom<usize> and the other range types implement that trait:

// Arrays can be safely accessed using `.get`, which returns an `Option`. This can be matched as
// shown below, or used with `.expect()` if you would like the program to exit with a nice message
// instead of happily continue.
for i in 0..xs.len() + 1 { // Oops, one element too far!
    match xs.get(i) {
        Some(xval) => println!("{}: {}", i, xval),
        None => println!("Slow down! {} is too far!", i),
    }
}

let v = [10, 40, 30];
assert_eq!(Some(&40), v.get(1));
assert_eq!(Some(&[10, 40][..]), v.get(0..2));
assert_eq!(None, v.get(3));
assert_eq!(None, v.get(0..4));

Flattening a slice of arrays:

impl<T, const N: usize> [[T; N]]
pub const fn flatten(&self) -> &[T]

#![feature(slice_flatten)]
assert_eq!([[1, 2, 3], [4, 5, 6]].flatten(), &[1, 2, 3, 4, 5, 6]);
assert_eq!(
    [[1, 2, 3], [4, 5, 6]].flatten(),
    [[1, 2], [3, 4], [5, 6]].flatten(),
);
let slice_of_empty_arrays: &[[i32; 0]] = &[[], [], [], [], []];
assert!(slice_of_empty_arrays.flatten().is_empty());
let empty_slice_of_arrays: &[[u32; 10]] = &[];
assert!(empty_slice_of_arrays.flatten().is_empty());

Slice [T] methods: array and Vec values coerce to [T], so they can call these methods as well.

  • a bare [T] is unsized and cannot be iterated by value; iterate over &[T] / &mut [T], or call iter() / iter_mut() to get an iterator;
impl<T> [T]

// Number of elements
pub const fn len(&self) -> usize
pub const fn is_empty(&self) -> bool

// A slice may be empty, so first/last return an Option
pub const fn first(&self) -> Option<&T>
pub fn first_mut(&mut self) -> Option<&mut T>
pub const fn last(&self) -> Option<&T>
pub fn last_mut(&mut self) -> Option<&mut T>

// Split off the first or last element
pub const fn split_first(&self) -> Option<(&T, &[T])>
pub fn split_first_mut(&mut self) -> Option<(&mut T, &mut [T])>
pub const fn split_last(&self) -> Option<(&T, &[T])>
pub fn split_last_mut(&mut self) -> Option<(&mut T, &mut [T])>
// x has type &[i32; 3], which coerces to &[i32], so the methods of slice [T] are available.
let x = &[0, 1, 2];
if let Some((first, elements)) = x.split_first() {
    assert_eq!(first, &0);
    assert_eq!(elements, &[1, 2]);
}

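// Return the first N elements as an array reference, or None if there are fewer than N elements
//
// The array length must be a compile-time constant, so N is passed as a const generic parameter.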
pub const fn first_chunk<const N: usize>(&self) -> Option<&[T; N]>
pub fn first_chunk_mut<const N: usize>(&mut self) -> Option<&mut [T; N]>
pub fn last_chunk<const N: usize>(&self) -> Option<&[T; N]>
pub fn last_chunk_mut<const N: usize>(&mut self) -> Option<&mut [T; N]>
let u = [10, 40, 30];
assert_eq!(Some(&[10, 40]), u.first_chunk::<2>());  // 2 is a const generic argument, passed with turbofish syntax
let v: &[i32] = &[10];
assert_eq!(None, v.first_chunk::<2>());
let w: &[i32] = &[];
assert_eq!(Some(&[]), w.first_chunk::<0>());

// Return the first or last chunk as an array plus the remaining slice, or None if there are fewer than N elements
pub const fn split_first_chunk<const N: usize>(&self) -> Option<(&[T; N], &[T])>
pub fn split_first_chunk_mut<const N: usize>( &mut self ) -> Option<(&mut [T; N], &mut [T])>
pub const fn split_last_chunk<const N: usize>(&self) -> Option<(&[T], &[T; N])>
pub fn split_last_chunk_mut<const N: usize>( &mut self ) -> Option<(&mut [T], &mut [T; N])>
let x = &[0, 1, 2];
if let Some((first, elements)) = x.split_first_chunk::<2>() {
    assert_eq!(first, &[0, 1]);
    assert_eq!(elements, &[2]);
}
assert_eq!(None, x.split_first_chunk::<4>());

// Safe element access (s[index] panics when index is out of range)
pub fn get<I>(&self, index: I) -> Option<&<I as SliceIndex<[T]>>::Output> where I: SliceIndex<[T]>
pub fn get_mut<I>( &mut self, index: I ) -> Option<&mut <I as SliceIndex<[T]>>::Output> where I: SliceIndex<[T]>
pub unsafe fn get_unchecked<I>( &self, index: I ) -> &<I as SliceIndex<[T]>>::Output where I: SliceIndex<[T]>
pub unsafe fn get_unchecked_mut<I>( &mut self, index: I ) -> &mut <I as SliceIndex<[T]>>::Output where I: SliceIndex<[T]>
let v = [10, 40, 30];
assert_eq!(Some(&40), v.get(1));
assert_eq!(Some(&[10, 40][..]), v.get(0..2));
assert_eq!(None, v.get(3));
assert_eq!(None, v.get(0..4));

// Get raw pointers
pub const fn as_ptr(&self) -> *const T
pub const fn as_mut_ptr(&mut self) -> *mut T
let x = &[1, 2, 4];
let x_ptr = x.as_ptr();
unsafe {
    for i in 0..x.len() {
        assert_eq!(x.get_unchecked(i), &*x_ptr.add(i));
    }
}
let x = &mut [1, 2, 4];
let x_ptr = x.as_mut_ptr();
unsafe {
    for i in 0..x.len() {
        *x_ptr.add(i) += 2;
    }
}
assert_eq!(x, &[3, 4, 6]);

// Return the range of raw pointers spanning all elements (slice memory is contiguous)
pub const fn as_ptr_range(&self) -> Range<*const T>
pub const fn as_mut_ptr_range(&mut self) -> Range<*mut T>
let a = [1, 2, 3];
let x = &a[1] as *const _;
let y = &5 as *const _;
assert!(a.as_ptr_range().contains(&x));
assert!(!a.as_ptr_range().contains(&y));

// Swap the values at two positions
pub fn swap(&mut self, a: usize, b: usize)
pub unsafe fn swap_unchecked(&mut self, a: usize, b: usize)
let mut v = ["a", "b", "c", "d", "e"];
v.swap(2, 4);
assert!(v == ["a", "b", "e", "d", "c"]);

// Reverse the elements in place
pub fn reverse(&mut self)

// Return iterators
pub fn iter(&self) -> Iter<'_, T>
pub fn iter_mut(&mut self) -> IterMut<'_, T>

// Overlapping windows; if the slice is shorter than the window, the iterator yields nothing
pub fn windows(&self, size: usize) -> Windows<'_, T>
let slice = ['l', 'o', 'r', 'e', 'm'];
let mut iter = slice.windows(3);
assert_eq!(iter.next().unwrap(), &['l', 'o', 'r']);
assert_eq!(iter.next().unwrap(), &['o', 'r', 'e']);
assert_eq!(iter.next().unwrap(), &['r', 'e', 'm']);
assert!(iter.next().is_none());
let slice = ['f', 'o', 'o'];
let mut iter = slice.windows(4);
assert!(iter.next().is_none());

// Non-overlapping chunks; each iteration yields a slice &[T]
pub fn chunks(&self, chunk_size: usize) -> Chunks<'_, T>
pub fn chunks_mut(&mut self, chunk_size: usize) -> ChunksMut<'_, T>
pub fn chunks_exact(&self, chunk_size: usize) -> ChunksExact<'_, T>
pub fn chunks_exact_mut(&mut self, chunk_size: usize) -> ChunksExactMut<'_, T>
pub const unsafe fn as_chunks_unchecked<const N: usize>(&self) -> &[[T; N]]
pub fn rchunks(&self, chunk_size: usize) -> RChunks<'_, T>
pub fn rchunks_mut(&mut self, chunk_size: usize) -> RChunksMut<'_, T>
pub fn rchunks_exact(&self, chunk_size: usize) -> RChunksExact<'_, T>
pub fn rchunks_exact_mut(&mut self, chunk_size: usize) -> RChunksExactMut<'_, T>
let slice = ['l', 'o', 'r', 'e', 'm'];
let mut iter = slice.chunks(2);
assert_eq!(iter.next().unwrap(), &['l', 'o']);
assert_eq!(iter.next().unwrap(), &['r', 'e']);
assert_eq!(iter.next().unwrap(), &['m']);
assert!(iter.next().is_none());
let slice = ['l', 'o', 'r', 'e', 'm'];
let mut iter = slice.chunks_exact(2);
assert_eq!(iter.next().unwrap(), &['l', 'o']);
assert_eq!(iter.next().unwrap(), &['r', 'e']);
// If the trailing elements do not fill a chunk, next() returns None; remainder() returns them
assert!(iter.next().is_none());
assert_eq!(iter.remainder(), &['m']);

// Split into a slice of N-element arrays plus a slice of the leftover elements
pub const fn as_chunks<const N: usize>(&self) -> (&[[T; N]], &[T])
pub const fn as_rchunks<const N: usize>(&self) -> (&[T], &[[T; N]])
pub const unsafe fn as_chunks_unchecked_mut<const N: usize>( &mut self ) -> &mut [[T; N]]
pub const fn as_chunks_mut<const N: usize>( &mut self ) -> (&mut [[T; N]], &mut [T])
pub const fn as_rchunks_mut<const N: usize>( &mut self) -> (&mut [T], &mut [[T; N]])
#![feature(slice_as_chunks)]
let slice = ['l', 'o', 'r', 'e', 'm'];
let (chunks, remainder) = slice.as_chunks();
assert_eq!(chunks, &[['l', 'o'], ['r', 'e']]);
assert_eq!(remainder, &['m']);
#![feature(slice_as_chunks)]
let slice = ['R', 'u', 's', 't'];
let (chunks, []) = slice.as_chunks::<2>() else { // let-else: matches only when the remainder is empty
    panic!("slice didn't have even length")
};
assert_eq!(chunks, &[['R', 'u'], ['s', 't']]);

// Const-generic version of chunks_exact: the chunk length is given by a const generic parameter
pub fn array_chunks<const N: usize>(&self) -> ArrayChunks<'_, T, N>
pub fn array_chunks_mut<const N: usize>(&mut self) -> ArrayChunksMut<'_, T, N>
pub fn array_windows<const N: usize>(&self) -> ArrayWindows<'_, T, N>
#![feature(array_chunks)]
let slice = ['l', 'o', 'r', 'e', 'm'];
let mut iter = slice.array_chunks();
assert_eq!(iter.next().unwrap(), &['l', 'o']);
assert_eq!(iter.next().unwrap(), &['r', 'e']);
assert!(iter.next().is_none());
assert_eq!(iter.remainder(), &['m']);

// Split the slice into non-overlapping runs using pred: consecutive elements for which pred returns true belong to the same run
pub fn chunk_by<F>(&self, pred: F) -> ChunkBy<'_, T, F> where F: FnMut(&T, &T) -> bool
pub fn chunk_by_mut<F>(&mut self, pred: F) -> ChunkByMut<'_, T, F> where F: FnMut(&T, &T) -> bool
let slice = &[1, 1, 1, 3, 3, 2, 2, 2];
let mut iter = slice.chunk_by(|a, b| a == b);
assert_eq!(iter.next(), Some(&[1, 1, 1][..]));
assert_eq!(iter.next(), Some(&[3, 3][..]));
assert_eq!(iter.next(), Some(&[2, 2, 2][..]));
assert_eq!(iter.next(), None);

// Split the slice at the given index
pub const fn split_at(&self, mid: usize) -> (&[T], &[T])
pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T])
pub const unsafe fn split_at_unchecked(&self, mid: usize) -> (&[T], &[T])
pub unsafe fn split_at_mut_unchecked( &mut self, mid: usize ) -> (&mut [T], &mut [T])
pub fn split_at_checked(&self, mid: usize) -> Option<(&[T], &[T])>
pub fn split_at_mut_checked( &mut self, mid: usize ) -> Option<(&mut [T], &mut [T])>
let v = [1, 2, 3, 4, 5, 6];
{
    let (left, right) = v.split_at(0);
    assert_eq!(left, []);
    assert_eq!(right, [1, 2, 3, 4, 5, 6]);
}
{
    let (left, right) = v.split_at(2);
    assert_eq!(left, [1, 2]);
    assert_eq!(right, [3, 4, 5, 6]);
}
{
    let (left, right) = v.split_at(6);
    assert_eq!(left, [1, 2, 3, 4, 5, 6]);
    assert_eq!(right, []);
}

// Split the slice on elements matching pred; this can produce empty slices
pub fn split<F>(&self, pred: F) -> Split<'_, T, F> where F: FnMut(&T) -> bool
pub fn split_mut<F>(&mut self, pred: F) -> SplitMut<'_, T, F> where F: FnMut(&T) -> bool
pub fn split_inclusive<F>(&self, pred: F) -> SplitInclusive<'_, T, F> where F: FnMut(&T) -> bool
pub fn split_inclusive_mut<F>(&mut self, pred: F) -> SplitInclusiveMut<'_, T, F> where F: FnMut(&T) -> bool
pub fn rsplit<F>(&self, pred: F) -> RSplit<'_, T, F> where F: FnMut(&T) -> bool
pub fn rsplit_mut<F>(&mut self, pred: F) -> RSplitMut<'_, T, F> where F: FnMut(&T) -> bool
pub fn splitn<F>(&self, n: usize, pred: F) -> SplitN<'_, T, F> where F: FnMut(&T) -> bool
pub fn splitn_mut<F>(&mut self, n: usize, pred: F) -> SplitNMut<'_, T, F> where    F: FnMut(&T) -> bool
pub fn rsplitn<F>(&self, n: usize, pred: F) -> RSplitN<'_, T, F> where    F: FnMut(&T) -> bool
pub fn rsplitn_mut<F>(&mut self, n: usize, pred: F) -> RSplitNMut<'_, T, F> where    F: FnMut(&T) -> bool
pub fn split_once<F>(&self, pred: F) -> Option<(&[T], &[T])> where    F: FnMut(&T) -> bool
pub fn rsplit_once<F>(&self, pred: F) -> Option<(&[T], &[T])> where    F: FnMut(&T) -> bool
let slice = [10, 6, 33, 20];
let mut iter = slice.split(|num| num % 3 == 0);
assert_eq!(iter.next().unwrap(), &[10]);
assert_eq!(iter.next().unwrap(), &[]);
assert_eq!(iter.next().unwrap(), &[20]);
assert!(iter.next().is_none());
let slice = [10, 40, 33];
let mut iter = slice.split(|num| num % 3 == 0);
assert_eq!(iter.next().unwrap(), &[10, 40]);
assert_eq!(iter.next().unwrap(), &[]); // trailing empty slice
assert!(iter.next().is_none());
let slice = [10, 40, 33, 20];
let mut iter = slice.split_inclusive(|num| num % 3 == 0);
assert_eq!(iter.next().unwrap(), &[10, 40, 33]);
assert_eq!(iter.next().unwrap(), &[20]);
assert!(iter.next().is_none());
let v = [10, 40, 30, 20, 60, 50];
for group in v.splitn(2, |num| *num % 3 == 0) {
    println!("{group:?}");
}
#![feature(slice_split_once)]
let s = [1, 2, 3, 2, 4];
assert_eq!(s.split_once(|&x| x == 2), Some((
    &[1][..],
    &[3, 2, 4][..]
)));
assert_eq!(s.split_once(|&x| x == 0), None);

// Whether the slice contains the given value
pub fn contains(&self, x: &T) -> bool where T: PartialEq
let v = [10, 40, 30];
assert!(v.contains(&30));
assert!(!v.contains(&50));

// Whether the slice starts or ends with the given slice
pub fn starts_with(&self, needle: &[T]) -> bool where T: PartialEq
pub fn ends_with(&self, needle: &[T]) -> bool where T: PartialEq
let v = [10, 40, 30];
assert!(v.starts_with(&[10]));
assert!(v.starts_with(&[10, 40]));
assert!(!v.starts_with(&[50]));
assert!(!v.starts_with(&[10, 50]));

// Return the slice with the given prefix or suffix removed
pub fn strip_prefix<P>(&self, prefix: &P) -> Option<&[T]> where P: SlicePattern<Item = T> + ?Sized, T: PartialEq
pub fn strip_suffix<P>(&self, suffix: &P) -> Option<&[T]> where P: SlicePattern<Item = T> + ?Sized, T: PartialEq
let v = &[10, 40, 30];
assert_eq!(v.strip_prefix(&[10]), Some(&[40, 30][..]));
assert_eq!(v.strip_prefix(&[10, 40]), Some(&[30][..]));
assert_eq!(v.strip_prefix(&[50]), None);
assert_eq!(v.strip_prefix(&[10, 50]), None);
let prefix : &str = "he";
assert_eq!(b"hello".strip_prefix(prefix.as_bytes()), Some(b"llo".as_ref()));

pub fn binary_search(&self, x: &T) -> Result<usize, usize> where T: Ord
pub fn binary_search_by<'a, F>(&'a self, f: F) -> Result<usize, usize> where F: FnMut(&'a T) -> Ordering
pub fn binary_search_by_key<'a, B, F>(&'a self, b: &B,f:F) -> Result<usize, usize> where F: FnMut(&'a T) -> B,B: Ord

// Sort the slice; unstable means the relative order of equal elements is not preserved
pub fn sort_unstable(&mut self) where T: Ord
pub fn sort_unstable_by<F>(&mut self, compare: F) where F: FnMut(&T, &T) -> Ordering
pub fn sort_unstable_by_key<K, F>(&mut self, f: F) where F: FnMut(&T) -> K, K: Ord
pub fn select_nth_unstable( &mut self, index: usize) -> (&mut [T], &mut T, &mut [T]) where T: Ord
pub fn select_nth_unstable_by<F>( &mut self, index: usize, compare: F) -> (&mut [T], &mut T, &mut [T]) where F: FnMut(&T, &T) -> Ordering
pub fn select_nth_unstable_by_key<K, F>( &mut self, index: usize, f: F ) -> (&mut [T], &mut T, &mut [T]) where F: FnMut(&T) -> K, K: Ord
let mut v = [-5, 4, 1, -3, 2];
v.sort_unstable();
assert!(v == [-5, -3, 1, 2, 4]);

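// Move consecutive repeated elements to the end; returns two slices: the deduplicated elements, then the duplicates (in no specified order)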
pub fn partition_dedup(&mut self) -> (&mut [T], &mut [T]) where T: PartialEq
pub fn partition_dedup_by<F>(&mut self, same_bucket: F) -> (&mut [T], &mut [T]) where F: FnMut(&mut T, &mut T) -> bool
pub fn partition_dedup_by_key<K, F>(&mut self, key: F) -> (&mut [T], &mut [T]) where F: FnMut(&mut T) -> K, K: PartialEq
#![feature(slice_partition_dedup)]
let mut slice = [1, 2, 2, 3, 3, 2, 1, 1];
let (dedup, duplicates) = slice.partition_dedup();
assert_eq!(dedup, [1, 2, 3, 2, 1]);
assert_eq!(duplicates, [2, 3, 1]);

// Rotate the slice in place; rotate_left(mid) moves the first mid elements to the end
pub fn rotate_left(&mut self, mid: usize)
pub fn rotate_right(&mut self, k: usize)
let mut a = ['a', 'b', 'c', 'd', 'e', 'f'];
a.rotate_left(2);
assert_eq!(a, ['c', 'd', 'e', 'f', 'a', 'b']);

// Fill the whole slice with the given value
pub fn fill(&mut self, value: T) where T: Clone
// Fill the whole slice with values returned by the given function
pub fn fill_with<F>(&mut self, f: F) where F: FnMut() -> T
let mut buf = vec![0; 10];
buf.fill(1);
assert_eq!(buf, vec![1; 10]);

// Clone or copy elements from src into self; src and self must have the same length, otherwise the call panics
pub fn clone_from_slice(&mut self, src: &[T]) where T: Clone
pub fn copy_from_slice(&mut self, src: &[T]) where T: Copy
let src = [1, 2, 3, 4];
let mut dst = [0, 0];
// Because the slices have to be the same length, we slice the source slice from four elements to
// two. It will panic if we don't do this.
dst.clone_from_slice(&src[2..]);
assert_eq!(src, [1, 2, 3, 4]);
assert_eq!(dst, [3, 4]);

// Copy the elements in range src to the position starting at dest using memmove; the two regions may overlap
pub fn copy_within<R>(&mut self, src: R, dest: usize) where R: RangeBounds<usize>, T: Copy
let mut bytes = *b"Hello, World!";
bytes.copy_within(1..5, 8);
assert_eq!(&bytes, b"Hello, Wello!");

// Swap contents; both slices must have the same length
pub fn swap_with_slice(&mut self, other: &mut [T])
let mut slice1 = [0, 0];
let mut slice2 = [1, 2, 3, 4];
slice1.swap_with_slice(&mut slice2[2..]);
assert_eq!(slice1, [3, 4]);
assert_eq!(slice2, [1, 2, 0, 0]);

pub unsafe fn align_to<U>(&self) -> (&[T], &[U], &[T])
pub unsafe fn align_to_mut<U>(&mut self) -> (&mut [T], &mut [U], &mut [T])

pub fn as_simd<const LANES: usize>(&self) -> (&[T], &[Simd<T, LANES>], &[T]) where Simd<T, LANES>: AsRef<[T; LANES]>, T: SimdElement, LaneCount<LANES>: SupportedLaneCount
pub fn as_simd_mut<const LANES: usize>( &mut self ) -> (&mut [T], &mut [Simd<T, LANES>], &mut [T]) where Simd<T, LANES>: AsMut<[T; LANES]>, T: SimdElement, LaneCount<LANES>: SupportedLaneCount

pub fn is_sorted(&self) -> bool where T: PartialOrd
pub fn is_sorted_by<'a, F>(&'a self, compare: F) -> bool where F: FnMut(&'a T, &'a T) -> bool
pub fn is_sorted_by_key<'a, F, K>(&'a self, f: F) -> bool where F: FnMut(&'a T) -> K, K: PartialOrd

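// Return the partition point: the index of the first element for which pred returns false (the slice is assumed to be partitioned by pred)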
pub fn partition_point<P>(&self, pred: P) -> usize where P: FnMut(&T) -> bool
let v = [1, 2, 3, 3, 5, 6, 7];
let i = v.partition_point(|&x| x < 5);
assert_eq!(i, 4);
assert!(v[..i].iter().all(|&x| x < 5));
assert!(v[i..].iter().all(|&x| !(x < 5)));
let a = [2, 4, 8];
assert_eq!(a.partition_point(|x| x < &100), a.len());
let a: [i32; 0] = [];
assert_eq!(a.partition_point(|x| x < &100), 0);

// Remove the elements in range from self and return them; self keeps the remaining elements
pub fn take<R, 'a>(self: &mut &'a [T], range: R) -> Option<&'a [T]> where R: OneSidedRange<usize>
pub fn take_mut<R, 'a>(self: &mut &'a mut [T], range: R) -> Option<&'a mut [T]> where R: OneSidedRange<usize>
pub fn take_first<'a>(self: &mut &'a [T]) -> Option<&'a T>
pub fn take_first_mut<'a>(self: &mut &'a mut [T]) -> Option<&'a mut T>
pub fn take_last<'a>(self: &mut &'a [T]) -> Option<&'a T>
pub fn take_last_mut<'a>(self: &mut &'a mut [T]) -> Option<&'a mut T>
#![feature(slice_take)]
let mut slice: &[_] = &['a', 'b', 'c', 'd'];
let mut first_three = slice.take(..3).unwrap();
assert_eq!(slice, &['d']);
assert_eq!(first_three, &['a', 'b', 'c']);
#![feature(slice_take)]
let mut slice: &[_] = &['a', 'b', 'c', 'd'];
let mut tail = slice.take(2..).unwrap();
assert_eq!(slice, &['a', 'b']);
assert_eq!(tail, &['c', 'd']);
#![feature(slice_take)]
let mut slice: &[_] = &['a', 'b', 'c'];
let first = slice.take_first().unwrap();
assert_eq!(slice, &['b', 'c']);
assert_eq!(first, &'a');

pub unsafe fn get_many_unchecked_mut<const N: usize>( &mut self, indices: [usize; N] ) -> [&mut T; N]
pub fn get_many_mut<const N: usize>( &mut self, indices: [usize; N] ) -> Result<[&mut T; N], GetManyMutError<N>>
#![feature(get_many_mut)]
let v = &mut [1, 2, 3];
if let Ok([a, b]) = v.get_many_mut([0, 2]) {
    *a = 413;
    *b = 612;
}
assert_eq!(v, &[413, 2, 612]);

// Other [T] methods
impl<T> [T]

pub fn sort(&mut self) where T: Ord
let mut v = [-5, 4, 1, -3, 2];
v.sort();
assert!(v == [-5, -3, 1, 2, 4]);

pub fn sort_by<F>(&mut self, compare: F) where F: FnMut(&T, &T) -> Ordering
pub fn sort_by_key<K, F>(&mut self, f: F) where F: FnMut(&T) -> K, K: Ord
pub fn sort_by_cached_key<K, F>(&mut self, f: F) where F: FnMut(&T) -> K, K: Ord
let mut v = [-5i32, 4, 1, -3, 2];
v.sort_by_key(|k| k.abs());
assert!(v == [1, 2, -3, 4, -5]);

// Create a Vec from a slice
pub fn to_vec(&self) -> Vec<T> where T: Clone
pub fn to_vec_in<A>(&self, alloc: A) -> Vec<T, A> where A: Allocator, T: Clone

let s = [10, 40, 30];
let x = s.to_vec();
// Here, `s` and `x` can be modified independently.

pub fn into_vec<A>(self: Box<[T], A>) -> Vec<T, A> where A: Allocator
let s: Box<[i32]> = Box::new([10, 40, 30]);
let x = s.into_vec();
// `s` cannot be used anymore because it has been converted into `x`.
assert_eq!(x, vec![10, 40, 30]);

pub fn repeat(&self, n: usize) -> Vec<T> where T: Copy
assert_eq!([1, 2].repeat(3), vec![1, 2, 1, 2, 1, 2]);

// Concatenate the items of a slice into a single value
pub fn concat<Item>(&self) -> <[T] as Concat<Item>>::Output where [T]: Concat<Item>, Item: ?Sized
assert_eq!(["hello", "world"].concat(), "helloworld");
assert_eq!([[1, 2], [3, 4]].concat(), [1, 2, 3, 4]);

// Concatenate the items of a slice, placing the given separator between them
pub fn join<Separator>( &self, sep: Separator) -> <[T] as Join<Separator>>::Output where [T]: Join<Separator>
assert_eq!(["hello", "world"].join(" "), "hello world");
assert_eq!([[1, 2], [3, 4]].join(&0), [1, 2, 0, 3, 4]);
assert_eq!([[1, 2], [3, 4]].join(&[0, 0][..]), [1, 2, 0, 0, 3, 4]);
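The listing above gives binary_search and rotate_right without an example. A minimal sketch follows, assuming the input is already sorted; the function name insert_sorted is a hypothetical helper for this illustration.

```rust
// Hypothetical helper: inserts `value` into an already sorted Vec, keeping it sorted.
fn insert_sorted(sorted: &mut Vec<i32>, value: i32) {
    // Ok(i): an equal element is at i; Err(i): i is where `value` would be inserted.
    let idx = match sorted.binary_search(&value) {
        Ok(i) | Err(i) => i,
    };
    sorted.insert(idx, value);
}

fn main() {
    let mut v = vec![1, 3, 5, 7];
    assert_eq!(v.binary_search(&5), Ok(2));
    assert_eq!(v.binary_search(&4), Err(2));

    insert_sorted(&mut v, 4);
    assert_eq!(v, [1, 3, 4, 5, 7]);

    // partition_point: index of the first element for which the predicate is false.
    assert_eq!(v.partition_point(|&x| x < 5), 3);

    // rotate_right(k) moves the last k elements to the front.
    v.rotate_right(2);
    assert_eq!(v, [5, 7, 1, 3, 4]);
}
```

binary_search gives meaningful results only on sorted input; on unsorted data the returned index is unspecified.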

8 tuple
#

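A tuple has a fixed size and can hold values of different types; its type is written (T1, T2, T3). Tuples can be destructured with pattern matching, which makes them convenient for storing and passing a group of heterogeneous values. They also work well as function return values and for grouping data into a single compound type.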

fn main() {
    let _t0: (u8,i16) = (0, -1);
    let _t1: (u8, (i16, u32)) = (0, (-1, 1));
    let t: (u8, u16, i64, &str, String) = (1u8, 2u16, 3i64, "hello", String::from(", world"));
    println!("Success!");
}

// A function that takes a tuple and returns a tuple
fn swap(tup: (i32, f64)) -> (f64, i32) {
    // Return a new tuple with the elements in reverse order
    (tup.1, tup.0)
}
let input_tup = (123, 4.56);
let output_tup = swap(input_tup);

// A nested tuple
let nested_tup = (1, (2, 3), 4);
// Destructure the nested tuple
let (a, (b, c), d) = nested_tup;
// The zero-element tuple, also called the unit type.
let unit = ();

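A single-element tuple needs a trailing comma, as in (T,), to distinguish it from a parenthesized type or expression. With several elements the trailing comma is optional.

The empty tuple () is also called the unit type; its only value is ().

A tuple owns its elements. As with a struct, individual elements can be moved out, but a moved element cannot be accessed afterwards:

  • collections such as array/Vec/slice do not allow moving a single element out by indexing.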

Elements are accessed by index, such as t.0, t.1.
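Index access and partial moves out of a tuple can be sketched as follows. The function name reverse_pair is a hypothetical helper for this illustration; the commented-out lines show what the compiler would reject.

```rust
// Hypothetical helper: consumes a tuple and returns its fields in reverse order.
fn reverse_pair(pair: (String, u32)) -> (u32, String) {
    (pair.1, pair.0)
}

fn main() {
    let t = (String::from("name"), 42u32, String::from("note"));

    let name = t.0;          // moves the first field out of `t`
    let answer = t.1;        // u32 is Copy, so `t.1` stays usable
    // println!("{:?}", t);  // error: `t` is partially moved
    // println!("{}", t.0);  // error: `t.0` was moved
    println!("{} {} {}", name, answer, t.2); // fields that were not moved remain accessible

    let (n, s) = reverse_pair((String::from("a"), 1));
    assert_eq!((n, s.as_str()), (1, "a"));
}
```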

析构 tuple: enum 类型是在枚举 variant 值外部而非内部类匹配 & 或 &mut 的, 对于 tuple 类型也是在 tuple 外部匹配 & 或 &mut 的:

let x: &Option<i32> = &Some(3);

// OK: equivalent to Some(ref y); y has type &i32
if let Some(y) = x {}
// OK: & is written outside the variant; y has type i32
if let &Some(y) = x {}
// ERROR: & cannot be written inside the variant here: expected `i32`, found `&_`
if let Some(&y) = x {}

let (a, b ) = &(1, 2); // a and b are both &i32
println!("Results: {a} {b}");

let &(c, d ) = &(1, 2); // c and d are both i32
println!("Results: {c} {d}");

let (&c, d ) = &(1, 2); // ERROR
let (ref c, d ) = &(1, 2); // OK up to edition 2021 (the `ref` is redundant); the 2024 edition rejects it

// Another example
enum MyEnum {
    A { name: String, x: u8 },
    B { name: String },
}
fn a_to_b(e: &mut MyEnum) {
    if let MyEnum::A {
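        name,  // `name` is the only binding and can be used in the block below; its type is &mut String. `x: 0` matches a literal and binds nothing.
        x: 0,
    } = e {
        *e = MyEnum::B {
            name: std::mem::take(name), // take expects &mut T, and name is &mut String, so this type-checks
        }
    }
    // if let &mut MyEnum::A {
    //     name,  // ERROR: name would be a String here, but it cannot be moved out from behind &mut
    //     x: 0,
    // } = e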
}

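Tuples that are too long cannot be formatted: the standard library implements Debug only for tuples of up to 12 elements:

fn main() {
    // 12 elements is the largest tuple that can still be formatted; adding a 13th makes the println! below fail to compile.
    let too_long_tuple = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12);
    println!("too long tuple: {:?}", too_long_tuple);
}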

An array can be converted into a tuple of the same length through the From/Into implementations in the standard library:

let array: [u32; 3] = [1, 2, 3];
let tuple: (u32, u32, u32) = array.into();

9 pointer
#

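Rust provides the following kinds of pointer types:

  1. References: &T and &mut T
  2. Raw pointers: *const T and *mut T; dereferencing them requires unsafe;
  3. Smart pointers: e.g. Box<T>, Rc<T>, Arc<T> and RefCell<T>.

fn main() {
    let x = 5;
    let y = &x; // shared (immutable) reference

    let mut z = 10;
    let w = &mut z; // mutable reference; the borrowed binding must be declared mut
    *w += 1; // dereference to modify the value

    println!("x: {}, y: {}, w: {}", x, y, w);
    // z can only be read again after the last use of w, because w borrows it mutably
    println!("z: {}", z);
}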

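Raw pointers are either immutable (*const T) or mutable (*mut T). They resemble pointers in C and are not covered by the borrow checker's safety guarantees. Creating a raw pointer is safe; dereferencing one requires an unsafe block.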

fn main() {
    let mut x = 10;
    let ptr_x = &mut x as *mut i32; // convert the borrow into a mutable raw pointer

    unsafe {
        // dereferencing a raw pointer is only allowed inside an unsafe block
        *ptr_x += 10;
        println!("x: {}", *ptr_x);
    }
}

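Smart pointers are types that implement the Deref and Drop traits and carry extra metadata and capabilities. Box is the simplest smart pointer; it allocates a value on the heap.

use std::rc::Rc;

fn main() {
    let b = Box::new(5); // allocate an i32 on the heap
    println!("b: {}", b);

    let rc = Rc::new(5); // create a reference-counted pointer
    let rc_clone = Rc::clone(&rc); // increments the reference count
    println!("rc: {}, rc_clone: {}", rc, rc_clone);
}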

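Pointer memory layout:

  1. Plain references and raw pointers to Sized types: one machine word (the size of usize);
  2. Slice references &[T] and string references &str: two machine words, holding a pointer to the data and the element count (for &str, the length in bytes);
  3. Box<T> (for Sized T): one machine word, holding a pointer to the heap allocation;
  4. Box<dyn Trait> or &dyn Trait: two machine words, holding a pointer to the actual object and a pointer to the vtable with that object's trait method implementations;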
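The layout rules above can be checked with std::mem::size_of. The following is a small sketch; the helper name `pointer_sizes_in_words` is made up for this example, and the sizes are expressed in machine words so the result does not depend on the target's pointer width:

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns the size of several pointer types, measured in machine words.
fn pointer_sizes_in_words() -> [usize; 6] {
    let word = size_of::<usize>();
    [
        size_of::<&u8>() / word,            // thin reference
        size_of::<*const u8>() / word,      // thin raw pointer
        size_of::<&[u8]>() / word,          // data pointer + length
        size_of::<&str>() / word,           // data pointer + length in bytes
        size_of::<Box<u8>>() / word,        // thin owning pointer
        size_of::<Box<dyn Debug>>() / word, // data pointer + vtable pointer
    ]
}

fn main() {
    println!("{:?}", pointer_sizes_in_words());
}
```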

10 struct
#

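struct/enum/union are Rust's three kinds of user-defined types. Type names should be written in UpperCamelCase; otherwise the compiler emits a warning.

There are three kinds of struct:

  1. unit struct: has no fields;
  2. tuple struct: a tuple struct with a single element T is called a newtype;
  3. C-like struct (named fields);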
#![allow(dead_code)]

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
}

struct Unit;
struct Pair(i32, f32);
struct Point {
    x: f32,
    y: f32,
}

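// Instantiation
let _unit = Unit; // a unit struct has exactly one value.
let pair = Pair(1, 0.1);   // initializing a tuple struct looks like a function call.
let Pair(integer, decimal) = pair;  // destructure the struct; the leading `Pair` cannot be omitted.

When initializing a struct, every field must be listed. A variable with the same name as a field can use the shorthand form. Another value of the same struct type can be spread with .. (struct update syntax) to fill in the remaining fields; it must be the last item in the initializer and must not be followed by a comma.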

fn main() {
    struct Person {
        name: String,
        age: u8,
        hobby: String,
    }

    let age = 30;
    let p = Person { // Error: missing field `hobby` in initializer of `Person`
        name: String::from("sunface"),
        age, // shorthand for a variable with the same name as the field.
    };
    println!("Success!");
}

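// Fields with the same name as a variable can use the shorthand.
let name = String::from("Peter");
let age = 27;
let peter = Person { name, age };

// newtype idiom: typically used to add methods to an existing type or to keep similar values apart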
struct Years(i64);
struct Days(i64);
impl Years {
    pub fn to_days(&self) -> Days {
        Days(self.0 * 365)
    }
}

impl Days {
    /// truncates partial years
    pub fn to_years(&self) -> Years {
        Years(self.0 / 365)
    }
}

fn old_enough(age: &Years) -> bool {
    age.0 >= 18
}

fn main() {
    let age = Years(5);
    let age_days = age.to_days();
    println!("Old enough {}", old_enough(&age));
    println!("Old enough {}", old_enough(&age_days.to_years()));
    // println!("Old enough {}", old_enough(&age_days));
}

// Initialize a struct from another value of the same struct type.
#[derive(Debug)]
struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}
fn main() {
    let u1 = User {
        email: String::from("[email protected]"),
        username: String::from("sunface"),
        active: true,
        sign_in_count: 1,
    };
    let u2 = set_email(u1);
    println!("Success! {u2:?}");
}
fn set_email(u: User) -> User {
    User {
        email: String::from("[email protected]"),
        ..u // `..u` must come last and must not be followed by a comma
    }
}
// Make a new point by using struct update syntax to use the fields of our other one
let bottom_right = Point { x: 5.2, ..point };

A field-less struct declared as struct MyStruct; is equivalent to struct MyStruct {} together with a constant of the same name:

struct Cookie;
let c = [Cookie, Cookie {}, Cookie, Cookie {}];
// is equivalent to
struct Cookie {}
const Cookie: Cookie = Cookie {};
let c = [Cookie, Cookie {}, Cookie, Cookie {}];

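A struct owns its field values, so fields usually use owned types rather than &T/&mut T (except for 'static references), because references require declaring lifetime parameters. A struct with reference-typed fields must specify the lifetimes explicitly, and when a struct with a lifetime parameter is nested in another struct, the outer struct must declare the lifetime as well:

  • 'a: 'b means that lifetime 'a lasts at least as long as 'b ('a outlives 'b).
  • T: 'a means that every reference contained in T outlives 'a.

struct S {
    r: &i32 // r is a reference without a lifetime: compile error.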
}
let s;
{
    let x = 10;
    s = S { r: &x };
}
assert_eq!(*s.r, 10); // bad: reads from dropped `x`


// OK
struct S {
    r: &'static i32
}
// OK
struct S<'a> {
    r: &'a i32  // the value r refers to must live at least as long as the S value.
}
// OK: multiple lifetime parameters
struct S<'a, 'b> {
    x: &'a i32,
    y: &'b i32
}
// Function
fn f<'a, 'b>(r: &'a i32, s: &'b i32) -> &'a i32 { r } // looser


// Error: missing lifetime parameter for S
struct D {
    s: S
}
// OK
struct D<'a> {
    s: S<'a>
}
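As a compilable sketch of the lifetime rules above (the names `Excerpt`, `Page` and `first_sentence` are invented for this example), a struct that borrows data declares a lifetime parameter, and a struct nesting it forwards that parameter:

```rust
/// A struct holding a borrowed string slice; `'a` ties it to the borrowed text.
struct Excerpt<'a> {
    part: &'a str,
}

/// A struct that nests `Excerpt` must declare and forward the lifetime.
struct Page<'a> {
    first: Excerpt<'a>,
}

/// Borrows the text up to the first '.' (or the whole text if there is none).
fn first_sentence<'a>(text: &'a str) -> Page<'a> {
    let part = text.split('.').next().unwrap_or(text);
    Page { first: Excerpt { part } }
}

fn main() {
    let text = String::from("Call me Ishmael. Some years ago...");
    let page = first_sentence(&text);
    // `page` borrows from `text`, so `text` must stay alive while `page` is used.
    println!("{}", page.first.part);
}
```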

A struct and each of its fields need pub separately (for an enum, making the enum itself pub is enough). Fields without pub are private by default and cannot be accessed from other modules.
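The visibility rules can be sketched with a small module. The module `shapes` and all item names below are hypothetical and only serve the illustration:

```rust
mod shapes {
    pub struct Circle {
        pub radius: f64, // readable and writable from other modules
        id: u32,         // private: only code inside `shapes` can touch it
    }

    impl Circle {
        pub fn new(radius: f64) -> Self {
            Circle { radius, id: 1 }
        }

        pub fn id(&self) -> u32 {
            self.id
        }
    }

    // `pub` on the enum makes every variant public as well.
    pub enum Kind {
        Round,
        Angular,
    }
}

fn describe(kind: &shapes::Kind) -> &'static str {
    match kind {
        shapes::Kind::Round => "round",
        shapes::Kind::Angular => "angular",
    }
}

fn main() {
    let c = shapes::Circle::new(2.0);
    println!("radius {} id {}", c.radius, c.id());
    // println!("{}", c.id); // would not compile: field `id` is private
    println!("{}", describe(&shapes::Kind::Round));
}
```

Because `id` is private, code outside `shapes` cannot build a `Circle` with a struct literal either; it has to go through `Circle::new`.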

Structs do not implement Copy/Clone or Debug by default; the derive attribute lets the compiler generate these implementations.

  • Display cannot be generated with derive; it has to be implemented by hand.
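A short sketch of both cases, using a made-up `Point` type: Debug, Clone, Copy and PartialEq are derived, while Display is written by hand:

```rust
use std::fmt;

#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// `Display` has no derive; the output format is chosen by hand.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // `Copy`: `p` stays usable after the assignment
    println!("{:?} {}", p, q);
}
```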

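The fields of a struct can be borrowed individually. When a struct is destructured, fields whose types do not implement Copy are moved out (a partial move), and the moved fields can no longer be accessed through the struct afterwards:

  • enum values can be partially moved as well;
  • tuple fields can also be partially moved; elements of an array/Vec cannot be moved out by indexing, but the collection can be moved as a whole or have all of its elements moved out (e.g. by iterating over it).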
fn main() {
    #[derive(Debug)]
    struct Person {
        name: String,
        age: Box<u8>,
    }

    let person = Person {
        name: String::from("Alice"),
        age: Box::new(20),
    };

    // `name` is moved out of person, but `age` is referenced
    let Person { name, ref age } = person; // a struct pattern can be used to destructure a value
    println!("The person's age is {}", age);
    println!("The person's name is {}", name);

    // Error! borrow of partially moved value: `person` partial move occurs
    //println!("The person struct is {:?}", person);
    // `person` cannot be used but `person.age` can be used as it is not moved
    println!("The person's age from person struct is {}", person.age);
}

11 enum
#

enum variants come in three forms, similar to structs:

  1. unit-like: Quit;
  2. struct-like: Move { x: i32, y: i32 };
  3. tuple-like: Write(String);

enum Number {
    Zero, // each discriminant is the previous one plus 1; the first one defaults to 0.
    One,
    Two,
}

enum Number1 {
    Zero = 0,
    One,
    Two,
}

// C-like enum: explicit discriminants must be integer constants (floats are not allowed)
enum Number2 {
    Zero = 0,
    One = 1,
    Two = 2,
}

// enum variants can carry data.
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

// Two variants may not share the same discriminant value (C allows this).
enum SharedDiscriminantError2 {
    Zero,       // 0
    One,        // 1
    OneToo = 1  // 1 (collision with previous!)
}

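An empty enum (one with no variants) can never be instantiated as a value. Its main use is as the error type of a Result that cannot fail, e.g. the standard library type std::convert::Infallible: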

// std::convert::Infallible
pub enum Infallible {}

impl<T, U> TryFrom<U> for T where U: Into<T> {
    type Error = Infallible;

    fn try_from(value: U) -> Result<Self, Infallible> {
        Ok(U::into(value))  // Never returns `Err`
    }
}

// Another example
enum ZeroVariants {}
let x: ZeroVariants = panic!();
let y: u32 = x; // mismatched type error
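Since an empty enum has no values, a match on it needs no arms. The sketch below (the helper name `unwrap_infallible` is made up here) uses that to unwrap a Result<T, Infallible> without any panic path:

```rust
use std::convert::Infallible;

/// Unwraps a `Result` whose error type has no values.
fn unwrap_infallible<T>(result: Result<T, Infallible>) -> T {
    match result {
        Ok(value) => value,
        // `never` has an empty enum type, so an empty match covers all cases.
        Err(never) => match never {},
    }
}

fn main() {
    let r: Result<u32, Infallible> = Ok(7);
    println!("{}", unwrap_infallible(r));
}
```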

Data carried by enum variants can be destructured in pattern matching:

// Create an `enum` to classify a web event. Note how both names and type information together
// specify the variant: `PageLoad != PageUnload` and `KeyPress(char) != Paste(String)`.  Each is
// different and independent.
enum WebEvent {
    // An `enum` variant may either be `unit-like`,
    PageLoad,
    PageUnload,
    // like tuple structs,
    KeyPress(char),
    Paste(String),
    // or c-like structures.
    Click { x: i64, y: i64 },
}

// A function which takes a `WebEvent` enum as an argument and returns nothing.
fn inspect(event: WebEvent) {
    match event {
        WebEvent::PageLoad => println!("page loaded"),
        WebEvent::PageUnload => println!("page unloaded"),
        // Destructure `c` from inside the `enum` variant.
        WebEvent::KeyPress(c) => println!("pressed '{}'.", c),
        WebEvent::Paste(s) => println!("pasted \"{}\".", s),
        // Destructure `Click` into `x` and `y`.
        WebEvent::Click { x, y } => {
            println!("clicked at x={}, y={}.", x, y);
        },
    }
}

fn main() {
    // Creating an enum variant requires supplying the data it carries (tuple-like or struct-like)
    let pressed = WebEvent::KeyPress('x');
    // `to_owned()` creates an owned `String` from a string slice.
    let pasted  = WebEvent::Paste("my text".to_owned());
    let click   = WebEvent::Click { x: 20, y: 80 };
    let load    = WebEvent::PageLoad;
    let unload  = WebEvent::PageUnload;
    inspect(pressed);
    inspect(pasted);
    inspect(click);
    inspect(load);
    inspect(unload);
}

Every variant of an enum is a value of the enum type, so different variants can be stored in the same array:

#[derive(Debug)]
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}
fn main() {
    let msgs: [Message; 3] = [
        Message::Quit,
        Message::Move{x:1, y:3},
        Message::ChangeColor(255,255,0)
    ];
    for msg in msgs {
        show_message(msg)
    }
}
fn show_message(msg: Message) {
    // Message derives Debug but does not implement Display, so use {:?}
    println!("{:?}", msg);
}

Variants of a field-less enum can specify an explicit discriminant expression; use Enum::Variant as i32/u32 to obtain the discriminant value:

// An attribute to hide warnings for unused code.
#![allow(dead_code)]

// enum with implicit discriminator (starts at 0)
enum Number {
    Zero,  // starts at 0 by default; an unspecified discriminant is the previous one plus 1.
    One,
    Two,
}

// enum with explicit discriminator
enum Color {
    Red = 0xff0000,
    Green = 0x00ff00,
    Blue // previous discriminant plus 1, i.e. 0x00ff01
}

fn main() {
    // `enums` can be cast as integers.
    println!("zero is {}", Number::Zero as i32);
    println!("one is {}", Number::One as i32);
    println!("roses are #{:06x}", Color::Red as i32);
    println!("violets are #{:06x}", Color::Blue as i32);
}

If an enum name is too long, a type alias can shorten it:

enum VeryVerboseEnumOfThingsToDoWithNumbers {
    Add,
    Subtract,
}

// Creates a type alias
type Operations = VeryVerboseEnumOfThingsToDoWithNumbers;

fn main() {
    // We can refer to each variant via its alias, not its long and inconvenient name.
    let x = Operations::Add; // access the variant through the shorter type alias
}

// The most common case: Self inside an impl block is itself a type alias
enum VeryVerboseEnumOfThingsToDoWithNumbers {
    Add,
    Subtract,
}
impl VeryVerboseEnumOfThingsToDoWithNumbers {
    fn run(&self, x: i32, y: i32) -> i32 {
        match self {
            Self::Add => x + y,
            Self::Subtract => x - y,
        }
    }
}

Variants of an enum can be imported with use, either selectively or all at once, so that the Enum:: prefix of Enum::Variant does not have to be written every time:

#![allow(dead_code)]
enum Status {
    Rich,
    Poor,
}
enum Work {
    Civilian,
    Soldier,
}

fn main() {
    // Explicitly `use` each name so they are available without manual scoping.
    use crate::Status::{Poor, Rich};
    // Automatically `use` each name inside `Work`.
    use crate::Work::*;

    // Equivalent to `Status::Poor`.
    let status = Poor;
    // Equivalent to `Work::Civilian`.
    let work = Civilian;

    match status {
        // Note the lack of scoping because of the explicit `use` above.
        Rich => println!("The rich have lots of money!"),
        Poor => println!("The poor have no money..."),
    }

    match work {
        // Note again the lack of scoping.
        Civilian => println!("Civilians work!"),
        Soldier  => println!("Soldiers fight!"),
    }
}

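For an enum, pub only needs to be specified on the enum as a whole; all variants inherit that visibility (a struct needs pub on each field).

Destructuring an enum: & or &mut is matched outside the variant rather than inside it: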

let x: &Option<i32> = &Some(3);

// OK: equivalent to Some(ref y); y has type &i32
if let Some(y) = x {}
// OK: & is written outside the variant; y has type i32
if let &Some(y) = x {}
// ERROR: & cannot be written inside the variant here: expected `i32`, found `&_`
if let Some(&y) = x {}

let (a, b ) = &(1, 2); // a and b are both &i32
println!("Results: {a} {b}");

let &(c, d ) = &(1, 2); // c and d are both i32
println!("Results: {c} {d}");

let (&c, d ) = &(1, 2); // ERROR
let (ref c, d ) = &(1, 2); // OK up to edition 2021 (the `ref` is redundant); the 2024 edition rejects it

// Another example
enum MyEnum {
    A { name: String, x: u8 },
    B { name: String },
}
fn a_to_b(e: &mut MyEnum) {
    if let MyEnum::A {
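        name,  // `name` is the only binding and can be used in the block below; its type is &mut String. `x: 0` matches a literal and binds nothing.
        x: 0,
    } = e {
        *e = MyEnum::B {
            name: std::mem::take(name), // take expects &mut T, and name is &mut String, so this type-checks
        }
    }
    // if let &mut MyEnum::A {
    //     name,  // ERROR: name would be a String here, but it cannot be moved out from behind &mut
    //     x: 0,
    // } = e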
}

The memory layout of an enum consists of a tag field plus enough space to hold the largest variant; the tag is what Rust uses internally to tell the variants apart.
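The effect of the tag can be observed with std::mem::size_of. The sketch below (helper name `option_sizes` is made up) also shows a case the paragraph above does not cover: when the payload has an unused bit pattern, such as the null value of a reference, the compiler stores the tag there, so Option<&T> has the same size as &T.

```rust
use std::mem::size_of;

/// Sizes in bytes of a payload type and of the `Option` wrapping it.
fn option_sizes() -> (usize, usize, usize, usize) {
    (
        size_of::<u32>(),
        size_of::<Option<u32>>(),  // every u32 bit pattern is valid, so a separate tag is needed
        size_of::<&u32>(),
        size_of::<Option<&u32>>(), // `None` is represented by the null pointer value
    )
}

fn main() {
    let (plain, optional, reference, optional_ref) = option_sizes();
    println!("u32: {plain}, Option<u32>: {optional}");
    println!("&u32: {reference}, Option<&u32>: {optional_ref}");
}
```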

12 panic/error/Option/Result
#

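panic is the simplest error-handling mechanism: it prints an error message, unwinds the stack, and finally terminates the current thread. While unwinding, Rust walks back up the call stack and drops all live values and resources.

  • If the main thread panics, the whole program exits; if a child thread panics, only that thread terminates and the program keeps running.
  • Run RUST_BACKTRACE=1 cargo run to print the stack backtrace;
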
fn drink(beverage: &str) {
    // You shouldn't drink too much sugary beverages.
    if beverage == "lemonade" { panic!("AAAaaaaa!!!!"); }

    println!("Some refreshing {} is all I need.", beverage);
}

fn main() {
    drink("water");
    drink("lemonade");
    drink("still water");
}
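The difference between a panic in the main thread and one in a child thread can be sketched as follows, assuming the default panic = "unwind" strategy (with panic = "abort" the whole process would terminate instead). The function name `child_panicked` is made up for this example:

```rust
use std::thread;

/// Spawns a thread that panics and reports whether `join` observed the panic.
fn child_panicked() -> bool {
    let handle = thread::spawn(|| {
        panic!("child thread failed");
    });
    // `join` returns `Err` carrying the panic payload if the thread panicked.
    handle.join().is_err()
}

fn main() {
    println!("child panicked: {}", child_panicked());
    println!("main thread is still running");
}
```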

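Note: a panic cannot unwind through a foreign (FFI) function boundary that uses a non-unwinding ABI such as extern "C"; in that case Rust does not unwind and the process is aborted instead.

Whether a panic unwinds (the default) or aborts is configurable:

  • Pass -C panic=abort/unwind to rustc (for cargo build, for example via RUSTFLAGS): rustc lemonade.rs -C panic=abort
  • Set panic in a profile in Cargo.toml (profiles can also be overridden in .cargo/config.toml):
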
[profile.dev]
opt-level = 0
debug = true
split-debuginfo = '...'  # Platform-specific.
strip = "none"
debug-assertions = true
overflow-checks = true
lto = false
panic = 'unwind'  # unwind or abort
incremental = true
codegen-units = 256
rpath = false

Code can use #[cfg(panic = "xx")] for conditional compilation and cfg!() for a conditional check inside an expression;

// Conditional compilation based on the panic strategy
#[cfg(panic = "unwind")]
fn ah() {
    println!("Spit it out!!!!");
}

#[cfg(not(panic = "unwind"))]
fn ah() {
    println!("This is not your party. Run!!!!");
}

fn drink(beverage: &str) {
    if beverage == "lemonade" {
        ah();
    } else {
        println!("Some refreshing {} is all I need.", beverage);
    }
}

fn main() {
    drink("water");
    drink("lemonade");
}


fn drink(beverage: &str) {
    // You shouldn't drink too much sugary beverages.
    if beverage == "lemonade" {
        if cfg!(panic = "abort") {
            println!("This is not your party. Run!!!!");
        } else {
            println!("Spit it out!!!!");
        }
    } else {
        println!("Some refreshing {} is all I need.", beverage);
    }
}

fn main() {
    drink("water");
    drink("lemonade");
}

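Option and Result are enum types. Both implement IntoIterator and behave like a collection of zero or one elements. The ? operator unwraps the value or returns early, and it can be used in the middle of expressions such as method-call chains.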

fn next_birthday(current_age: Option<u8>) -> Option<String> {
    // If `current_age` is `None`, this returns `None`.  If `current_age` is `Some`, the inner `u8`
    // value + 1 gets assigned to `next_age`
    let next_age: u8 = current_age? + 1; // unpack the Some value, or return None early
    Some(format!("Next year I will be {}", next_age))
}

struct Person {
    job: Option<Job>,
}

#[derive(Clone, Copy)]
struct Job {
    phone_number: Option<PhoneNumber>,
}

#[derive(Clone, Copy)]
struct PhoneNumber {
    area_code: Option<u8>,
    number: u32,
}

impl Person {
    // Gets the area code of the phone number of the person's job, if it exists.
    fn work_phone_area_code(&self) -> Option<u8> {
        // This would need many nested `match` statements without the `?` operator.
        // It would take a lot more code - try writing it yourself and see which
        // is easier.
        self.job?.phone_number?.area_code
    }
}

fn main() {
    let p = Person {
        job: Some(Job {
            phone_number: Some(PhoneNumber {
                area_code: Some(61),
                number: 439222222,
            }),
        }),
    };
    assert_eq!(p.work_phone_area_code(), Some(61));
}
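The iteration support mentioned above is mostly useful together with iterator adapters. A small sketch with two hypothetical helpers: `flatten` and `flat_map` treat each Option or Result as a sequence of zero or one items, so None and Err values simply disappear:

```rust
/// Sums only the `Some` values; `flatten` skips every `None`.
fn sum_present(values: &[Option<i32>]) -> i32 {
    values.iter().flatten().sum()
}

/// Collects the successfully parsed numbers, silently skipping the `Err` items.
fn parse_all(inputs: &[&str]) -> Vec<i32> {
    inputs.iter().flat_map(|s| s.parse::<i32>()).collect()
}

fn main() {
    println!("{}", sum_present(&[Some(1), None, Some(4)]));
    println!("{:?}", parse_all(&["1", "x", "3"]));
}
```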

12.1 Option
#

Methods of Option:

impl<T> Option<T>

pub const fn is_some(&self) -> bool
pub const fn is_none(&self) -> bool
pub fn is_some_and(self, f: impl FnOnce(T) -> bool) -> bool
let x: Option<u32> = Some(2);
assert_eq!(x.is_some_and(|x| x > 1), true);
let x: Option<u32> = Some(0);
assert_eq!(x.is_some_and(|x| x > 1), false);

// Convert &Option<T> into Option<&T>, so that later processing does not take ownership of T
pub const fn as_ref(&self) -> Option<&T>
let text: Option<String> = Some("Hello, world!".to_string());
let text_length: Option<usize> = text.as_ref().map(|s| s.len());
println!("still can print text: {text:?}");

pub fn as_mut(&mut self) -> Option<&mut T>
let mut x = Some(2);
match x.as_mut() {
    Some(v) => *v = 42,
    None => {},
}
assert_eq!(x, Some(42));

pub fn as_pin_ref(self: Pin<&Option<T>>) -> Option<Pin<&T>>
pub fn as_pin_mut(self: Pin<&mut Option<T>>) -> Option<Pin<&mut T>>

// Return a slice containing the Some value, or an empty slice for None
pub fn as_slice(&self) -> &[T]
pub fn as_mut_slice(&mut self) -> &mut [T]
assert_eq!(
    [Some(1234).as_slice(), None.as_slice()],
    [&[1234][..], &[][..]],
);

// Return the Some value; panic with msg if it is None
pub fn expect(self, msg: &str) -> T
let x = Some("value");
assert_eq!(x.expect("fruits are healthy"), "value");

// Return the Some value; panic if it is None
pub fn unwrap(self) -> T

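// Return the Some value; for None, fall back to the given default, the closure's result, or T::default()
pub fn unwrap_or(self, default: T) -> T
pub fn unwrap_or_else<F>(self, f: F) -> T where F: FnOnce() -> T
pub fn unwrap_or_default(self) -> T where T: Default
// Return the Some value without checking; calling this on None is undefined behavior
pub unsafe fn unwrap_unchecked(self) -> T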
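A brief sketch of the three safe fallbacks; the function names and the default values are invented for the illustration:

```rust
/// Falls back to a fixed default when no port was configured.
fn port_or_default(port: Option<u16>) -> u16 {
    port.unwrap_or(8080)
}

/// Computes the fallback lazily; the closure only runs for `None`.
fn name_or_generated(name: Option<String>, id: u32) -> String {
    name.unwrap_or_else(|| format!("user-{id}"))
}

/// Uses `Default::default()` (0 for integers) as the fallback.
fn count_or_zero(count: Option<usize>) -> usize {
    count.unwrap_or_default()
}

fn main() {
    println!("{}", port_or_default(None));
    println!("{}", name_or_generated(None, 7));
    println!("{}", count_or_zero(Some(3)));
}
```

unwrap_or evaluates its argument eagerly, so unwrap_or_else is usually the better choice when building the default is expensive.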

// Convert Option<T> into Option<U> by applying f to the Some value; None stays None
pub fn map<U, F>(self, f: F) -> Option<U> where F: FnOnce(T) -> U
let maybe_some_string = Some(String::from("Hello, World!"));
let maybe_some_len = maybe_some_string.map(|s| s.len());
assert_eq!(maybe_some_len, Some(13));
let x: Option<&str> = None;
assert_eq!(x.map(|s| s.len()), None);

pub fn inspect<F>(self, f: F) -> Option<T> where F: FnOnce(&T)
let v = vec![1, 2, 3, 4, 5];
let x: Option<&usize> = v.get(3).inspect(|x| println!("got: {x}")); // prints "got: 4"
let x: Option<&usize> = v.get(5).inspect(|x| println!("got: {x}")); // prints nothing

// Return the default value for None; otherwise apply f to the Some value
pub fn map_or<U, F>(self, default: U, f: F) -> U where F: FnOnce(T) -> U
pub fn map_or_else<U, D, F>(self, default: D, f: F) -> U where D: FnOnce() -> U, F: FnOnce(T) -> U
let x = Some("foo");
assert_eq!(x.map_or(42, |v| v.len()), 3);
let x: Option<&str> = None;
assert_eq!(x.map_or(42, |v| v.len()), 42);

// Convert an Option into a Result: Some(v) -> Ok(v), None -> Err(err)
pub fn ok_or<E>(self, err: E) -> Result<T, E>
pub fn ok_or_else<E, F>(self, err: F) -> Result<T, E> where F: FnOnce() -> E
let x = Some("foo");
assert_eq!(x.ok_or(0), Ok("foo"));
let x: Option<&str> = None;
assert_eq!(x.ok_or(0), Err(0));

pub fn as_deref(&self) -> Option<&<T as Deref>::Target> where T: Deref
pub fn as_deref_mut(&mut self) -> Option<&mut <T as Deref>::Target> where T: DerefMut
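as_deref and as_deref_mut are listed above without an example. A minimal sketch with made-up helper names: for an Option<String>, they yield Option<&str> and Option<&mut str>, going through the Deref target of String:

```rust
/// Borrows an `Option<String>` as `Option<&str>` without cloning or consuming it.
fn display_name(nickname: &Option<String>) -> &str {
    nickname.as_deref().unwrap_or("anonymous")
}

/// Upper-cases the contained string in place through `Option<&mut str>`.
fn shout(text: &mut Option<String>) {
    if let Some(s) = text.as_deref_mut() {
        s.make_ascii_uppercase();
    }
}

fn main() {
    let mut nick = Some(String::from("bob"));
    println!("{}", display_name(&nick));
    shout(&mut nick);
    println!("{:?}", nick);
}
```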

pub fn iter(&self) -> Iter<'_, T>
pub fn iter_mut(&mut self) -> IterMut<'_, T>
let x = Some(4);
assert_eq!(x.iter().next(), Some(&4));
let x: Option<u32> = None;
assert_eq!(x.iter().next(), None);

// Return None if self is None; otherwise return optb
pub fn and<U>(self, optb: Option<U>) -> Option<U>
let x = Some(2);
let y: Option<&str> = None;
assert_eq!(x.and(y), None);
let x: Option<u32> = None;
let y = Some("foo");
assert_eq!(x.and(y), None);
let x = Some(2);
let y = Some("foo");
assert_eq!(x.and(y), Some("foo"));
let x: Option<u32> = None;
let y: Option<&str> = None;
assert_eq!(x.and(y), None);

// Return None if self is None; otherwise return the result of calling f with the Some value
pub fn and_then<U, F>(self, f: F) -> Option<U> where F: FnOnce(T) -> Option<U>
fn sq_then_to_string(x: u32) -> Option<String> {
    x.checked_mul(x).map(|sq| sq.to_string())
}
assert_eq!(Some(2).and_then(sq_then_to_string), Some(4.to_string()));
assert_eq!(Some(1_000_000).and_then(sq_then_to_string), None); // overflowed!
assert_eq!(None.and_then(sq_then_to_string), None);

// Return None if self is None; otherwise keep the Some value only if predicate returns true
pub fn filter<P>(self, predicate: P) -> Option<T> where P: FnOnce(&T) -> bool
fn is_even(n: &i32) -> bool {
    n % 2 == 0
}
assert_eq!(None.filter(is_even), None);
assert_eq!(Some(3).filter(is_even), None);
assert_eq!(Some(4).filter(is_even), Some(4));

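// or: return self if it is Some, otherwise optb; or_else computes the fallback lazily;
// xor returns Some only if exactly one of self and optb is Some
pub fn or(self, optb: Option<T>) -> Option<T>
pub fn or_else<F>(self, f: F) -> Option<T> where F: FnOnce() -> Option<T>
pub fn xor(self, optb: Option<T>) -> Option<T>
let x = Some(2);
let y = None;
assert_eq!(x.or(y), Some(2));
let x = None;
let y = Some(100);
assert_eq!(x.or(y), Some(100));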
let x = Some(2);
let y = Some(100);
assert_eq!(x.or(y), Some(2));
let x: Option<u32> = None;
let y = None;
assert_eq!(x.or(y), None);

// Insert value into the Option and return a &mut to it; the previous value, if any, is dropped
pub fn insert(&mut self, value: T) -> &mut T
let mut opt = None;
let val = opt.insert(1);
assert_eq!(*val, 1);
assert_eq!(opt.unwrap(), 1);
let val = opt.insert(2);
assert_eq!(*val, 2);
*val = 3;
assert_eq!(opt.unwrap(), 3);

// Return a &mut to the Some value; if None, insert value first and return a &mut to it
pub fn get_or_insert(&mut self, value: T) -> &mut T
pub fn get_or_insert_default(&mut self) -> &mut T where T: Default
pub fn get_or_insert_with<F>(&mut self, f: F) -> &mut T where F: FnOnce() -> T
let mut x = None;
{
    let y: &mut u32 = x.get_or_insert(5);
    assert_eq!(y, &5);
    *y = 7;
}
assert_eq!(x, Some(7));

// Take the value out of self, leaving None in its place
pub fn take(&mut self) -> Option<T>
let mut x = Some(2);
let y = x.take();
assert_eq!(x, None);
assert_eq!(y, Some(2));
let mut x: Option<u32> = None;
let y = x.take();
assert_eq!(x, None);
assert_eq!(y, None);

// Take the Some value and set self to None, but only if predicate returns true
pub fn take_if<P>(&mut self, predicate: P) -> Option<T> where P: FnOnce(&mut T) -> bool

// Replace the value in self with value and return the previous value
pub fn replace(&mut self, value: T) -> Option<T>
let mut x = Some(2);
let old = x.replace(5);
assert_eq!(x, Some(5));
assert_eq!(old, Some(2));
let mut x = None;
let old = x.replace(3);
assert_eq!(x, Some(3));
assert_eq!(old, None);

// zip: if self is Some(s) and other is Some(o), return Some((s, o)); otherwise None
pub fn zip<U>(self, other: Option<U>) -> Option<(T, U)>
// zip_with (unstable, nightly-only): like zip, but returns Some(f(s, o))
pub fn zip_with<U, F, R>(self, other: Option<U>, f: F) -> Option<R> where F: FnOnce(T, U) -> R
let x = Some(1);
let y = Some("hi");
let z = None::<u8>;
assert_eq!(x.zip(y), Some((1, "hi")));
assert_eq!(x.zip(z), None);

// Produce an Option<T> from an Option<&T> by copying or cloning the referenced value
impl<T> Option<&T>
pub fn copied(self) -> Option<T> where T: Copy
pub fn cloned(self) -> Option<T> where T: Clone
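copied and cloned are typically applied to the Option<&T> returned by lookups such as slice::first or slice::last. A small sketch with hypothetical helper names:

```rust
/// `slice::first` returns `Option<&i32>`; `copied` turns it into `Option<i32>`.
fn first_value(values: &[i32]) -> Option<i32> {
    values.first().copied()
}

/// `cloned` does the same for `Clone` types such as `String`.
fn last_word(words: &[String]) -> Option<String> {
    words.last().cloned()
}

fn main() {
    println!("{:?}", first_value(&[3, 4]));
    println!("{:?}", last_word(&[String::from("a"), String::from("b")]));
}
```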

12.2 Result
#

Result supports combinators such as map/and_then:

use std::num::ParseIntError;

// As with `Option`, we can use combinators such as `map()`.  This function is otherwise identical
// to the one above and reads: Multiply if both values can be parsed from str, otherwise pass on the
// error.
fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    first_number_str.parse::<i32>().and_then(|first_number| {
        second_number_str.parse::<i32>().map(|second_number| first_number * second_number)
    })
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    // This still presents a reasonable answer.
    let twenty = multiply("10", "2");
    print(twenty);

    // The following now provides a much more helpful error message.
    let tt = multiply("t", "2");
    print(tt);
}

A match expression can return early with Err(e):

use std::num::ParseIntError;

fn multiply(first_number_str: &str, second_number_str: &str) -> Result<i32, ParseIntError> {
    let first_number = match first_number_str.parse::<i32>() {
        Ok(first_number)  => first_number,
        Err(e) => return Err(e),
    };

    let second_number = match second_number_str.parse::<i32>() {
        Ok(second_number)  => second_number,
        Err(e) => return Err(e),
    };

    Ok(first_number * second_number)
}

fn print(result: Result<i32, ParseIntError>) {
    match result {
        Ok(n)  => println!("n is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    print(multiply("10", "2"));
    print(multiply("t", "2"));
}

Result alias: a type alias shortens signatures by fixing the error type:

use std::fmt;

type Result<T> = std::result::Result<T, DoubleError>;

// Define our error types. These may be customized for our error handling cases.
// Now we will be able to write our own errors, defer to an underlying error
// implementation, or do something in between.
#[derive(Debug, Clone)]
struct DoubleError;

// Generation of an error is completely separate from how it is displayed.
// There's no need to be concerned about cluttering complex logic with the display style.
//
// Note that we don't store any extra info about the errors. This means we can't state
// which string failed to parse without modifying our types to carry that information.
impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    vec.first()
        // Change the error to our new type.
        .ok_or(DoubleError)
        .and_then(|s| {
            s.parse::<i32>()
                // Update to the new error type here also.
                .map_err(|_| DoubleError)
                .map(|i| 2 * i)
        })
}

fn print(result: Result<i32>) {
    match result {
        Ok(n) => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

If an expression returns a Result and the value is ignored, the compiler emits a warning; assigning it with let _ = xxx silences the warning.
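A minimal sketch of this pattern, with a made-up helper `append_line`. Writing into a String through std::fmt::Write returns a fmt::Result, which is discarded on purpose here:

```rust
use std::fmt::Write;

/// Appends a line to `buf`, deliberately ignoring the returned `fmt::Result`.
fn append_line(buf: &mut String, msg: &str) {
    // Without `let _ =`, the compiler warns about an unused `Result` that must be used.
    let _ = writeln!(buf, "{msg}");
}

fn main() {
    let mut buf = String::new();
    append_line(&mut buf, "hello");
    print!("{buf}");
}
```

Discarding a Result this way also hides real failures, so it fits best where an error genuinely cannot be acted upon.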

12.3 Error
#

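The standard library provides the std::error::Error trait, and nearly all standard library error types, such as std::io::Error and std::fmt::Error, implement it. The standard library also provides From conversions from any type implementing std::error::Error to Box<dyn Error + 'a> and Box<dyn Error + Sync + Send + 'a>. With ?, any error type that implements std::error::Error can therefore be converted into Box<dyn Error + 'a>, and, if it is also Send + Sync, into Box<dyn Error + Sync + Send + 'a>: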

impl<'a, E> From<E> for Box<dyn Error + 'a> where E: Error + 'a,
impl<'a, E> From<E> for Box<dyn Error + Sync + Send + 'a> where E: Error + Send + Sync + 'a,

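Best practice: use Box<dyn std::error::Error + Send + Sync + 'static> as the error type returned from functions:

  • Adding Send + Sync + 'static lets the trait object be returned across threads, for example from a task spawned in an async runtime;
  • Standard library error types implement std::error::Error, so they convert into the trait object above through the From implementations;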
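The conversion can be sketched as follows. The alias name `BoxError` and the function `parse_and_double` are assumptions made for this example, not names from the standard library:

```rust
use std::error::Error;

// Hypothetical alias name, used only in this sketch.
type BoxError = Box<dyn Error + Send + Sync + 'static>;

fn parse_and_double(input: &str) -> Result<i32, BoxError> {
    // `?` converts `ParseIntError` into `BoxError` through `From`.
    let n: i32 = input.trim().parse()?;
    if n < 0 {
        // A `&str` message also converts into a boxed error.
        return Err("negative input".into());
    }
    Ok(n * 2)
}

fn main() {
    println!("{:?}", parse_and_double("21"));
    if let Err(e) = parse_and_double("abc") {
        println!("error: {e}");
    }
}
```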

std::error::Error requires Debug and Display as supertraits, and every Display type gets ToString through a blanket implementation, so any error can be printed directly with println!():

println!("error querying the weather: {}", err);
println!("error querying the weather: {:?}", err);

use std::error;
use std::fmt;

// Change the alias to use `Box<dyn error::Error>`.
type Result<T> = std::result::Result<T, Box<dyn error::Error>>;

#[derive(Debug, Clone)]
struct EmptyVec;  // custom error type
impl fmt::Display for EmptyVec {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid first item to double")
    }
}
impl error::Error for EmptyVec {} // all methods of std::error::Error have default implementations, so an empty impl is enough

fn double_first(vec: Vec<&str>) -> Result<i32> {
    vec.first()
        .ok_or_else(|| EmptyVec.into()) // Converts to Box
        .and_then(|s| {
            s.parse::<i32>()
                .map_err(|e| e.into()) // Converts to Box
                .map(|i| 2 * i)
        })
}

// // The same structure as before but rather than chain all `Results`
// // and `Options` along, we `?` to get the inner value out immediately.
// fn double_first(vec: Vec<&str>) -> Result<i32> {
//     let first = vec.first().ok_or(EmptyVec)?;
//     let parsed = first.parse::<i32>()?;
//     Ok(2 * parsed)
// }


fn print(result: Result<i32>) {
    match result {
        Ok(n) => println!("The first doubled is {}", n),
        Err(e) => println!("Error: {}", e),
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

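Define a custom Error type to provide richer, more specific context and error messages:

  • implement the fmt() method of fmt::Display;
  • implement the std::error::Error trait (all of its methods have default implementations, so an empty impl is enough).
  • implement From<XX> to convert another error type XX into the custom error type. If the conversion can fail, use TryFrom<XX> instead, which returns a Result indicating whether the conversion succeeded.
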
use std::error;
use std::error::Error;
use std::num::ParseIntError;
use std::fmt;

type Result<T> = std::result::Result<T, DoubleError>;

#[derive(Debug)]
enum DoubleError {
    EmptyVec,
    // We will defer to the parse error implementation for their error.  Supplying extra info
    // requires adding more data to the type.
    Parse(ParseIntError),
}

impl fmt::Display for DoubleError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            DoubleError::EmptyVec =>
                write!(f, "please use a vector with at least one element"),
            // The wrapped error contains additional information and is available via the source()
            // method.
            DoubleError::Parse(..) =>
                write!(f, "the provided string could not be parsed as int"),
        }
    }
}

impl error::Error for DoubleError {
    fn source(&self) -> Option<&(dyn error::Error + 'static)> {
        match *self {
            DoubleError::EmptyVec => None,
            // The cause is the underlying implementation error type. Is implicitly cast to the
            // trait object `&error::Error`. This works because the underlying type already
            // implements the `Error` trait.
            DoubleError::Parse(ref e) => Some(e),
        }
    }
}

// Implement the conversion from `ParseIntError` to `DoubleError`.  This will be automatically
// called by `?` if a `ParseIntError` needs to be converted into a `DoubleError`.
impl From<ParseIntError> for DoubleError {
    fn from(err: ParseIntError) -> DoubleError {
        DoubleError::Parse(err)
    }
}

fn double_first(vec: Vec<&str>) -> Result<i32> {
    let first = vec.first().ok_or(DoubleError::EmptyVec)?;
    // Here we implicitly use the `ParseIntError` implementation of `From` (which we defined above)
    // in order to create a `DoubleError`.
    let parsed = first.parse::<i32>()?;

    Ok(2 * parsed)
}

fn print(result: Result<i32>) {
    match result {
        Ok(n)  => println!("The first doubled is {}", n),
        Err(e) => {
            println!("Error: {}", e);
            if let Some(source) = e.source() {
                println!("  Caused by: {}", source);
            }
        },
    }
}

fn main() {
    let numbers = vec!["42", "93", "18"];
    let empty = vec![];
    let strings = vec!["tofu", "93", "18"];

    print(double_first(numbers));
    print(double_first(empty));
    print(double_first(strings));
}

The thiserror crate provides a declarative, derive-macro-based way to implement custom Error types:

// https://github.com/Abraxas-365/langchain-rust/blob/main/src/agent/error.rs
use thiserror::Error;

use crate::{chain::ChainError, language_models::LLMError, prompt::PromptError};

#[derive(Error, Debug)]
pub enum AgentError {
    #[error("LLM error: {0}")]
    LLMError(#[from] LLMError),

    #[error("Chain error: {0}")]
    ChainError(#[from] ChainError),

    #[error("Prompt error: {0}")]
    PromptError(#[from] PromptError),

    #[error("Tool error: {0}")]
    ToolError(String),

    #[error("Missing Object On Builder: {0}")]
    MissingObject(String),

    #[error("Missing input variable: {0}")]
    MissingInputVariable(String),

    #[error("Serde json error: {0}")]
    SerdeJsonError(#[from] serde_json::Error),

    #[error("Error: {0}")]
    OtherError(String),
}

13 const/static/lazy_static!
#

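Rust has two kinds of constant items. Both can be declared in any scope (global, function, block). Their names should be written in upper case; otherwise the compiler emits a warning:

  1. const: an immutable value;
  2. static: a value with a fixed memory location; it can be made mutable with static mut, and every read or write of a static mut must happen inside unsafe;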
// Globals are declared outside all other scopes.
const THRESHOLD: i32 = 10; // global constant
static LANGUAGE: &str = "Rust"; // global static; the reference is 'static by default
// A mutable global static; it must be accessed inside unsafe code
static mut STAT_MUT: &str = "abc";

fn is_big(n: i32) -> bool {
    n > THRESHOLD
}

fn main() {
    let n = 16;

    println!("This is {}", LANGUAGE);
    println!("The threshold is {}", THRESHOLD);
    println!("{} is {}", n, if is_big(n) { "big" } else { "small" });

    // Error! Cannot modify a `const`.
    // THRESHOLD = 5;
}

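// A static can also be declared inside a function, like a static local in C: it remains valid after
// the function returns and lives for the whole program.
use std::sync::OnceLock;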
fn computation() -> &'static DeepThought {
    // n.b. static items do not call [`Drop`] on program termination, so if
    // [`DeepThought`] impls Drop, that will not be used for this instance.
    static COMPUTATION: OnceLock<DeepThought> = OnceLock::new();
    COMPUTATION.get_or_init(|| DeepThought::new())
}

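The initializer of a const/static item must be a constant expression: literals, calls to const fn, and tuple/struct/array constructors built from them.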
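A minimal sketch of what a constant initializer may contain. The helper `kib` and the type `Limits` are hypothetical names used only for illustration; they are not from the original notes.

```rust
// Hypothetical helper: a `const fn` can be evaluated at compile time.
const fn kib(n: usize) -> usize {
    n * 1024
}

// Hypothetical tuple struct: its constructor is allowed in a constant initializer.
struct Limits(usize, usize);

const BUFFER_SIZE: usize = kib(4);
static LIMITS: Limits = Limits(kib(1), kib(64));
const PAIR: (u8, &str) = (1, "one");

fn main() {
    // A const is known at compile time, so it can be used as an array length.
    let buf = [0u8; BUFFER_SIZE];
    println!("{} {} {} {:?}", buf.len(), LIMITS.0, LIMITS.1, PAIR);

    // Not allowed: calling a non-const function in a const/static initializer.
    // static NAME: String = String::from("x");
}
```

Anything that needs a non-const call has to be initialized lazily at runtime instead.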

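To work around this limitation of const/static initializers, a static can be defined with the lazy_static! macro, which accepts any initializer expression. The expression runs the first time the variable is dereferenced, and the value is stored in the variable for later use.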

use lazy_static::lazy_static;
use std::sync::Mutex;
lazy_static! {
    static ref HOSTNAME: Mutex<String> = Mutex::new(String::new()); // any initializer expression
}

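const/static items have the 'static lifetime by default and are immutable, so global objects that need to change usually use a type with interior mutability, such as Mutex or the AtomicXX types: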

const BIT1: u32 = 1 << 0;
const BIT2: u32 = 1 << 1;

const BITS: [u32; 2] = [BIT1, BIT2];
const STRING: &'static str = "bitstring";

struct BitsNStrings<'a> {
    mybits: [u32; 2],
    mystring: &'a str,
}

const BITS_N_STRINGS: BitsNStrings<'static> = BitsNStrings {
    mybits: BITS,
    mystring: STRING,
};


use std::sync::Mutex;
use std::sync::atomic::AtomicUsize;

static MY_GLOBAL: Vec<usize> = Vec::new(); // OK, but it cannot be modified.
static PACKETS_SERVED: AtomicUsize = AtomicUsize::new(0); // OK, can be modified
static HOSTNAME: Mutex<String> = Mutex::new(String::new()); // OK, can be modified
fn main() {
    let mut name =  HOSTNAME.lock().unwrap();
    name.push_str("localhost");
    println!("Results: {name}");
}
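The snippet above declares PACKETS_SERVED but never updates it. The following is a small sketch of how an atomic global can be modified through a shared reference from several threads. The helpers `serve_packet` and `serve_concurrently` are hypothetical and exist only for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

static PACKETS_SERVED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical helper: increments the global counter and returns the new value.
fn serve_packet() -> usize {
    // fetch_add returns the previous value, so add 1 to get the updated count.
    PACKETS_SERVED.fetch_add(1, Ordering::SeqCst) + 1
}

// Hypothetical helper: calls serve_packet from several threads and waits for them.
fn serve_concurrently(threads: usize, per_thread: usize) {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                for _ in 0..per_thread {
                    serve_packet();
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}

fn main() {
    serve_concurrently(4, 100);
    println!("served: {}", PACKETS_SERVED.load(Ordering::SeqCst));
}
```

No unsafe is needed here, unlike with static mut, because the atomic type performs the mutation internally.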

14 refer/borrow
#

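RAII: every resource or object in Rust has exactly one owner variable. When the owner leaves its scope, the value is dropped (its Drop implementation runs) and the resource is released.

Every Rust value has a unique owner. Ownership is transferred (moved) by assignment, by passing an argument, by returning from a function, or by storing the value in a struct/tuple or collection. After a move the original variable is uninitialized and can no longer be used, which rules out dangling pointers. Heap-allocating types such as String/Vec do not implement Copy, and user-defined struct/enum/union types do not implement Copy unless they derive or implement it explicitly. A move is both cheap and safe: only the value's stack representation is copied bitwise, the heap data is never duplicated, and the compiler rejects any later use of the moved-from variable.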

// Ownership is moved to the caller, so an object created on the stack can be returned:
fn new_person() -> Person {
    let person = Person {
        name : String::from("Hao Chen"),
        age : 44,
        sex : Sex::Male,
        email: String::from("[email protected]"),
    };
    return person;
}
fn main() {
    let p  = new_person();
}

fn create_box() {
    // Allocate an integer on the heap
    let _box1 = Box::new(3i32);
    // `_box1` is destroyed here, and memory gets freed
}

fn main() {
    // Allocate an integer on the heap
    let _box2 = Box::new(5i32);

    // A nested scope:
    {
        // Allocate an integer on the heap
        let _box3 = Box::new(4i32);

        // `_box3` is destroyed here, and memory gets freed
    }

    // Creating lots of boxes just for fun
    // There's no need to manually free memory!
    for _ in 0u32..1_000 {
        create_box();
    }

    // `_box2` is destroyed here, and memory gets freed
}

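To use an object without taking ownership of it, Rust borrows the object (shared borrow & or mutable borrow &mut) to obtain a reference. Alternatively, a reference-counted type such as Rc/Arc can be cloned, which does not copy the heap data.

The Rust borrow checker verifies the ownership and borrowing rules and reports violations as compile errors:

  1. An object can have any number of shared borrows at the same time, or exactly one mutable borrow, but not both.
  2. Only a mutable object can be mutably borrowed: &mut can be taken only from a mut binding, or reborrowed from an existing &mut;
  3. &mut T is automatically coerced into &T (a type coercion, not variance), so a &mut T value can be passed where &T is expected, but not the other way round.
    • If the parameter type is a generic with a trait bound, the &mut T to &T coercion does not happen, and a &T value must be passed explicitly.
  4. While a borrow of an object is live, the object cannot be modified or moved (the borrow "freezes" it):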
use std::str::FromStr;

fn main() {
    let mut a = 123;
    let ar = &a;
    // a = 456; // Error: cannot assign to `a` because it is borrowed (the borrow freezes `a`)
    println!("{ar}");

    let mut s = String::from_str("new string").unwrap();
    let sm = &mut s; // `s` must be a `mut` binding to be borrowed as `&mut`; while `sm` is live, `s` cannot be accessed.
    // println!("Result: {s} {sm}"); // cannot borrow `s` as immutable because it is also borrowed as mutable
    println!("Result: {sm}"); // OK

    let s2 = &mut String::from_str("new string").unwrap();
    // let sm2 = &mut s2; // `s2` is not a `mut` binding, so it cannot be borrowed as `&mut`
    s2.push_str(" abc"); // `s2` is not `mut`, but it is a `&mut` reference, so the referent can be modified
    println!("s2: {s2}");

    let s3 = s2; // `s2` is moved into `s3`, which is also a `&mut` and can modify the String
    s3.push_str(" def");
    println!("s3: {s3}");

    // let s4 = *s3; // cannot move out of `*s3` which is behind a mutable reference

    // While a `&mut T` is live, the original value cannot be accessed (neither moved nor modified):
    let mut s = String::from_str("new string").unwrap();
    let sm = &mut s; // `s` must be a `mut` binding to be borrowed as `&mut`
    // s.push_str("abc"); // Error: `s` cannot be used while `sm` is still used afterwards
    // `s` and `sm` cannot be used at the same time:
    // println!("Result: {s} {sm}"); // cannot borrow `s` as immutable because it is also borrowed as mutable
    println!("Result: {sm}"); // OK

    struct MyStruct(u8, String);
    let mut ms = MyStruct(3, "test".to_string());
    let msm = &mut ms;
    let msm2 = &mut msm.1; // a new `&mut` can be created from an existing `&mut`
    msm2.push_str(" def");
}

Moving an object can change its mutability, because the new binding owns the object:

fn main() {
    let immutable_box = Box::new(5u32);

    println!("immutable_box contains {}", immutable_box);

    // Mutability error
    //*immutable_box = 4;

    // *Move* the box, changing the ownership (and mutability)
    let mut mutable_box = immutable_box;
    println!("mutable_box contains {}", mutable_box);

    // Modify the contents of the box
    *mutable_box = 4;
    println!("mutable_box now contains {}", mutable_box);
}

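Ownership cannot be moved out through a borrow, whether shared or mutable, for example let v2 = *V. Workarounds:

  1. If the type implements Copy, *V copies the value implicitly; if it implements Clone, call .clone() explicitly;
  2. Use std::mem::replace(&mut dest, src), which writes src into dest and returns the old value of dest: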
struct Buffer {
    buffer : String,
}
struct Render {
    current_buffer : Buffer,
    next_buffer : Buffer,
}
impl Render {
    fn update_buffer(& mut self, buf : String) {
        // error[E0507]: cannot move out of `self.next_buffer` which is behind a mutable reference
        // move occurs because `self.next_buffer` has type `Buffer`, which does not implement the
        // `Copy` trait
        self.current_buffer = self.next_buffer;
        self.next_buffer = Buffer{ buffer: buf};
    }
}

// OK example: no &/&mut is involved; the object is moved directly through the owning variable p, which is allowed.
#[derive(Debug)]
struct Person {
    name: String,
    email: String,
}
fn main() {
    let mut p = Person{name: "zzz".to_string(), email: "fff".to_string()};

    let _name = p.name; // a struct allows partial moves
    println!("{} {}", _name, p.email); // fields that were not moved are still accessible
    // println!("{:?}", p); // Error: a partially moved struct cannot be used as a whole
    p.name = "Hao Chen".to_string();
    println!("{:?}", p); // OK: the moved field has been re-initialized
}

// OK example: swap the value with std::mem::replace()
impl Render {
    fn update_buffer(&mut self, buf: String) {
        self.current_buffer = std::mem::replace(&mut self.next_buffer, Buffer { buffer: buf });
    }
}

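// OK example: std::ptr::read/write can also be used to temporarily move the contents out of a
// borrowed object.
//
// ptr::read(src: *const T) -> T reads the value at src (assumed to be a valid T), creates a new
// instance by a shallow (bitwise) copy, and returns it.
//
// For a Buffer, ptr::read bitwise-copies the struct. Its buffer field is a String, which is not
// Copy, so right after the read the copy and the original share the same heap data.
unsafe {
    // the buffer (String) inside result points to the same heap region as the one inside dest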
    let result = ::std::ptr::read(dest);
    ::std::ptr::write(dest, src);
    result
}
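The fragments above can be combined into a complete program. This is a sketch that reuses the Buffer/Render types from the notes; the `Default` derive and the `take_current` method are additions made for illustration, to show std::mem::take next to std::mem::replace.

```rust
use std::mem;

#[derive(Debug, Default, PartialEq)]
struct Buffer {
    buffer: String,
}

struct Render {
    current_buffer: Buffer,
    next_buffer: Buffer,
}

impl Render {
    // Move `next_buffer` into `current_buffer` and install a fresh `next_buffer`.
    fn update_buffer(&mut self, buf: String) {
        self.current_buffer = mem::replace(&mut self.next_buffer, Buffer { buffer: buf });
    }

    // Hypothetical extra method: `mem::take` returns the old value and leaves
    // `Buffer::default()` in its place.
    fn take_current(&mut self) -> Buffer {
        mem::take(&mut self.current_buffer)
    }
}

fn main() {
    let mut render = Render {
        current_buffer: Buffer { buffer: "a".to_string() },
        next_buffer: Buffer { buffer: "b".to_string() },
    };
    render.update_buffer("c".to_string());
    println!("{:?} {:?}", render.current_buffer, render.next_buffer);

    let old = render.take_current();
    println!("{:?} {:?}", old, render.current_buffer);
}
```

Both functions are safe wrappers, so the unsafe ptr::read/ptr::write pair is rarely needed in application code.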

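Reborrow: a new reference can be borrowed from an existing &mut reference r, for example with &*r or &mut *r. r cannot be used while the reborrow is live, and becomes usable again once the reborrow ends: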

#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    fn move_to(&mut self, x: i32, y: i32) {
        self.x = x;
        self.y = y;
    }
}

fn main() {
    let mut p = Point { x: 0, y: 0 };
    let r = &mut p;
    // let p2 = *r; // Error: Point does not implement Copy, so the value cannot be moved out through *r

    let rr: &Point = &*r; // reborrow
    println!("{:?}", rr); // Reborrow ends here, NLL introduced

    // Reborrow is over, we can continue using `r` now
    r.move_to(10, 10);
    println!("{:?}", r);
}
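Reborrows also happen implicitly when a &mut reference is passed to a function whose parameter type is written as &mut T. The sketch below contrasts that with a plain let binding, which moves the reference. The functions `bump`, `bump_twice`, and `demo` are hypothetical names for this example.

```rust
fn bump(r: &mut i32) {
    *r += 1;
}

fn bump_twice(value: &mut i32) {
    // Each call implicitly reborrows, as if written `bump(&mut *value)`,
    // so `value` is still usable for the second call.
    bump(value);
    bump(value);
}

fn demo() -> i32 {
    let mut n = 0;
    let r = &mut n;
    bump(r); // implicit reborrow: `r` is not moved
    bump(&mut *r); // the same reborrow written explicitly
    let moved = r; // no type annotation: the `&mut` itself is moved
    bump(moved);
    // bump(r); // Error: use of moved value: `r`
    n
}

fn main() {
    let mut x = 10;
    bump_twice(&mut x);
    println!("{} {}", x, demo());
}
```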

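The Rust borrow checker treats a variable as the root of an ownership tree. To modify an object's members (such as struct fields), their sub-objects, and so on, methods normally take &mut self, so that the &mut reference is passed down from the root to the innermost object. "Modify" here covers three cases:

  1. modifying the object itself;
  2. modifying a member of the object;
  3. calling methods of the object or its members that change their state or internal fields.

Rust also provides interior mutability, which lets methods that take a shared borrow &self modify the internal state of objects such as Cell/RefCell/Mutex/RwLock.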

use std::sync::{Arc, Mutex};

let data = Arc::new(Mutex::new(0)); // `data` is not `mut`.
let mut guard = data.lock().unwrap(); // Mutex provides interior mutability; the MutexGuard implements DerefMut.
*guard += 1;
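For single-threaded code, the same idea can be sketched with Cell and RefCell. The `Counter` type below is a hypothetical example, not part of the original notes; its `record` method takes &self yet still updates both fields.

```rust
use std::cell::{Cell, RefCell};

// Hypothetical type used only to illustrate interior mutability.
struct Counter {
    hits: Cell<u32>,
    log: RefCell<Vec<String>>,
}

impl Counter {
    fn new() -> Self {
        Counter { hits: Cell::new(0), log: RefCell::new(Vec::new()) }
    }

    // Takes `&self`, not `&mut self`, and still changes internal state.
    fn record(&self, msg: &str) {
        // Cell: copy the value out, then store a new one.
        self.hits.set(self.hits.get() + 1);
        // RefCell: the borrow rules are checked at runtime instead of compile time.
        self.log.borrow_mut().push(msg.to_string());
    }
}

fn main() {
    let counter = Counter::new(); // not declared `mut`
    counter.record("first");
    counter.record("second");
    println!("{} {:?}", counter.hits.get(), counter.log.borrow());
}
```

Cell and RefCell are not Sync, so shared state across threads still needs Mutex, RwLock, or atomics.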

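When a struct/tuple is destructured, its fields are bound by move (the default) or by reference (with ref/ref mut):

  • ref/ref mut borrows the field: using ref/ref mut on the left-hand side is equivalent to applying &/&mut on the right-hand side, and the bound variable has type &/&mut.
  • Binding by move may partially move fields out of the struct/tuple. Those fields can no longer be accessed, but the fields that were not moved still can.
  • Containers such as Vec/array do not support partial moves of their elements; the element must implement Copy, or be taken out with std::mem::replace.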
// https://practice.course.rs/ownership/ownership.html
fn main() {
    #[derive(Debug)]
    struct Person {
        name: String,
        age: Box<u8>,
    }

    let person = Person {
        name: String::from("Alice"),
        age: Box::new(20),
    };

    // `name` is moved out of person, but `age` is referenced
    let Person { name, ref age } = person;
    println!("The person's age is {}", age);
    println!("The person's name is {}", name);

    // Error! borrow of partially moved value: `person` partial move occurs
    //println!("The person struct is {:?}", person);

    // `person` cannot be used but `person.age` can be used as it is not moved
    println!("The person's age from person struct is {}", person.age);
}

// Elements of a Vec/array do not support partial moves
struct Person { name: Option<String>, birth: i32 }
let mut composers = Vec::new();
composers.push(Person { name: Some("Palestrina".to_string()), birth: 1525 });
// Compile error: a field cannot be moved out of an indexed element
let first_name = composers[0].name;
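A few ways around this error, as a sketch. The `Person` type follows the snippet above; the helpers `sample` and `take_first_name` are hypothetical names introduced for this example.

```rust
#[derive(Debug)]
struct Person {
    name: Option<String>,
    birth: i32,
}

// Hypothetical helper that builds some test data.
fn sample() -> Vec<Person> {
    vec![
        Person { name: Some("Palestrina".to_string()), birth: 1525 },
        Person { name: Some("Dowland".to_string()), birth: 1563 },
    ]
}

// Option::take moves the value out and leaves None behind, so the element stays valid.
fn take_first_name(people: &mut Vec<Person>) -> Option<String> {
    people.get_mut(0)?.name.take()
}

fn main() {
    let mut composers = sample();

    // 1. Take the field out, leaving None in its place.
    let first_name = take_first_name(&mut composers);
    // 2. Clone the field; the element is untouched.
    let second_name = composers[1].name.clone();
    // 3. Remove the whole element from the Vec, which moves it out entirely.
    let removed = composers.swap_remove(0);

    println!("{:?} {:?} {:?} {:?}", first_name, second_name, removed, composers);
}
```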

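How the position of mut in a declaration changes its meaning:

let mut data1: Vec<i32> = vec![1, 2]; vs let data2: &mut Vec<i32> = &mut vec![1, 2];
  1. data1 is a mut variable, so its value is mutable: data1.push(2) works, and so does &mut data1;
  2. data2 is not a mut variable, so data2 itself cannot be reassigned. Its type is &mut, so the borrowed value is mutable: data2.push(2) works, but &mut data2 does not;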
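The two cases can be put side by side in a short sketch. The function `demo` and the third variable `data3`, which has mut on both the binding and the reference, are additions for illustration.

```rust
fn demo() -> (Vec<i32>, Vec<i32>, Vec<i32>) {
    // `mut` binding that owns the vector: the value can be modified and mutably borrowed.
    let mut data1: Vec<i32> = vec![1, 2];
    data1.push(3);
    let r1 = &mut data1;
    r1.push(4);

    // Immutable binding holding a `&mut`: the referent is mutable, the binding is not.
    let data2: &mut Vec<i32> = &mut vec![1, 2];
    data2.push(3);
    // data2 = &mut data1;  // Error: cannot assign twice to immutable variable `data2`
    // let rr = &mut data2; // Error: cannot borrow `data2` as mutable

    // `mut` on both sides: the binding can be re-pointed and the referent modified.
    let mut other = vec![9];
    let mut data3: &mut Vec<i32> = &mut vec![1];
    data3.push(2);
    data3 = &mut other;
    data3.push(10);

    (data1, data2.clone(), other)
}

fn main() {
    println!("{:?}", demo());
}
```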

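The dereference operator * yields the value that a reference points to. Because the compiler dereferences and borrows automatically in many places, an explicit * is rarely needed:

  1. When a member is accessed with the . operator, Rust dereferences automatically: r.field is equivalent to (*r).field;
  2. When a method is called with the . operator and its first parameter is &self or &mut self, Rust automatically borrows the object and passes the reference to the method: v.sort() is equivalent to (&mut v).sort(). This is the automatic referencing (auto-ref) of the method receiver.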
// Members accessed with . through a reference are dereferenced automatically
struct Anime { name: &'static str, bechdel_pass: bool };
let aria = Anime { name: "Aria: The Animation", bechdel_pass: true };
let anime_ref = &aria;
assert_eq!(anime_ref.name, "Aria: The Animation");
// Equivalent to the above, but with the dereference written out:
assert_eq!((*anime_ref).name, "Aria: The Animation");

// A reference to the object is created automatically when the method takes &self or &mut self
let mut v = vec![1973, 1968];
v.sort(); // implicitly borrows a mutable reference to v
(&mut v).sort(); // equivalent, but more verbose

// When a reference variable is used directly, it must be dereferenced by hand
let x = 5;
let y = &x;
assert_eq!(5, *y); // y must be dereferenced

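Other places where references are seen through automatically (including multiple levels of references):

  1. Comparison operators, so by default the referents are compared, not the references (pointers) themselves.
  2. The index operator.
  3. Arithmetic operators;
  4. Macros such as println!() and assert*! see through the reference arguments passed to them: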
struct Point { x: i32, y: i32 }
let point = Point { x: 1000, y: 729 }; let r: &Point = &point;
let rr: &&Point = &r;
let rrr: &&&Point = &rr;
assert_eq!(rrr.y, 729); // the . operator dereferences through multiple levels

let x = 10; let y = 10;
let rx = &x; let ry = &y;
let rrx = &rx; let rry = &ry;
assert!(rrx <= rry);  // comparisons also see through multiple levels of references
assert!(rrx == rry);

fn factorial(n: usize) -> usize {
    (1..n+1).product()
}
let r = &factorial(6);
assert_eq!(r + &1009, 1729); // arithmetic operators see through the references

let v1: Vec<i32> = vec![1, 2, 3];
v1.iter().map(|x| x + 1); // arithmetic operators see through the references

The operand of a borrow can be any expression, including a literal, and Rust applies type coercions automatically (see "Type coercions" in The Rust Reference):

r + &1009

let _: &i8 = &mut 42;

fn bar(_: &i8) { }
fn main() {
    bar(&mut 42);

    let x = 5;
    let y = &x;
    assert_eq!(5, *y);
    println!("Success!");
}

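Display prints the value of T for T, &T, &mut T, and Box<T> alike. &T, &mut T, and Box<T> are pointer types, and the {:p} format specifier prints their address instead of the value.

  • Rc::clone() copies no data and only increments the reference count, so the new handle points to the same address as the original.
  • To compare the addresses of references themselves, use std::ptr::eq; use {:p} to print a pointer address: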
use std::rc::Rc;

fn main() {
    let mut t = 123;
    let tp = &mut t;
    let tpp = &tp; // a shared reference can be borrowed from a &mut T variable
    println!("{:p} {:p}",  tp, tpp); // tp holds the address of t and tpp holds the address of tp, so the two differ

    let rc = Rc::new(String::from("abc"));
    let rc2 = rc.clone();

    // rc: abc, rc2: abc, rc pointer:0x600001bdc2b0, rc2 pointer 0x600001bdc2b0
    // rc and rc2 print the same address, which shows that Rc::clone did not copy the heap data.
    println!("rc: {}, rc2: {}, rc pointer:{:p}, rc2 pointer {:p}", rc, rc2, rc, rc2);
}

use std::ptr;
let five = 5;
let other_five = 5;
let five_ref = &five;
let same_five_ref = &five;
let other_five_ref = &other_five;
assert!(five_ref == same_five_ref); // comparisons see through the references, so the values are compared
assert!(five_ref == other_five_ref);
assert!(std::ptr::eq(five_ref, same_five_ref)); // std::ptr::eq compares addresses
assert!(!std::ptr::eq(five_ref, other_five_ref));
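For Rc specifically, the standard library offers Rc::ptr_eq to check whether two handles share one allocation, which avoids printing addresses with {:p}. A short sketch; the helper `shares_allocation` is a hypothetical wrapper added for this example.

```rust
use std::rc::Rc;

// Hypothetical wrapper: true when both handles point to the same allocation.
fn shares_allocation(x: &Rc<String>, y: &Rc<String>) -> bool {
    Rc::ptr_eq(x, y)
}

fn main() {
    let a = Rc::new(String::from("abc"));
    let b = Rc::clone(&a); // same allocation, reference count becomes 2
    let c = Rc::new(String::from("abc")); // equal value, separate allocation

    assert!(a == c); // `==` compares the Strings
    assert!(shares_allocation(&a, &b));
    assert!(!shares_allocation(&a, &c));
    println!("strong count of a: {}", Rc::strong_count(&a));
}
```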

15 lifetime
#

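Rust assigns a lifetime (inferred or written explicitly) to every reference: function parameters and return values, local variables, globals, struct/enum fields, and so on. Lifetimes let the borrow checker verify that every borrowed reference in the program stays valid, and report a compile error otherwise.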

let b = &'a dyn MyTrait + Send + 'static; // error: expected expression, found keyword `dyn`
let b = &'a(dyn MyTrait + Send + 'static); // error: borrow expressions cannot be annotated with lifetimes

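A lifetime is only a compile-time annotation. It has no runtime representation and cannot be used in expressions. Lifetimes express relative constraints, and the borrow checker uses the lifetime annotations to check that references are valid:

  1. <T: 'b>: every reference contained in T must outlive 'b.
    • &'b T implies T: 'b, that is, T must live at least as long as 'b.
  2. <T: Trait + 'b>: T must implement Trait, and T must outlive 'b.
  3. struct Foo<'a: 'b, 'b, T: 'b> { val1: &'a String, val2: &'a String, val3: &'b String, val4: &'b T }:
    • <'a: 'b, 'b>: 'a outlives 'b;
    • val1 and val2 share the lifetime 'a, which is at least as long as the lifetime of val3;
    • T: 'b requires the data behind val4 to live at least as long as 'b, that is, at least as long as the borrow held in val3;
    • a Foo value cannot outlive 'a or 'b;
    • mind the order and syntax of 'a and 'b above; an invalid form is <'a, 'b, 'a: 'b>;
  4. fn print_refs<'a: 'b, 'b>(x: &'a i32, y: &'b i32) -> &'b String
    • the references with lifetimes 'a and 'b must stay valid for the whole call, that is, 'a and 'b outlive the function body;
    • 'a: 'b means 'a outlives 'b, so the referent of x must live at least as long as that of y;
    • the returned reference is valid for 'b, the same lifetime as y;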
When lifetimes are used as generic parameters, they must come before the other generic parameters, for example <'a, T, T2>:

fn add_ref<'a, 'b, T>(a: &'a mut T, b: &'b T) -> &'a T
  where
      T: std::ops::Add<T, Output = T> + Copy,
  {
      *a = *a + *b;
      a // OK
      // b // Error: b has lifetime 'b, which does not match the return lifetime 'a: function was supposed to return data with lifetime `'a` but it is returning data with lifetime `'b`
  }

use std::fmt::Debug;

#[derive(Debug)]
struct Ref<'a, T: 'a>(&'a T); // equivalent to struct Ref<'a, T>(&'a T); the T: 'a bound is inferred

// `Ref` contains a reference to a generic type `T` that has an unknown lifetime `'a`. `T` is
// bounded such that any *references* in `T` must outlive `'a`. Additionally, the lifetime of `Ref`
// may not exceed `'a`.

// Here a reference to `T` is taken where `T` implements `Debug` and all *references* in `T` outlive
// `'a`. In addition, `'a` must outlive the function.
fn print_ref<'a, T>(t: &'a T) where T: Debug + 'a { // the 'a bound on T is already implied by &'a T
    println!("`print_ref`: t is {:?}", t);
}
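// A generic function that prints any value whose type implements `Debug`.
fn print<T>(t: T) where T: Debug {
    println!("`print`: t is {:?}", t);
}

fn main() {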
    let x = 7;
    let ref_x = Ref(&x);
    print_ref(&ref_x);
    print(ref_x);
}

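If a generic type takes a lifetime parameter, but the methods of a trait being implemented for it do not need that lifetime (or the compiler can infer it), the anonymous lifetime <'_> can be used: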

impl<'a> Reader for BufReader<'a> {
    // 'a is not used in the following methods
}
// can be written as :
impl Reader