Unsafe can't guarantee performance improvements, but you can sometimes get performance boosts using it.
It should be a last resort, for when you have a bottleneck that results from memory-safety checks.
It's also how FFI works. If you're calling a C library, you can no longer guarantee memory safety, since the library is outside of Rust's domain, so the call needs to be unsafe.
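As a minimal sketch of the FFI point: calling any function declared in an `extern "C"` block requires an unsafe block, because the compiler cannot verify what the foreign code does with memory. Here the foreign function is `abs` from the C standard library, which links without any extra setup:

```rust
// Declaration of a function provided by the C standard library.
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    // SAFETY: `abs` from libc is safe to call with any i32
    // except i32::MIN, whose result is undefined in C.
    let result = unsafe { abs(-42) };
    assert_eq!(result, 42);
    println!("{result}");
}
```

Note that the unsafety isn't about `abs` being dangerous; it's that Rust has no way to check the foreign declaration matches the actual C symbol.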
> Unsafe can't ensure performance improvements, but you can sometimes get performance boosts using it.
Maybe to back this up with a concrete example: let's say I'm parsing something and I have a &[u8] (a slice of bytes) which I have already verified contains only ASCII digits, and I want to get the numerical value of that number. I don't want to write my own number-parsing code, but std only has a parsing function for &str (UTF-8 strings), not for &[u8]. I could do
```rust
let string = str::from_utf8(bytes)?;
let value: u64 = string.parse()?;
```
But then str::from_utf8 would iterate over the byte slice again to verify that it's valid UTF-8. This check is useless because I already know the slice only contains ASCII digits, which are always valid UTF-8. So in this case, I can improve performance with an unsafe block:
```rust
// SAFETY: `bytes` was already proven to only contain ASCII digits
let string = unsafe { str::from_utf8_unchecked(bytes) };
let value: u64 = string.parse()?;
```
The performance gain comes not from unsafe per se, but from using a different function that skips the UTF-8 check. Since that function cannot guarantee on its own that str's invariant (valid UTF-8) is upheld, it's marked unsafe.
The intention is to extract as much valid &str from the given buf as possible, and only fail if there is no more valid &str to read from the buf. Unfortunately std::str::from_utf8's error does not also yield a &str corresponding to the part that did parse successfully ( * ), so I have to compute it myself. Using std::str::from_utf8 for this second pass unfortunately ends up verifying the UTF-8-ness of the buf again, as confirmed in the generated assembly, because the compiler isn't sufficiently smart to elide the redundant check. Similarly, slicing buf with valid_up_to safely also reruns the bounds check, which is why the code uses get_unchecked for that instead.
( * ) Nothing prevents it from doing that, and one of these days I might get around to PRing it.
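The valid_up_to approach described above can be sketched as follows; `valid_prefix` is a hypothetical helper name, and for simplicity this version uses an ordinary (bounds-checked) slice rather than get_unchecked:

```rust
use std::str;

/// Extract the longest valid UTF-8 prefix of `buf` as a &str.
fn valid_prefix(buf: &[u8]) -> &str {
    match str::from_utf8(buf) {
        Ok(s) => s,
        Err(e) => {
            // `valid_up_to()` is the length of the longest prefix
            // of `buf` that is valid UTF-8.
            let valid = &buf[..e.valid_up_to()];
            // SAFETY: everything before `valid_up_to()` was already
            // verified to be valid UTF-8 by `from_utf8` itself.
            unsafe { str::from_utf8_unchecked(valid) }
        }
    }
}

fn main() {
    // 0xFF is never valid in UTF-8, so parsing stops there.
    let buf = b"hello\xFFworld";
    assert_eq!(valid_prefix(buf), "hello");
    println!("{}", valid_prefix(buf));
}
```

The second `from_utf8` pass mentioned above is avoided here because the error value from the first pass already carries the proof, which is exactly why re-verifying in safe code is redundant work.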
Isn't this a bit fragile though? It's totally fine as long as you own and understand the code. But what's to stop someone new and/or inexperienced from coming along and violating the 'already proven' claim? How will you know?
Isn't there a risk of Rust code having "Don't ever change this" comments?
That's a general problem of coding, that the next edit can violate an invariant. The advantage with Rust is that, for several types of invariants, you only need to check places with an unsafe block during code review or audits. Everywhere else, the compiler upholds these invariants for you.
Why not lift the encoding to the type system? The type system could remember which strings are ASCII and which are UTF-8. More generally, why not lift such proofs to types?
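One way to do exactly that is a newtype whose constructor is the only place the invariant is checked; the name `AsciiDigits` here is a made-up illustration, not a std type:

```rust
/// A byte slice proven to contain only ASCII digits.
/// The field is private, so the only way to construct one
/// is through `new`, which checks the invariant.
struct AsciiDigits<'a>(&'a [u8]);

impl<'a> AsciiDigits<'a> {
    fn new(bytes: &'a [u8]) -> Option<Self> {
        if bytes.iter().all(|b| b.is_ascii_digit()) {
            Some(AsciiDigits(bytes))
        } else {
            None
        }
    }

    fn as_str(&self) -> &str {
        // SAFETY: the constructor only accepts ASCII digits,
        // and ASCII is always valid UTF-8.
        unsafe { std::str::from_utf8_unchecked(self.0) }
    }
}

fn main() {
    let digits = AsciiDigits::new(b"12345").expect("all ASCII digits");
    let value: u64 = digits.as_str().parse().unwrap();
    assert_eq!(value, 12345);
}
```

This addresses the "fragile comment" concern above: instead of a "don't ever change this" note, the proof lives in one constructor, and reviewers only need to audit this one unsafe block no matter how many call sites exist.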