## Rust: safe and unsafe as theorems and axioms

There is a fair amount of confusion about what `unsafe`

means in Rust, as well as debate about how one should think about it. Recently I’ve seen several blog posts like What is Rust’s unsafe?, The Temptation of Unsafe and Unsafe as a Human-Assisted Type System.

So I’ll ~~add yet more useless philosophizing~~ provide another perspective.

I’m not really attempting to explain what is considered unsafe in Rust, which is explained by the reference. Nor am I going to try to answer the question of precisely when unsafe should be used and how often.

My basic suggestion: we can think of `unsafe`

in terms of mathematical axioms and theorems. This understanding is somewhere in between actual mathematical rigour and an analogy.

### Terminology

**Unsafe block**

An unsafe block,`unsafe {}`

allows potentially unsound actions like dereferencing a raw pointer or calling an “unsafe function”. Code attempting these actions outside an unsafe block or function fails to compile. The code may (and should) be sound, but the compiler cannot verify this.**Unsafe function**

Declared with`unsafe fn`

, an unsafe function can perform all the actions that are possible in an`unsafe {}`

block, but is also unsafe to call.**Soundness**

“Unsound” code is any code triggering undefined behavior. Undefined behavior is a rather complicated concept that is important to understand especially for C programmers, but also C++ programs and anyone using the`unsafe`

keyword in Rust. This series of LLVM blog posts is helpful.Safe code should always be sound (barring compiler bugs and calls to incorrectly implemented functions containing unsafe blocks). Correct uses of

`unsafe`

are also sound. So “unsound” code is unsafe code that fails to correctly uphold the invariants safe code relies on.Other terms are used; often just “safe”. But it gets a bit consfusing to talk about things like “safe uses of unsafe”. And more importantly, it’s then ambigous to say how “doing $x is unsafe”, which could mean $x it requires

`unsafe`

and the programmer must maintainer certain invariants, or that $x always results in undefined behavior.

### Safe code

One important subgoal of programming in Rust is to proof your program is sound and free of undefined behavior. (You also want to verify that behavior is actually the behavior you want, performance is acceptable, etc.).

The glory of Rust is that you can “prove” this, as long as all your code from `main()`

down doesn’t use `unsafe`

.

```
fn main() {
let mut v = Vec::new();
v.push(42);
}
```

We can think of this as going along with a proof:

**Theorem:** `main()`

is sound

**Proof:**

- Assume
`Vec::new()`

and`Vec::push()`

are sound - If
`main()`

compiles, it follows from the compiler’s checks that`main()`

is sound

This proof is valid as long as we have corresponding proofs for `Vec::new()`

and `Vec::push()`

, and assume the compilers rules for verifying the soundness of safe code are correct. The proof is rather short since it’s core argument is simply rustc said so. We’ll consider the implications of this reasoning later.

By writing `fn main()`

without the `unsafe`

keyword, and not using an `unsafe {}`

block in the body, and assuming the code compiles, we have written a proof that the `main()`

function is sound. This applies equally to other functions we might write and call (directly or indirectly) from `main()`

.

### Functions with unsafe blocks

But… at some point you are almost certainly dependent on unsafe code. At least in the standard library.

Consider `Vec::push`

, for instance:

```
pub fn push(&mut self, value: T) {
// This will panic or abort if we would allocate > isize::MAX bytes
// or if the length increment would overflow for zero-sized types.
if self.len == self.buf.cap() {
self.reserve(1);
}
unsafe {
let end = self.as_mut_ptr().add(self.len);
ptr::write(end, value);
self.len += 1;
}
}
```

We use `unsafe {}`

when the compiler cannot prove the soundness of our code. So it becomes an axiom.

**Axiom:** `Vec::push()`

is sound

Thus our “proof” for the soundness of `main()`

relies on this axiom. This is only true if the `ptr::read`

call is correct; but the compiler isn’t able to verify this, and relies on the programmer to assert that it is.

### Unsafe functions

Now let’s instead consider an `unsafe fn`

; in particular, `Vec::set_len()`

:

```
pub unsafe fn set_len(&mut self, new_len: usize) {
debug_assert!(new_len <= self.capacity());
self.len = new_len;
}
```

This defines neither a theorem nor axiom. Generally, the fact that it’s an `unsafe fn`

suggests some uses are sound and others are not; without clarifying what uses or allowing the compiler to check this.

But the documentation does provide what could be considered an axiom:

Safety

`new_len`

must be less than or equal to`capacity()`

.- The elements at
`old_len..new_len`

must be initialized.

Or, restated:

**Axiom:** Suppose `new_len <= capacity()`

. Suppose elements at `old_len..new_len`

are initialized. Then it follows that `unsafe { v.set_len(new_len); }`

is sound.

Given this axiom, we can “prove” our use of `set_len`

is correct; but the compiler cannot verify these requirements.

### Ideal world: encoding and proving invariants in Rust

Ideally we’d be able to encode these requirements in Rust. Then our caller could provide some proof that the invariants are met, and call the function without using `unsafe {}`

. Similarly, our `set_len()`

and other `Vec`

methods could have soundness proofs, assuming the stated constraints are met.

Thus, use of `unsafe {}`

could be limited further to only cases that are correct but impossible to prove (under the provided system) or where it would be too difficult to prove.

I’m not the first one to think such a feature would be nice, but it would be rather complicated. So for now, we rely on the programmer verifying these properties themself. It’s probably not something the language will even have, but perhaps some external tool could provide some form of this for users who require it.

But without that, it’s still important to include this information in the documentation. A well-written library should provide an “axiom” of this sort in the documentation of any `unsafe fn`

(at least in the public API). Rust’s ability to encapsulate unsafely isn’t much use if it isn’t possible to determine what uses of the function are sound.

### Implicit axioms

Our programs are dependent on `unsafe`

code whether or not we realize it: at least in the standard library. As we see above with `Vec`

, the standard library contains `unsafe {}`

blocks, so our list of axioms include those defined by these usages.

Perhaps less obviously, the language itself also implicitly defines various axioms. The exact behavior doesn’t have a mathematical specification, but we could list some easily enough:

**Axiom:** It is sound to consecutively execute two sound blocks of code

**Axiom:** In safe code, the lifetime checker correctly prevents lifetime violations

**Axiom:** An if statement is sound if the condition, body, and else block are sound

And we could continue, until we have a fairly large but finite set of axioms.

A soundness bug in the standard library or compiler thus can make or program unsound. This is rare, but occasionally happens.

### Limitation: unsafety extends beyond function block

Our model is made a little less neat by a common subtly of unsafety, as mentioned in the Nomicon:

Because it relies on invariants of a struct field, this unsafe code does more than pollute a whole function: it pollutes a whole module. Generally, the only bullet-proof way to limit the scope of unsafe code is at the module boundary with privacy.

The example there is different, but applies to the implementation of `Vec`

as well. Which you might have been wondering about, if you noticed that the implementation for `set_len()`

above (unlike many uses of `unsafe fn`

) would still compile without the `unsafe`

keyword. All it does is set the `.len`

member, with fairly ordinary safe code.

But this could lead to problems in combination with other code. The `push()`

code above would be unsound if, for instance, another function had set `self.len`

to a bogus value. So even though setting `.len`

is safe, carelessly changing it could cause methods like `.push()`

to produce undefined behavior.

So my neat idea that a function can correspond one-to-one with a soundness theorem or axiom isn’t quite perfect, at least without more assumptions, since different methods of a type like `Vec`

interact in ways that could allow safe code in one to trigger undefined behavior in the unsafe code of another. But perhaps the idea is still somewhat useful.

### Limitation: not formally specified/verified

Rust does not have a formal specification, so there is no official list of axioms like this. For true mathematic rigor, we would also need a formally verified compiler proven to generate code conforming to these axioms. Computer-verified proofs can be done written and checked software like Coq, which provides an elaborate type system that can prove theorems using the Curry-Howard correspondence.

There have been efforts toward formal verification in Rust. But a full formal specification and formally verified compiler, especially one that is also performant, probably isn’t really feasible.

This doesn’t invalidate the general reasoning here, but we have to be careful not to be too bold to claim things are really “proven” in a rigorous mathematical sense.

### Conclusion

In an ideal world, a major aim of compilers is proof checking. The type checker proves that types are used correctly. The optimizer creates smaller, more efficient code and proves it’s behavior is equivalent. The lifetime checker in Rust proves the liveness of references.

In practice the mathematical rigor involved here is lower than in an ideal (and unrealistic) world. But we can still reason about these things in terms of the concept of mathematical proof.