Feature Name: compatible_trait
Start Date: YYYY-MM-DD
RFC PR: rust-lang/rfcs#0000
Rust Issue: rust-lang/rust#0000

Summary

This proposal introduces two traits, Compatible and TryCompatible, which allow specifying safe mem::transmute relationships and performing them ergonomically.

Motivation

Design goals and constraints

These design goals and constraints are derived from the examples below:

performance: users that do not care about performance can just deconstruct a value into its fields and use those to construct a new value without using any unsafe code. Going from one value to another in a single (or no) memcpy is the main advantage of this feature.
complete: it must be possible to encode the safety invariant using this feature, whether
- transmute::<T, U> is provably safe for all values of T, or
- transmute::<T, U> is only safe for some values of T.
sound: safe_transmute::<T, U> must not type check for a T-U pair for which a transmute is unsound.
consistent: if safe_transmute::<T, U> is safe, it must be possible to replace it with transmute::<T, U>.
ergonomic: the feature should be more ergonomic than mem::transmute. In particular, if transmute::<T, U> is provably safe, then safe_transmute::<T, U> must just work.

Examples

These examples require unsafe code today but they are safe and should work with safe_transmute and try_safe_transmute:

Example: bi-directional safe transmute

let x: [u8; 4];
let y: u32 = safe_transmute(x);
let z: [u8; 4] = safe_transmute(y);
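
For comparison, this particular round trip already has a safe spelling in today's standard library: u32::from_ne_bytes and u32::to_ne_bytes reinterpret the bytes in native order, which gives the same bits a transmute would. A minimal sketch (the helper name round_trip is illustrative and not part of this proposal):

```rust
/// Reinterprets four bytes as a `u32` and back, in native byte order.
/// `round_trip` is a hypothetical helper used only for illustration.
fn round_trip(x: [u8; 4]) -> ([u8; 4], u32) {
    let y: u32 = u32::from_ne_bytes(x); // same bits as transmute::<[u8; 4], u32>(x)
    let z: [u8; 4] = y.to_ne_bytes(); // same bits as transmute::<u32, [u8; 4]>(y)
    (z, y)
}
```

Such per-type helpers exist only for a handful of primitive pairs, which is the gap safe_transmute is meant to close.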

Example: uni-directional transmute

let x: bool = true;
let y: u8 = safe_transmute(x);

Example: try transmute

let x: u8 = 1;
let y: bool = try_safe_transmute(x)?; // Ok
let x: u8 = 42;
let y: bool = try_safe_transmute(x)?; // Err
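
Without this feature, the fallible u8 -> bool case has to be written by hand as a run-time check followed by an unsafe transmute. The following is a sketch of that pattern; the function name u8_to_bool and the choice of returning the rejected byte as the error are assumptions made for illustration:

```rust
use std::mem;

/// Checked `u8 -> bool` conversion (hypothetical helper).
/// Returns the original byte as the error when it is not a valid `bool`.
fn u8_to_bool(x: u8) -> Result<bool, u8> {
    if x <= 1 {
        // SAFETY: 0 and 1 are the only valid bit patterns of `bool`,
        // and `x` was just checked to be one of them.
        Ok(unsafe { mem::transmute::<u8, bool>(x) })
    } else {
        Err(x)
    }
}
```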

Example: slice

let a: [i32; 2] = [1, 2];
let x: &[i32] = &a;
let y: &[i32; 2] = try_safe_transmute(x)?; // Ok
let y: &[i32; 3] = try_safe_transmute(x)?; // Err
let y: &[i32; 1] = try_safe_transmute(x)?; // Ok or Err ? (leaks one element)
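
As a point of reference for the open question in the last line, the standard library's existing TryFrom conversion from &[T] to &[T; N] only succeeds when the lengths match exactly. A small sketch of that existing behavior (as_array is an illustrative helper, not part of this proposal):

```rust
use std::convert::TryFrom;

/// Converts a slice to an array reference using the standard library's
/// `TryFrom<&[T]> for &[T; N]`, which requires `x.len() == N`.
/// `as_array` is a hypothetical helper name.
fn as_array<const N: usize>(x: &[i32]) -> Option<&[i32; N]> {
    <&[i32; N]>::try_from(x).ok()
}
```

With a two-element slice, this yields a value for N = 2 and None for both N = 3 and N = 1.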

Example: `try_safe_transmute` cannot be collapsed

let x: &[u32];
let y: [u8; 3] = try_safe_transmute(*x)?; // Err(LengthErr)
let z: [bool; 3] = try_safe_transmute(y)?; // Err(InvalidValue)

Example: references

#[repr(C)] struct S { x: u8, y: u16 }
let x: &mut S;
let y: &mut u8 = safe_transmute(x);

#[repr(C)] struct S { x: u16, y: u16 }
let x: &mut S;
let y: &mut u8 = safe_transmute(x);
let y: &mut u32 = safe_transmute(x);
let y: &mut [u8; 4] = safe_transmute(x);
// note: &mut [u8] does not fit, size larger than &mut S

Example: sound "if fits" transmutes

If the transmute compiles (because both types have the same size), these are sound:

struct X(u8, u8);
transmute::<[u8; 2], X>();

struct Y(u8, u16);
transmute::<Y, MaybeUninit<[u8; 4]>>();

transmute::<*mut T, *mut U>(T);

Example: unsound transmutes

These transmutes are not sound even if they compile:

transmute::<u8, bool>(); // unsound: validity

struct X(u8, MaybeUninit<u8>);
transmute::<X, [u8; 2]>(); // unsound: validity

struct Y(u8, u16);
transmute::<Y, [u8; 4]>(); // unsound: validity (due to padding)

#[repr(C)] struct Z(u8, u16);
transmute::<Z, [u8; 4]>(); // unsound: validity (due to padding)

union U { u: [u8; 4] }
transmute::<U, [u8; 4]>(); // unsound: validity (due to union validity)

#[repr(C)] struct S { x: u8, y: u16 }
transmute::<&mut [u8; 4], &mut S>(); // unsound: validity
transmute::<&mut [u8], &mut S>(); // unsound: validity
transmute::<&[u8; 4], &S>(); // unsound: validity
transmute::<&[u8], &S>(); // unsound: validity
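
The padding behind several of these cases can be observed directly with size_of. The sketch below measures how many bytes of #[repr(C)] struct Z(u8, u16) are not covered by any field; the function name is illustrative, and the result of one padding byte assumes a target where u16 has an alignment of 2, which holds on common platforms but is not guaranteed everywhere:

```rust
use std::mem::size_of;

#[allow(dead_code)]
#[repr(C)]
struct Z(u8, u16);

/// Number of bytes in `Z` that belong to no field (hypothetical helper).
/// With `repr(C)` and a 2-byte aligned `u16`, one byte is inserted
/// between the `u8` and the `u16`.
fn padding_bytes_in_z() -> usize {
    size_of::<Z>() - (size_of::<u8>() + size_of::<u16>())
}
```

Those padding bytes may be uninitialized, so they cannot be exposed as plain u8 values.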

TODO: more examples

Constraint

Guide-level explanation

Given a user defined type:

#[repr(C)]
struct Dog {
  age: u32,
}

and a value of this type dog: Dog, then all of these are safe:

let bytes: [u8; 4] = safe_transmute(dog.clone());
let also_bytes: (u8, u8, u8, u8) = safe_transmute(dog.clone());
let also_bytes: Simd<u8, 4> = safe_transmute(dog.clone());

and work if the only thing the user implements for Dog is:

unsafe impl Compatible<[u8; 4]> for Dog {}

This allows Dog to be safely transmuted into a [u8; 4], and since [u8; 4] can be safely transmuted into (u8, u8, u8, u8) and Simd<u8, 4>, Dog can be safely transmuted into these as well.
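
The direct Dog -> [u8; 4] step can be approximated on stable Rust today. The following is only a sketch under stated assumptions: Compatible here is an ordinary user-defined unsafe trait rather than a lang item, the size check happens at run time instead of at compile time, and there is no transitive resolution, so only types with a direct impl are accepted.

```rust
use std::mem::{size_of, transmute_copy, ManuallyDrop};

/// Stand-in for the proposed trait: implementors promise that every
/// valid `Self` is also a valid `U` of the same size.
unsafe trait Compatible<U> {}

/// Approximation of the proposed `safe_transmute` (direct impls only).
fn safe_transmute<T: Compatible<U>, U>(x: T) -> U {
    // The proposal rejects mismatched sizes at compile time; this
    // sketch can only check at run time.
    assert_eq!(size_of::<T>(), size_of::<U>(), "size mismatch");
    // Prevent `x` from being dropped after its bits are copied out.
    let x = ManuallyDrop::new(x);
    // SAFETY: the `Compatible` impl guarantees validity, and the sizes
    // were checked above.
    unsafe { transmute_copy::<T, U>(&*x) }
}

#[allow(dead_code)]
#[repr(C)]
struct Dog {
    age: u32,
}

// SAFETY: `Dog` is `repr(C)` with a single `u32` field and no padding.
unsafe impl Compatible<[u8; 4]> for Dog {}
```

With these definitions, safe_transmute(Dog { age }) yields the same bytes as age.to_ne_bytes().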

Changing Dog to:

#[repr(C)]
struct Dog {
  friendly: bool,
}

produces a compilation error:

error[E0XYZ]: types with different sizes cannot be compatible
 --> src/doghouse.rs:1:1
  |
1 | unsafe impl Compatible<[u8; 4]> for Dog
  |                        ^^^^^^^      ^^^
  |
  = note: source type: `Dog` (8 bits)
  = note: target type: `[u8; 4]` (32 bits)

and requires us to change the impl to, e.g.:

unsafe impl Compatible<u8> for Dog {}

Trying to transmute a u8 into a Dog also fails to compile, since the bound u8: Compatible<Dog> is not satisfied.

Changing Dog to:

struct Dog {
    friendly: bool,
    age: u32,
}

and the implementation of Compatible to

unsafe impl Compatible<[u8; 8]> for Dog {}

also fails to compile:

error[E0XYZ]: implementation of Compatible violates validity invariant
 --> src/doghouse.rs:1:1
  |
1 | unsafe impl Compatible<[u8; 8]> for Dog
  |                        ^^^^^^^      ^^^
  |
  = note: source type: `Dog` has padding bytes that can be uninitialized
  = note: target type: `[u8; 8]` does not support uninitialized bit-patterns
  = note: change target type to `MaybeUninit<[u8; 8]>` instead

and indeed, changing the impl to

unsafe impl Compatible<MaybeUninit<[u8; 8]>> for Dog {}

compiles.

For some types, whether a transmute is safe or not depends on the data contained in the type. These transmutes can be safely attempted using the try_safe_transmute API.

For example, given this definition of Dog:

struct Dog {
  friendly: bool,
}

we can implement:

unsafe impl TryCompatible<Dog> for u8 {
    type Error = ();
    fn try_compatible(&self) -> Result<(), ()> {
        if *self == 0 || *self == 1 {
            Ok(())
        } else {
            Err(())
        }
    }
}

to be able to safely transmute a u8 back to a Dog:

let dog: Dog;
let raw: u8 = safe_transmute(dog);
let dog = try_safe_transmute(raw)?;

Notice, however, that the following fails to compile:

let dog: Dog;
let raw: bool = try_safe_transmute(dog)?;
let dog = try_safe_transmute(raw)?;

Unlike Compatible, TryCompatible is not resolved through chains of impls: a direct impl of TryCompatible<U> for T must exist. The reason is that for TryCompatible the conversion chain that gets picked matters: one particular chain might not produce any errors, while a different chain might.
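
The fallible u8 -> Dog conversion can likewise be approximated on stable Rust. This is a sketch under assumptions, not the proposed implementation: TryCompatible is a plain user-defined trait here, the size check is done at run time, Dog is marked repr(transparent) so that its layout matches bool, and the Debug and PartialEq derives exist only to make the result easy to inspect.

```rust
use std::mem::{size_of, transmute_copy, ManuallyDrop};

/// Stand-in for the proposed trait: `try_compatible` must return `Ok`
/// only if the bits of `self` form a valid `U`.
unsafe trait TryCompatible<U> {
    type Error;
    fn try_compatible(&self) -> Result<(), Self::Error>;
}

/// Approximation of the proposed `try_safe_transmute`.
fn try_safe_transmute<T, U>(x: T) -> Result<U, <T as TryCompatible<U>>::Error>
where
    T: TryCompatible<U>,
{
    assert_eq!(size_of::<T>(), size_of::<U>(), "size mismatch");
    // Run the user-provided validity check first.
    x.try_compatible()?;
    let x = ManuallyDrop::new(x);
    // SAFETY: the check above passed and the sizes match.
    Ok(unsafe { transmute_copy::<T, U>(&*x) })
}

#[repr(transparent)]
#[derive(Debug, PartialEq)]
struct Dog {
    friendly: bool,
}

// SAFETY: only 0 and 1 are accepted, the valid bit patterns of `bool`.
unsafe impl TryCompatible<Dog> for u8 {
    type Error = ();
    fn try_compatible(&self) -> Result<(), ()> {
        if *self <= 1 { Ok(()) } else { Err(()) }
    }
}
```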

Reference-level explanation

Add to libcore the following:

// Trait impls
#[lang_item = "compatible_trait"]
pub unsafe trait Compatible<T> {}
pub unsafe trait TryCompatible<T> { 
    type Error;
    fn try_compatible(&self) -> Result<(), Self::Error>; 
}

// Blanket impl of TryCompatible for Compatible:
unsafe impl<U, T: Compatible<U>> TryCompatible<U> for T { 
    type Error = !; 
    fn try_compatible(&self) -> Result<(), !> {
        Ok(())
    }
}

// APIs
pub const fn safe_transmute<T, U>(x: T) -> U where T: Compatible<U> {
    unsafe { const_transmute(x) }
}
pub fn try_safe_transmute<T, U>(x: T) -> Result<U, T::Error> where T: TryCompatible<U> {
    x.try_compatible().map(|_| unsafe { transmute(x) })
}

// Manual unsafe impls for the primitive types: ints, refs, pointers, 
// Non..-types, SIMD, arrays, ...
unsafe impl Compatible<u8> for bool {}
unsafe impl TryCompatible<bool> for u8 {}

unsafe impl<T> Compatible<MaybeUninit<T>> for T {}

// impl for slices
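unsafe impl<T> TryCompatible<&[u8; {size_of::<T>()}]> for &[T]
    where T: Compatible<[u8; {size_of::<T>()}]>
{
    type Error = ();
    // Succeeds only if the slice is long enough and suitably aligned.
    fn try_compatible(&self) -> Result<(), ()> {
        if self.len() >= size_of::<T>() && self.as_ptr() as usize % align_of::<T>() == 0 {
            Ok(())
        } else {
            Err(())
        }
    }
}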

A trait bound of the form T: Compatible<U> is "satisfied" iff there is an impl of Compatible<U_0> for T and a type sequence [U_0, U_1, ..., U_N] such that for every i in the range [0, N) the query U_i: Compatible<U_{i+1}> is satisfied, and there is an impl of Compatible<U> for U_N. Notice that multiple such sequences could exist, but it suffices that one exists for the query to be satisfied.
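
One way to picture this query is as reachability in a directed graph whose nodes are types and whose edges are Compatible impls. The sketch below is only a model of that idea: types are represented as strings, the function name is_compatible is hypothetical, and nothing here describes how the trait solver would actually implement the query.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Returns true if `to` can be reached from `from` by following one or
/// more impl edges, where `(a, b)` stands for `impl Compatible<b> for a`.
fn is_compatible(impls: &[(&str, &str)], from: &str, to: &str) -> bool {
    // Build an adjacency list: source type -> directly compatible targets.
    let mut edges: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(src, dst) in impls {
        edges.entry(src).or_default().push(dst);
    }
    // Breadth-first search; finding any single chain is enough.
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = VecDeque::new();
    queue.push_back(from);
    while let Some(ty) = queue.pop_front() {
        for &next in edges.get(ty).into_iter().flatten() {
            if next == to {
                return true;
            }
            if seen.insert(next) {
                queue.push_back(next);
            }
        }
    }
    false
}
```

With the impls from the guide-level explanation, Dog reaches Simd<u8, 4> through [u8; 4], while [u8; 4] does not reach Dog because the edges are directional.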

Drawbacks

Complicates the trait system (TODO: adds a new kind of query to the trait system).

Rationale and alternatives

Why is this design the best in the space of possible designs?
What other designs have been considered and what is the rationale for not choosing them?

Given N types that can be safely transmuted to u8, and M types that can be safely transmuted from u8, this approach requires N + M trait impls for Compatible. Some alternatives discussed below require N x M trait impls.

The N x M requirement is a big ergonomic downside for users, who either need to provide N x M impls or perform sequences of transmutes. It also significantly impacts compile times.
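
To make the difference concrete, the two counts can be written out as follows. The function names are illustrative only; the formulas are the N + M and N x M figures stated above.

```rust
/// Impls needed when every type converts through a common hub type such as `u8`.
fn impls_via_hub(n: usize, m: usize) -> usize {
    n + m
}

/// Impls needed when every source/target pair needs its own impl.
fn impls_pairwise(n: usize, m: usize) -> usize {
    n * m
}
```

For N = 10 source types and M = 20 target types, this is 30 impls versus 200.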

What is the impact of not doing this?

People will either unnecessarily write unsafe code, or roll their own alternatives which suffer from the N x M problem.

Prior art

This RFC is compared with the following proposals:

Pre-RFC Safe Transmute v2

Safe Transmute v2 only requires N + M trait implementations, and it provides a very handy ::cast operation, even for fallible conversions, by not allowing users to define their own conversion error types.

This pre-RFC does not satisfy some constraints:

performance:
- some transmutes unnecessarily require run-time checks, e.g. bool -> #[repr(transparent)] struct Bool(bool) must be expressed as bool -> [u8] -> Bool, where the [u8] -> Bool step requires generating Rust code to perform a run-time check of the validity invariant. These run-time checks might be removed by backend optimizations.
- it allows "transmutes" that are not replaceable with a bare std::mem::transmute.

completeness: some transmutes are not expressible, e.g., one can't directly express bool -> Bool, and one cannot transmute, e.g., #[repr(C)] struct A(Bool, u16) to #[repr(C)] struct B(bool, u16) due to padding, even though transmute::<A, B> is safe.

consistency: some "transmutes" that are allowed or encouraged aren't actually transmutes, e.g., &[u8] -> &T, where &[u8] has a different size than &T and therefore cannot be replaced with a call to transmute.

`zerocopy` / `FromBits`

This crate and its RFC fail to satisfy the constraints in much the same way as the Pre-RFC Safe Transmute v2.

Pre-RFC FromBits/IntoBits

This pre-RFC does not satisfy some constraints:

completeness: it does not allow expressing transmutes that depend on the validity invariant for correctness, e.g., u8 -> bool

ergonomics: it suffers from the N x M problem

Unresolved questions

Should the compiler error on implementations of Compatible for types with an unspecified layout (e.g. repr(Rust))?
Not an unresolved question, but it is possible that before stabilization the list of impls of these traits for types within the standard library would increase.

Future possibilities

try_safe_transmute should become a const fn once const fns support const traits.

gnzlbg/compatible_trait.md