The way I like to serialize data in Rust into binary formats is to let a data structure blit itself into a mutable buffer. This is a relatively composable, low level way to work that lends itself to having other abstractions built on top of it. I recently was serializing network packets, so let's make up a small packet format that illustrates how we can do this.
+-------------+-------------+
|  Tag (u16)  | Count (u16) |
+-------------+-------------+
|                           |
~        Entry (u32)        ~
|                           |
+---------------------------+
We'll serialize this out of a packet struct like this:
pub struct Packet {
    tag: u16,
    entries: Vec<u32>,
}
In order to serialize this, we need 4 + count * 4 bytes of space to work with: 4 bytes for the tag and count header, plus 4 bytes per entry.
If we were working with a Vec<u8>, we would reserve that much space and write into it, but doing so means that the user of our API has less control over allocations.
Instead, we can work with a &mut [u8] buffer.
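To make the size arithmetic and the allocation argument concrete, here is a minimal sketch. The names packet_size, fill and demo_sources are hypothetical helpers made up for this illustration; fill merely stands in for a serializer that takes a &mut [u8].

```rust
/// Bytes needed for a packet with `count` entries: a 4-byte header
/// (tag + count, two u16s) followed by `count` 4-byte entries.
fn packet_size(count: usize) -> usize {
    4 + count * 4
}

/// Hypothetical stand-in for a serializer: fills the slice with a marker
/// byte and reports how many bytes it touched.
fn fill(buffer: &mut [u8]) -> usize {
    for byte in buffer.iter_mut() {
        *byte = 0xAA;
    }
    buffer.len()
}

/// The same `&mut [u8]` parameter accepts memory from three different places,
/// so the caller decides where (and whether) to allocate.
fn demo_sources() -> (usize, usize, usize) {
    let size = packet_size(3);

    // 1. A fixed-size array on the stack: no heap allocation at all.
    let mut stack = [0u8; 64];
    let from_stack = fill(&mut stack[..size]);

    // 2. A Vec allocated on the spot.
    let mut heap = vec![0u8; size];
    let from_heap = fill(&mut heap);

    // 3. A window into a larger buffer allocated beforehand.
    let mut shared = vec![0u8; 1024];
    let from_shared = fill(&mut shared[100..100 + size]);

    (from_stack, from_heap, from_shared)
}
```

A function taking Vec<u8> would only support the second case directly.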
Now, to serialize this way, we'll work with two methods which I'll define as a trait for now:
pub trait BinarySerialize {
    fn needed_size(&self) -> usize;
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()>;
}
The needed_size() method reports the minimum buffer size, in bytes, needed to serialize that instance.
The serialize() method has a slightly more involved contract: it takes the &mut [u8] buffer that will be serialized into, and it returns a Result<usize, ()>.
If the buffer that you passed in is too small, it returns Err(()).
If serialize() succeeds, it returns Ok(size), where size is the number of bytes that ended up getting written to the buffer.
This should probably always be exactly the value that needed_size() returns.
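The contract described above can be checked mechanically. The following is only a sketch: the u8 implementation and the honors_contract helper are assumptions introduced for illustration, not part of the packet format.

```rust
pub trait BinarySerialize {
    fn needed_size(&self) -> usize;
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()>;
}

// Hypothetical one-byte implementation, only here to exercise the contract.
impl BinarySerialize for u8 {
    fn needed_size(&self) -> usize { 1 }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < self.needed_size() {
            return Err(());
        }
        buffer[0] = *self;
        Ok(self.needed_size())
    }
}

/// Returns true if `value` honors both halves of the contract:
/// Ok(needed_size) for an exactly-sized buffer, Err(()) for a short one.
fn honors_contract<T: BinarySerialize>(value: &T) -> bool {
    let needed = value.needed_size();

    // A buffer of exactly the advertised size must succeed and report that size.
    let mut exact = vec![0u8; needed];
    let fits = value.serialize(&mut exact) == Ok(needed);

    // A buffer one byte short must be rejected (not checkable when needed == 0).
    let rejects_short = if needed == 0 {
        true
    } else {
        let mut short = vec![0u8; needed - 1];
        value.serialize(&mut short) == Err(())
    };

    fits && rejects_short
}
```

A helper like this could be called from a unit test for every type that implements the trait.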
Now, for our packet here, we can implement this protocol.
I'm going to assume that we've implemented BinarySerialize for u16 and u32 (I'll do it at the bottom of the file), although in a real implementation you'd probably want to write your numbers into the buffer using a different method, such as the byteorder crate.
impl BinarySerialize for Packet {
    fn needed_size(&self) -> usize { 4 + 4 * self.entries.len() }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < self.needed_size() {
            return Err(());
        }
        self.tag.serialize(&mut buffer[0..2])?;
        // I'm ignoring the possibility that self.entries.len() overflows a u16 here...
        (self.entries.len() as u16).serialize(&mut buffer[2..4])?;
        // Each entry gets its own 4-byte chunk after the 4-byte header.
        for (into, value) in buffer[4..].chunks_mut(4).zip(self.entries.iter()) {
            value.serialize(into)?;
        }
        Ok(self.needed_size())
    }
}
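As a quick check of the byte layout, here is a self-contained sketch that serializes one packet into a stack buffer. The integer impls are stand-ins that assume big-endian output via std's to_be_bytes, and serialize_on_stack and serialize_too_small are hypothetical helper names.

```rust
pub trait BinarySerialize {
    fn needed_size(&self) -> usize;
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()>;
}

// Stand-in integer impls (assumption: big-endian, via std's to_be_bytes).
impl BinarySerialize for u16 {
    fn needed_size(&self) -> usize { 2 }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < 2 { return Err(()); }
        buffer[..2].copy_from_slice(&self.to_be_bytes());
        Ok(2)
    }
}

impl BinarySerialize for u32 {
    fn needed_size(&self) -> usize { 4 }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < 4 { return Err(()); }
        buffer[..4].copy_from_slice(&self.to_be_bytes());
        Ok(4)
    }
}

pub struct Packet {
    tag: u16,
    entries: Vec<u32>,
}

impl BinarySerialize for Packet {
    fn needed_size(&self) -> usize { 4 + 4 * self.entries.len() }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < self.needed_size() { return Err(()); }
        self.tag.serialize(&mut buffer[0..2])?;
        (self.entries.len() as u16).serialize(&mut buffer[2..4])?;
        for (into, value) in buffer[4..].chunks_mut(4).zip(self.entries.iter()) {
            value.serialize(into)?;
        }
        Ok(self.needed_size())
    }
}

/// Serializes a two-entry packet into a 16-byte stack buffer (4 bytes spare).
fn serialize_on_stack() -> (Result<usize, ()>, [u8; 16]) {
    let packet = Packet { tag: 0x0102, entries: vec![0x0A0B0C0D, 1] };
    let mut buffer = [0u8; 16];
    let written = packet.serialize(&mut buffer);
    (written, buffer)
}

/// The same packet needs 12 bytes, so an 8-byte buffer is rejected.
fn serialize_too_small() -> Result<usize, ()> {
    let packet = Packet { tag: 0x0102, entries: vec![0x0A0B0C0D, 1] };
    let mut buffer = [0u8; 8];
    packet.serialize(&mut buffer)
}
```

The returned size tells the caller which prefix of the buffer holds the packet; bytes past it are left untouched.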
Once you have this, you can serialize into any slice of bytes you have that's large enough.
This might be a shared buffer that you allocated beforehand, or it might be a Vec allocated on the spot.
We can write a higher level serialization API on top of our lower level one, like this:
impl Packet {
    /// Creates a new `Vec<u8>` with the serialized contents of `self`.
    pub fn serialize_vec(&self) -> Vec<u8> {
        let mut buffer = vec![0; self.needed_size()];
        self.serialize(&mut buffer)
            .expect("we made sure to allocate needed_size");
        buffer
    }

    /// Fills `buffer` with the serialized contents of `self`, resizing
    /// `buffer` to exactly `needed_size()` bytes (growing or truncating it
    /// as necessary).
    pub fn serialize_into_vec(&self, buffer: &mut Vec<u8>) {
        buffer.resize(self.needed_size(), 0);
        self.serialize(buffer)
            .expect("we made sure to resize to needed_size");
    }
}
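One reason to have serialize_into_vec at all is buffer reuse: a single Vec can be handed to it again and again. The sketch below assumes a condensed Packet impl that writes big-endian integers directly with std's to_be_bytes, and reuse_one_buffer is a hypothetical helper.

```rust
pub trait BinarySerialize {
    fn needed_size(&self) -> usize;
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()>;
}

pub struct Packet {
    tag: u16,
    entries: Vec<u32>,
}

// Condensed stand-in for the Packet impl: writes big-endian integers
// directly instead of going through separate u16/u32 impls.
impl BinarySerialize for Packet {
    fn needed_size(&self) -> usize { 4 + 4 * self.entries.len() }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < self.needed_size() {
            return Err(());
        }
        buffer[0..2].copy_from_slice(&self.tag.to_be_bytes());
        buffer[2..4].copy_from_slice(&(self.entries.len() as u16).to_be_bytes());
        for (into, value) in buffer[4..].chunks_mut(4).zip(self.entries.iter()) {
            into.copy_from_slice(&value.to_be_bytes());
        }
        Ok(self.needed_size())
    }
}

impl Packet {
    /// Resizes `buffer` to exactly `needed_size()` and serializes into it.
    pub fn serialize_into_vec(&self, buffer: &mut Vec<u8>) {
        buffer.resize(self.needed_size(), 0);
        self.serialize(buffer)
            .expect("we made sure to resize to needed_size");
    }
}

/// Serializes several packets through one scratch Vec. Once the Vec has
/// grown to the largest packet seen, later calls reuse its capacity.
fn reuse_one_buffer() -> Vec<Vec<u8>> {
    let packets = [
        Packet { tag: 1, entries: vec![10, 20] },
        Packet { tag: 2, entries: vec![] },
    ];
    let mut scratch = Vec::new();
    let mut frames = Vec::new();
    for packet in &packets {
        packet.serialize_into_vec(&mut scratch);
        // Cloning stands in for whatever consumes the bytes, e.g. a socket write.
        frames.push(scratch.clone());
    }
    frames
}
```

Because resize also truncates, the second, smaller packet leaves scratch at 4 bytes rather than keeping stale bytes from the first.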
NOTE: Don't do this in actual code; it's only here so that I can make a minimal working example! There are much better ways to write numbers into a buffer of bytes, ones that are guaranteed to optimize better. I haven't looked at what code this generates in the case where everything is in bounds and all of the size checks should be able to be elided.
impl BinarySerialize for u16 {
    fn needed_size(&self) -> usize { 2 }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < self.needed_size() {
            return Err(());
        }
        // Big-endian: most significant byte first.
        buffer[0] = (self >> 8) as u8;
        buffer[1] = (self >> 0) as u8;
        Ok(self.needed_size())
    }
}

impl BinarySerialize for u32 {
    fn needed_size(&self) -> usize { 4 }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        if buffer.len() < self.needed_size() {
            return Err(());
        }
        // Big-endian: most significant byte first.
        buffer[0] = (self >> 24) as u8;
        buffer[1] = (self >> 16) as u8;
        buffer[2] = (self >> 8) as u8;
        buffer[3] = (self >> 0) as u8;
        Ok(self.needed_size())
    }
}
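For comparison, here is one possible alternative that stays within the standard library, swapping the manual shifts for to_be_bytes and the explicit length check for slice::get_mut. This is a sketch of one option, not necessarily the approach the note above has in mind, and I haven't compared the generated code either.

```rust
pub trait BinarySerialize {
    fn needed_size(&self) -> usize;
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()>;
}

impl BinarySerialize for u16 {
    fn needed_size(&self) -> usize { 2 }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        // `get_mut` yields None (mapped to Err) when the buffer is too short,
        // so the bounds check and the slicing happen in one step.
        let dest = buffer.get_mut(..2).ok_or(())?;
        // `to_be_bytes` produces the same big-endian byte order as the shifts.
        dest.copy_from_slice(&self.to_be_bytes());
        Ok(2)
    }
}

impl BinarySerialize for u32 {
    fn needed_size(&self) -> usize { 4 }
    fn serialize(&self, buffer: &mut [u8]) -> Result<usize, ()> {
        let dest = buffer.get_mut(..4).ok_or(())?;
        dest.copy_from_slice(&self.to_be_bytes());
        Ok(4)
    }
}
```

Both versions write the same bytes, so the Packet implementation works unchanged with either.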