The Arcmem heap can be partially serialized to a streamable format, here called .arc (spoken as "dot-arc"). This format is not a file format per se, but can be used as the payload of one.
Each .arc stream describes heap objects in arbitrary order, with the requirement that each object must be declared, either in full or as a forward declaration, before it can be referenced. An object may be fully declared only once, and may be forward declared at most once, and only before its full declaration appears.
By convention, the first element of a stream is considered to be the root object from which all other objects are reachable.
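The declaration-order rules above can be checked while reading a stream. The following is a minimal sketch, not part of the format itself: the class DeclarationTracker and its method names are hypothetical, and it assumes the caller has already extracted objectid and is_forward_decl from each header.

```python
class DeclarationTracker:
    """Tracks which object ids have been forward or fully declared so far."""

    def __init__(self):
        self.forward = set()  # forward declared, full declaration still pending
        self.full = set()     # fully declared

    def declare(self, objectid, is_forward_decl):
        # Nothing may follow a full declaration of the same object.
        if objectid in self.full:
            raise ValueError(f"object {objectid:#x} is already fully declared")
        if is_forward_decl:
            # At most one forward declaration per object.
            if objectid in self.forward:
                raise ValueError(f"object {objectid:#x} is already forward declared")
            self.forward.add(objectid)
        else:
            self.forward.discard(objectid)
            self.full.add(objectid)

    def can_reference(self, objectid):
        # An object may be referenced once it is declared in either form.
        return objectid in self.forward or objectid in self.full

    def pending(self):
        # Forward declarations whose full declaration has not appeared yet.
        return set(self.forward)
```

At the end of a stream, a non-empty pending() result would indicate a forward declaration that was never completed.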
The header of a heap declaration looks as follows:
{ objectid + size - 1 : u48, unused : u15, is_forward_decl : u1 } : u64
This header reserves a uniquely owned object address range, from objectid (inclusive) to objectid + size (exclusive). No two object address ranges may overlap. The objectid must be aligned to 2^ceil(log2(size)). Bits 37 .. 42 are used to encode the number of offset bits required, with 0 mapping to at least 5 bits, and 31 mapping to 36 bits. By convention, bit 42 is always set to 1.
When encoding, size - 1 is OR-ed into the zero low bits of objectid. Given the resulting value p, objectid and size can be decoded as follows:
objectid := p & -(32:u64 << ((p >> 37) & 31))
size := p - objectid + 1
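As an illustration, the encoding and decoding steps can be sketched in Python as a round trip. The function names are hypothetical. The sketch assumes the header fields are listed starting from the least significant bit, so that is_forward_decl lands in bit 63, and it requires objectid to be aligned to the full span implied by its size-class bits, since that is what the decoding mask relies on.

```python
MASK64 = (1 << 64) - 1
FORWARD_DECL_BIT = 1 << 63  # assumption: fields are listed from the least significant bit


def offset_bits(p):
    # Bits 37..41 select the number of offset bits: 0 -> 5, 31 -> 36.
    return 5 + ((p >> 37) & 31)


def encode_header(objectid, size, is_forward_decl=False):
    span = 1 << offset_bits(objectid)
    if objectid % span != 0 or not (objectid >> 42) & 1 or objectid >> 48:
        raise ValueError("objectid is misaligned, lacks bit 42, or exceeds 48 bits")
    if not 1 <= size <= span:
        raise ValueError("size does not fit the size class of objectid")
    p = objectid | (size - 1)  # size - 1 goes into the zero low bits
    return p | (FORWARD_DECL_BIT if is_forward_decl else 0)


def decode_header(header):
    is_forward_decl = bool(header >> 63)
    p = header & ((1 << 48) - 1)
    # Same as: objectid := p & -(32:u64 << ((p >> 37) & 31))
    objectid = p & (-(32 << ((p >> 37) & 31)) & MASK64)
    size = p - objectid + 1
    return objectid, size, is_forward_decl
```

For example, an objectid with size class 2 (7 offset bits) can hold sizes from 1 to 128, and decode_header(encode_header(objectid, size)) returns the original pair together with the forward-declaration flag.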
If is_forward_decl is 1, then no further data follows for this object. The full declaration of the object must then appear later in the stream. Otherwise, the unaligned data for this object follows:
pointer_bitmap : u8[size // 8]
data : u8[size]
pointer_bitmap maps linearly onto data at a ratio of 1 bit to 64 bits, indicating which 64-bit aligned values in data are in fact pointers to other objects (tag set to 1), rather than just a binary blob (tag set to 0).
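A reader might walk the bitmap as sketched below. The text does not state the bit order within a bitmap byte or the byte order of 64-bit values, so both are assumptions here: tags are read starting from the least significant bit, and values are read as little-endian. The function name pointer_slots is hypothetical.

```python
def pointer_slots(pointer_bitmap, data):
    """Yield (byte_offset, value) for every 64-bit word tagged as a pointer."""
    for word in range(len(data) // 8):
        # Assumption: tag bits are stored starting at the least significant bit.
        tag = (pointer_bitmap[word // 8] >> (word % 8)) & 1
        if tag:
            # Assumption: 64-bit values are stored little-endian.
            value = int.from_bytes(data[word * 8:word * 8 + 8], "little")
            yield word * 8, value
```

Words whose tag is 0 are skipped entirely, so blob bytes are never interpreted as addresses.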
Values tagged as pointers must either be 0:u64 or fall within the range of another previously declared object. If bit 63 is set, the value is considered a weak reference rather than a strong one.
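The pointer rules can be illustrated with a small classifier. This is a sketch under assumptions: the function name classify_pointer is hypothetical, the caller supplies the list of (objectid, size) pairs declared so far, and bit 63 is masked off before the range check.

```python
WEAK_BIT = 1 << 63


def classify_pointer(value, declared):
    """Classify a tagged value as 'null', 'weak' or 'strong'.

    declared: iterable of (objectid, size) pairs declared earlier in the stream.
    """
    if value == 0:
        return "null"
    address = value & ~WEAK_BIT  # strip the weak-reference flag
    for objectid, size in declared:
        # The owned range runs from objectid up to, but excluding, objectid + size.
        if objectid <= address < objectid + size:
            return "weak" if value & WEAK_BIT else "strong"
    raise ValueError(f"pointer {value:#x} does not target a declared object")
```

A linear scan keeps the sketch short; since object ranges never overlap, a sorted list with binary search would serve equally well for larger streams.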