See the Elixir binary pattern docs
- Consistency between code bases
- Conciseness of patterns
- All integers are specified with the fields in `<endian>-<signed>-<size>-<unit>` order
  - Defaults are not specified (`big`, `unsigned`, `size(8)`, `unit(1)`)
  - For constant sizes, the size is appended to the end like `-4` or `-16`. No `size(4)` or `size(16)`
  - Do not use `unit()` for constant sizes less than 64-bits. E.g., `-32` instead of `-4-bytes` or `-4-unit(8)` or `-size(4)-unit(8)`
  - Do not use computations in bit sizes. E.g., `-32` instead of `-4*8`
  - When using `size(n)` for <= 64 bit fields where `n` is not a constant, stay with the default unit size of 1.
  - When using `size(n)` for larger bit fields and it's possible to use bytes, specify the size in bytes and append `-unit(8)`. An exception is when working with algorithms that prefer bit sizes, like cryptographic ones (SHA-256, AES-256, etc.)
  - Avoid using `unit(n)` where `n` is not 8.
- When using ASCII strings, do not include any specifiers. If the field is UTF-8, be explicit by appending `::utf8`.
- Use `n-bytes` when matching fixed length binary fields and `n` is a constant integer.
- Use `binary-size(n)` when matching fixed length binary fields and `n` is a variable.

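The rules above can be sketched on a single pattern. The frame layout and the `PatternStyleSketch` module are hypothetical, invented only to show each rule once; they do not come from any real protocol.

```elixir
defmodule PatternStyleSketch do
  # Hypothetical frame layout, assumed for illustration only:
  #   "PKT"    ASCII magic string, so no specifier
  #   version  8-bit unsigned big-endian integer (all defaults, nothing specified)
  #   flags    16-bit little-endian integer (<endian>-<size>)
  #   offset   32-bit signed integer (<signed>-<size>)
  #   id       4 raw bytes, constant length, so `4-bytes`
  #   len      8-bit payload length in bytes (all defaults)
  #   payload  `len` bytes, variable length, so `binary-size(len)`
  def parse(<<
        "PKT",
        version,
        flags::little-16,
        offset::signed-32,
        id::4-bytes,
        len,
        payload::binary-size(len),
        rest::binary
      >>) do
    fields = %{version: version, flags: flags, offset: offset, id: id, payload: payload}
    {:ok, fields, rest}
  end

  # Anything that does not match the layout is rejected.
  def parse(_other), do: :error
end
```

A frame built with `<<"PKT", 1, 0x0102::little-16, -5::signed-32, "ABCD", 3, "xyz", "tail">>` should parse into the five fields with `"tail"` left over as the remainder.
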
The preference for bit sizes on small integers is based on it being more common to talk about those sizes in bits rather than bytes: for example, 32-bit integers instead of 4-byte integers. Other programming languages use bits for these sizes as well, e.g., `int32_t` in C.
Once the number of bits gets to be too large, it's frequently more natural to use bytes. This is allowed, but isn't required due to notable exceptions when working with cryptographic algorithms.
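
As a rough sketch of how the size rules play out for variable and large fields, the hypothetical `LargeFieldSketch` module below shows the three cases side by side. The function names and the choice of a 256-bit digest are assumptions made for illustration.

```elixir
defmodule LargeFieldSketch do
  # Variable-width field of at most 64 bits: `bits` is a bit count,
  # so the default unit of 1 is left unspecified.
  def read_small(bits, data) when bits <= 64 do
    <<value::size(bits), rest::bitstring>> = data
    {value, rest}
  end

  # Larger variable-width field that is naturally byte-sized:
  # `n_bytes` counts bytes, so `-unit(8)` is appended.
  def read_large(n_bytes, data) do
    <<value::size(n_bytes)-unit(8), rest::binary>> = data
    {value, rest}
  end

  # Cryptographic exception: a SHA-256 digest is usually described as
  # 256 bits, so the bit size is kept even though it is larger than 64.
  def digest_to_integer(<<value::256>>), do: value
end
```

The matches are done in the function body rather than the head so that the size variable is already bound when the pattern is evaluated.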