An LZF stream is made of 1 or more "chunks" that are simply concatenated together. Each chunk starts with a header:
|  "Z"   |  "V"   |  Type  |compressed length| original length | payload
+--------+--------+--------+--------+--------+--------+--------+=================
|01011010|01010110|0000000C|LLLLLLLL|LLLLLLLL|BBBBBBBB|BBBBBBBB| L bytes of data
+--------+--------+--------+--------+--------+--------+--------+=================
- C is a flag: 0 = chunk is not compressed, 1 = chunk is compressed.
- L is a uint16 - the length of the payload that follows the header.
- B is a uint16 (only present if C == 1) - the original uncompressed length. When C == 0 these two bytes are absent and the header is 5 bytes long.
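The header layout above can be walked with a few lines of Python. This is a minimal sketch, not a reference implementation: the name `iter_chunks` is hypothetical, and the big-endian byte order for L and B is an assumption, since the layout above does not state the byte order.

```python
import struct


def iter_chunks(data: bytes):
    """Yield (compressed, original_length, payload) for each chunk in data.

    original_length is None for uncompressed chunks, which have no B field.
    """
    pos = 0
    while pos < len(data):
        if data[pos:pos + 2] != b"ZV":
            raise ValueError(f"bad chunk magic at offset {pos}")
        if pos + 5 > len(data):
            raise ValueError("truncated chunk header")
        chunk_type = data[pos + 2]
        if chunk_type not in (0, 1):
            raise ValueError(f"unknown chunk type {chunk_type}")
        # ASSUMPTION: L and B are stored big-endian (not stated in the layout).
        (length,) = struct.unpack_from(">H", data, pos + 3)
        pos += 5
        original_length = None
        if chunk_type == 1:
            if pos + 2 > len(data):
                raise ValueError("truncated chunk header")
            (original_length,) = struct.unpack_from(">H", data, pos)
            pos += 2
        payload = data[pos:pos + length]
        if len(payload) != length:
            raise ValueError("truncated chunk payload")
        pos += length
        yield bool(chunk_type), original_length, payload
```

An uncompressed chunk carrying "abc" would then be the 8 bytes `b"ZV\x00\x00\x03abc"`.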
If C == 0, the payload is simply appended to the end of the output buffer.
Otherwise, the payload is made up of multiple segments that are concatenated together.
Each segment has a 1-3 byte header. The top 3 bits of the first header byte determine the type. There are 3 segment types:
- Literal Run
- Short Backreference
- Long Backreference
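As a small illustration of how the top 3 bits select the segment type, the sketch below maps the first header byte to one of the three types. The function name `segment_type` and the string labels are hypothetical, chosen only for this example.

```python
def segment_type(first_byte: int) -> str:
    """Classify a segment by the top 3 bits of its first header byte."""
    top = first_byte >> 5
    if top == 0:
        return "literal_run"          # 000LLLLL
    if top == 7:
        return "long_backreference"   # 111XXXXX
    return "short_backreference"      # RRRXXXXX with R in 1..6
```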
Literal Run:
| Header | payload
+--------+=========================
|000LLLLL| RunLength bytes of data
+--------+=========================
- L is a uint5
- RunLength = L + 1
Short Backreference:
+--------+--------+
|RRRXXXXX|XXXXXXXX|
+--------+--------+
- R is a uint3 between 1 and 6 (0 is used for literal run, 7 is used for long backreference)
- X is a uint13
- Offset = -(X + 1)
- RunLength = R + 2
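The two-byte header above can be unpacked with a shift and a mask. The following is a sketch under the field definitions just given; `decode_short_backreference` is a hypothetical helper name.

```python
def decode_short_backreference(header: bytes) -> tuple[int, int]:
    """Return (offset, run_length) for a 2-byte short backreference header."""
    b0, b1 = header[0], header[1]
    r = b0 >> 5                      # top 3 bits: R
    if not 1 <= r <= 6:
        raise ValueError("not a short backreference header")
    x = ((b0 & 0x1F) << 8) | b1      # 5 high bits + 8 low bits: X
    return -(x + 1), r + 2


print(decode_short_backreference(bytes([0b10100000, 0b00000010])))  # (-3, 7)
```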
Long Backreference:
+--------+--------+--------+
|111XXXXX|RRRRRRRR|XXXXXXXX|
+--------+--------+--------+
- X is a uint13 (upper 5 bits in the first byte, lower 8 bits in the third byte)
- R is a uint8
- Offset = -(X + 1)
- RunLength = R + 9
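In the three-byte form, R sits between the two halves of X, which is easy to get wrong. A minimal sketch of the unpacking, with the hypothetical helper name `decode_long_backreference`:

```python
def decode_long_backreference(header: bytes) -> tuple[int, int]:
    """Return (offset, run_length) for a 3-byte long backreference header."""
    b0, r, b2 = header[0], header[1], header[2]
    if b0 >> 5 != 7:
        raise ValueError("not a long backreference header")
    # X is split: upper 5 bits in byte 0, lower 8 bits in byte 2.
    x = ((b0 & 0x1F) << 8) | b2
    return -(x + 1), r + 9


print(decode_long_backreference(bytes([0b11100000, 0, 0])))  # (-1, 9)
```

With all X and R bits set, this gives the largest values the header can express: Offset = -8192 and RunLength = 264.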
Backreferences are decoded by looking at the current output buffer. Move |Offset| bytes back from the end of the buffer; this is the start position. Take RunLength bytes from there. If there are not enough bytes, cycle through the available bytes until you have accumulated RunLength bytes. Add the result to the end of the output buffer.
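One common way to get the cycling behaviour is to copy one byte at a time, so that bytes appended earlier in the same copy become available as source bytes. The sketch below takes that approach; `copy_backreference` is a hypothetical name, and the bounds check is an addition for safety rather than something the description above requires.

```python
def copy_backreference(output: bytearray, offset: int, run_length: int) -> None:
    """Append run_length bytes to output, starting |offset| bytes from its end."""
    start = len(output) + offset     # offset is negative
    if offset >= 0 or start < 0:
        raise ValueError("backreference points outside the output buffer")
    for i in range(run_length):
        # Reading output[start + i] after earlier appends is what makes the
        # source bytes repeat when run_length exceeds |offset|.
        output.append(output[start + i])
```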
Example: the output buffer begins as "123abc". The next segment is a short backreference with Offset = -3 and RunLength = 7. Moving back 3 bytes points us to "a", so we begin taking bytes from there. After taking 3 bytes ("abc"), we hit the end of the output buffer, so we start again from "a" and repeat until we have 7 bytes. In this case, we get "abcabca" to add to the end of the output buffer, which then holds "123abcabcabca".
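To tie the segment types together, here is a sketch of a decoder for one compressed payload, run on this example encoded as bytes: a literal run holding "123abc" (header 0x05, since RunLength = L + 1 = 6) followed by a short backreference with R = 5 and X = 2 (header 0xA0 0x02). The name `decode_payload` is hypothetical and error handling is kept minimal.

```python
def decode_payload(payload: bytes) -> bytes:
    """Decode the segments of one compressed chunk payload."""
    out = bytearray()
    pos = 0
    while pos < len(payload):
        ctrl = payload[pos]
        pos += 1
        top = ctrl >> 5
        if top == 0:
            # Literal run: copy RunLength = L + 1 bytes straight from the payload.
            run = (ctrl & 0x1F) + 1
            literal = payload[pos:pos + run]
            if len(literal) != run:
                raise ValueError("truncated literal run")
            out += literal
            pos += run
            continue
        if top == 7:
            # Long backreference: the next byte is R, RunLength = R + 9.
            run = payload[pos] + 9
            pos += 1
        else:
            # Short backreference: RunLength = R + 2.
            run = top + 2
        x = ((ctrl & 0x1F) << 8) | payload[pos]
        pos += 1
        start = len(out) - (x + 1)   # Offset = -(X + 1)
        if start < 0:
            raise ValueError("backreference points before start of output")
        for i in range(run):
            out.append(out[start + i])
    return bytes(out)


print(decode_payload(b"\x05123abc\xa0\x02"))  # b'123abcabcabca'
```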