Location: bzdata/bzbackup/bzdatacenter/
Purpose: Complete history of every backup action — file uploads, deletions, expirations, and large-file reassembly. This is the primary source for reconstructing churn and backup activity over time.
Scale: ~570 files spanning March 2020 to March 2026, totaling 15 GB and 45.8 million lines.
Encoding: UTF-8 text, tab-delimited, one record per line. No header row.
File naming: bz_done_YYYYMMDD_0.dat where the date indicates when the batch
was created. Files are created every 2-4 days during active backup periods.
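As a rough illustration of the naming convention, the batch date can be pulled out of a file name with a regular expression. This is a minimal sketch: the function name `batch_date` is hypothetical, and accepting any numeric suffix (not only `_0`) is an assumption, since only `_0` is documented here.

```python
import re
from datetime import date
from typing import Optional

# Pattern derived from the documented name bz_done_YYYYMMDD_0.dat.
# Assumption: the trailing number may be something other than 0.
_NAME_RE = re.compile(r"^bz_done_(\d{4})(\d{2})(\d{2})_\d+\.dat$")


def batch_date(filename: str) -> Optional[date]:
    """Return the batch creation date encoded in the file name, or None."""
    match = _NAME_RE.match(filename)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)
```

Sorting files by `batch_date` rather than by file system modification time keeps the batches in the order they were created.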
Each line has 14 tab-separated fields:
5 + m-- 20260101003840 4_h{HOST_GUID}_f{FILE_ID}_d20260101_m003840_c000_v0001408_t0038 u06 {FILE_ID} k0_n00000 {SHA1_HASH} {TIMESTAMP_A} {TIMESTAMP_B} - 1677 /Users/{username}/src/.../output-lib
| # | Field | Description |
|---|---|---|
| 1 | Version | Format version, always 5 in observed data |
| 2 | Action | The backup action performed (see Action Types below) |
| 3 | Flags | Three-character flag field (see Flags below) |
| 4 | Timestamp | When this action occurred: YYYYMMDDHHMMSS (GMT) |
| 5 | File descriptor | Structured ID (see File Descriptor below) |
| 6 | Upload thread | uNN where NN is the upload thread number (hex), or u-- for non-upload actions |
| 7 | File ID | Hex file ID, matches the _f component of field 5. Zero-padded to 16 hex chars |
| 8 | Chunk info | k{type}_n{chunk_number} (see Chunking below) |
| 9 | SHA-1 hash | SHA-1 of the file content. 40-char hex, or all dashes ---...--- for deletions/expirations |
| 10 | Timestamp A | Hex milliseconds since epoch — appears to be the file's creation time (or first-seen time) |
| 11 | Timestamp B | Hex milliseconds since epoch — appears to be the file's last-modified time |
| 12 | Chunk file ref | - for single-part files, or cfXXXXXXXXXXXXXXXXX referencing a chunk's file ID |
| 13 | Size | File size in bytes (decimal). For chunked uploads, this is the chunk size (e.g., 10485760 = 10 MB) |
| 14 | Path | Absolute file path |
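The field layout above can be turned into a small parser. The following is a sketch under stated assumptions, not a definitive implementation: the names `BzDoneRecord`, `parse_line`, and `_hex_ms` are hypothetical, field 4 is assumed to always hold a real timestamp, and all-dash values in fields 10 and 11 are treated as missing.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass
class BzDoneRecord:
    version: str
    action: str
    flags: str
    timestamp: datetime
    descriptor: str
    upload_thread: str
    file_id: str
    chunk_info: str
    sha1: str
    timestamp_a: Optional[datetime]
    timestamp_b: Optional[datetime]
    chunk_file_ref: str
    size: int
    path: str


def _hex_ms(value: str) -> Optional[datetime]:
    """Decode hex milliseconds since the epoch; dash placeholders give None."""
    if not value or set(value) == {"-"}:
        return None
    return _EPOCH + timedelta(milliseconds=int(value, 16))


def parse_line(line: str) -> BzDoneRecord:
    """Split one tab-delimited record into its 14 fields."""
    # maxsplit=13 keeps any tab characters inside the path (field 14) intact.
    f = line.rstrip("\r\n").split("\t", 13)
    if len(f) != 14:
        raise ValueError(f"expected 14 fields, got {len(f)}")
    # Field 4 is YYYYMMDDHHMMSS in GMT.
    ts = datetime.strptime(f[3], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)
    return BzDoneRecord(
        f[0], f[1], f[2], ts, f[4], f[5], f[6], f[7], f[8],
        _hex_ms(f[9]), _hex_ms(f[10]), f[11], int(f[12]), f[13],
    )
```

Because the files have no header row, every line can be passed to `parse_line` directly when iterating over a file opened with `encoding="utf-8"`.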
| Action | Meaning | Description |
|---|---|---|
| + | Backed up | File was uploaded (new file or updated content). Most common action. |
| = | Re-verified | File content confirmed unchanged, backup record refreshed. Appears during periodic re-verification sweeps. |
| - | Deleted | File no longer exists on disk. Backblaze records the deletion. File descriptor uses placeholder pattern r_h..._f----------------_d--------_m------_c000_v-------_t----. Size is 0. |
| x | Expired | File removed from backup (may have been deleted earlier, now purged from retention). Similar to -, but retains the original file descriptor instead of the placeholder. SHA-1 and Timestamp A use dashes. |
| ! | Reassembly | Large file chunk reassembly record. Appears after all chunks of a large file have been uploaded. Uses placeholder descriptor. Size reflects the total reassembled file size. |
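Since the action field drives any churn analysis, a simple tally per action is a useful first pass. The sketch below is illustrative only: `tally_actions` and `ACTION_NAMES` are hypothetical names, and malformed lines are skipped rather than reported.

```python
from collections import Counter
from typing import Iterable, Tuple

# Action codes as listed in the table above.
ACTION_NAMES = {
    "+": "backed up",
    "=": "re-verified",
    "-": "deleted",
    "x": "expired",
    "!": "reassembly",
}


def tally_actions(lines: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count records and sum the size field (field 13) per action code."""
    counts: Counter = Counter()
    sizes: Counter = Counter()
    for line in lines:
        fields = line.rstrip("\r\n").split("\t", 13)
        if len(fields) != 14:
            continue  # skip lines that do not have the 14 expected fields
        action = fields[1]
        counts[action] += 1
        if fields[12].isdigit():
            sizes[action] += int(fields[12])
    return counts, sizes
```

Note that the size on a ! record is the total reassembled file size, so its bytes may overlap with the chunk records it summarizes; byte totals should not be summed across actions without accounting for that.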
Three-character field observed with these values:
| Flags | Frequency | Likely meaning |
|---|---|---|
| m-- | Most common | Modified file (content changed) |
| d-- | Common | De-duplicated or data-only (content matches existing backup) |
| --- | Common | No special flags (used for deletions, expirations, chunks) |
| dp- | Rare | De-duplicated with some property flag |
| b-- | Rare | Possibly a "big file" flag |
Structured string encoding the backup operation context:
4_h{HOST_GUID}_f{FILE_ID}_d20260101_m003840_c000_v0001408_t0038
| Component | Meaning |
|---|---|
| 4 | Descriptor format version |
| h{HOST_GUID} | Host GUID (identifies this computer) |
| f{FILE_ID} | File ID (hex, unique per file path) |
| d20260101 | Date of action (YYYYMMDD) |
| m003840 | Time of action (HHMMSS) |
| c000 | Counter/sequence (usually 000) |
| v0001408 | Volume/version identifier |
| t0038 | Thread or transaction ID |
For deletions (- action), the descriptor uses a placeholder pattern:
r_h{HOST_GUID}_f----------------_d--------_m------_c000_v-------_t----
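The descriptor components can be separated with a single regular expression that also accepts the placeholder form. This is a hedged sketch: `parse_descriptor` is a hypothetical name, the character classes (hex digits for the file ID, counter, volume, and thread parts) are assumptions based on the examples above, and the host GUID is assumed not to contain the sequence `_f`.

```python
import re
from typing import Dict, Optional

_DESC_RE = re.compile(
    r"^(?P<format>[^_]+)"
    r"_h(?P<host_guid>.+?)"
    r"_f(?P<file_id>[0-9A-Fa-f]+|-+)"
    r"_d(?P<date>\d{8}|-{8})"
    r"_m(?P<time>\d{6}|-{6})"
    r"_c(?P<counter>[0-9A-Fa-f]+)"
    r"_v(?P<volume>[0-9A-Fa-f]+|-+)"
    r"_t(?P<thread>[0-9A-Fa-f]+|-+)$"
)


def parse_descriptor(text: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a file descriptor into named parts.

    All-dash placeholder parts are returned as None. Returns None when the
    text does not look like a descriptor at all.
    """
    match = _DESC_RE.match(text)
    if match is None:
        return None
    return {
        name: (None if set(value) == {"-"} else value)
        for name, value in match.groupdict().items()
    }
```

For a placeholder descriptor, the result keeps `format`, `host_guid`, and `counter` and sets the dashed parts to None, which makes deletions easy to tell apart from uploads.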
Files are either single-part or chunked:
| Pattern | Meaning |
|---|---|
| k0_n00000 | Single-part file (not chunked). The overwhelming majority of entries. |
| k5_nXXXXX | Chunk number XXXXX (hex) of a large file split into 10 MB parts |
| k1_nXXXXX | Appears in ! (reassembly) records. n is the total chunk count. |
| k-_n----- | Placeholder for deletions/expirations |
When a large file is chunked (k5):
- Field 12 contains cfXXXXXXXXXXXXXXXXX, a chunk-specific file ID
- Field 13 is the chunk size (typically 10485760 = 10 MB, except the final chunk)
- Multiple consecutive lines share the same path but have different chunk numbers
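The chunk field and the per-chunk sizes can be combined to estimate how many bytes were uploaded for each chunked file. The sketch below assumes the k/n layout described above; `parse_chunk_info` and `chunked_bytes_by_path` are hypothetical helper names, and the input is assumed to be (chunk info, size, path) tuples already extracted from fields 8, 13, and 14.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

_CHUNK_RE = re.compile(r"^k(?P<kind>[0-9A-Fa-f-])_n(?P<n>[0-9A-Fa-f]{5}|-{5})$")


def parse_chunk_info(text: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (kind, number) for field 8, decoding the number as hex.

    The number is None for the k-_n----- placeholder. Returns None when the
    text does not match the expected layout.
    """
    match = _CHUNK_RE.match(text)
    if match is None:
        return None
    raw = match.group("n")
    number = None if set(raw) == {"-"} else int(raw, 16)
    return match.group("kind"), number


def chunked_bytes_by_path(
    records: Iterable[Tuple[str, int, str]]
) -> Dict[str, int]:
    """Sum chunk sizes per path, counting only k5 (chunk) records."""
    totals: Dict[str, int] = defaultdict(int)
    for chunk_info, size, path in records:
        parsed = parse_chunk_info(chunk_info)
        if parsed is not None and parsed[0] == "5":
            totals[path] += size
    return dict(totals)
```

Comparing the summed chunk sizes for a path against the size on the matching ! record is one way to check that all chunks of a file are present.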
| Pattern | Meaning |
|---|---|
| uNN | Upload thread number (two hex digits, e.g., u06, u10, u16) |
| u-- | No upload thread (used for deletions, expirations, reassembly records) |
The backup client runs multiple upload threads in parallel. Eight or more
concurrent threads have been observed, depending on the num_backup_threads config setting.
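To see how many upload threads were actually in use, the thread field can be decoded and the distinct values collected. This is a small sketch that takes the hex interpretation above at face value; `upload_thread` and `distinct_threads` are hypothetical names.

```python
import re
from typing import Iterable, Optional, Set

_THREAD_RE = re.compile(r"u([0-9A-Fa-f]{2})")


def upload_thread(field: str) -> Optional[int]:
    """Decode field 6: uNN gives the thread number, u-- gives None."""
    if field == "u--":
        return None
    match = _THREAD_RE.fullmatch(field)
    if match is None:
        raise ValueError(f"unexpected upload thread field: {field!r}")
    return int(match.group(1), 16)  # NN is read as hexadecimal


def distinct_threads(fields: Iterable[str]) -> Set[int]:
    """Collect the distinct thread numbers seen, ignoring u-- records."""
    seen: Set[int] = set()
    for field in fields:
        number = upload_thread(field)
        if number is not None:
            seen.add(number)
    return seen
```

Under the hex reading, `upload_thread("u10")` returns 16 rather than 10, which matters when comparing observed thread numbers against num_backup_threads.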