Last active
May 20, 2019 08:44
zchunk splitter proposal
yaml:
  extensions:
    - ".yaml"
    - ".yml"
  split:
    type: string-before
    separators:
      - "\0-"
      - "\0[a-zA-Z]"
  min_chunk_size: 10240
  max_chunk_size: 1048576
fedora-metadata:
  extensions:
    - "-comps-Everything.x86_64.xml"
  split:
    type: string-before
    separators:
      - "<group>"
  # Keep the min chunk size small enough that small groups still get their own chunk.
  # Merging groups into one chunk can mean the chunk hashes never match again.
  min_chunk_size: 1024
  # Set this high enough that bigger groups easily fit into a single chunk.
  max_chunk_size: 102400
sqlite3:
  file-magic:
    - "SQLite format 3"
  # min: 10 KiB, max: 100 KiB.
  min_chunk_size: 10240
  max_chunk_size: 102400
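
To make the intent of `type: string-before` concrete, here is a minimal Python sketch of how such a splitter might behave. It is not zchunk's implementation. The function name `split_string_before` is hypothetical, separators are treated as literal byte strings (so the `\0...` patterns of the `yaml` entry are not interpreted), and the hard cut at `max_chunk_size` when no separator is close enough is an assumption about how the limits would interact.

```python
from __future__ import annotations


def split_string_before(data: bytes, separators: list[bytes],
                        min_chunk_size: int, max_chunk_size: int) -> list[bytes]:
    """Cut `data` right before separator occurrences, within the size bounds.

    Hypothetical helper: literal separators only, no pattern matching.
    """
    # Collect every offset where a separator starts; these are candidate cuts.
    cut_points = set()
    for sep in separators:
        pos = data.find(sep)
        while pos != -1:
            cut_points.add(pos)
            pos = data.find(sep, pos + 1)

    chunks = []
    start = 0
    for cut in sorted(cut_points):
        # Assumption: if the next separator is too far away, cut hard at the maximum.
        while cut - start > max_chunk_size:
            chunks.append(data[start:start + max_chunk_size])
            start += max_chunk_size
        # Cut before the separator only once the pending chunk is big enough;
        # otherwise the piece is merged with whatever follows.
        if cut > start and cut - start >= min_chunk_size:
            chunks.append(data[start:cut])
            start = cut

    # Emit the tail, again honouring the maximum.
    while len(data) - start > max_chunk_size:
        chunks.append(data[start:start + max_chunk_size])
        start += max_chunk_size
    if start < len(data):
        chunks.append(data[start:])
    return chunks
```

Called with `separators=[b"<group>"]`, every chunk after the first starts at a `<group>` tag, so an unchanged group keeps the same bytes (and therefore the same hash) as long as it is at least `min_chunk_size` long. Shorter pieces get merged with their neighbour, which is the effect the comment in the `fedora-metadata` entry warns about.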