@rozgo
Created December 17, 2014 23:20
LZMA Compression Settings
LZMA Options:
level
Description: The compression level.
Range: [0;9].
Default: 5.
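As a rough illustration of what the level does, Python's standard `lzma` module exposes the same 0 to 9 scale as `preset`. Note that this is liblzma rather than the encoder described here, and its default preset is 6, so the numbers are only indicative.

```python
import lzma

# Sample input; any bytes object works.
data = b"LZMA compression settings example. " * 2000

# Compress the same input at every preset (0 = fastest, 9 = strongest).
sizes = {level: len(lzma.compress(data, preset=level)) for level in range(10)}

for level, size in sizes.items():
    print(f"level={level}: {size} bytes")
```

On small, repetitive inputs like this one the levels may barely differ; the gap usually only shows on larger, more varied data.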
dictSize
Description: The dictionary size.
Range: [1<<12;1<<27] for 32-bit version or [1<<12;1<<30] for 64-bit version.
Default: 1<<24.
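A small sketch of why the dictionary size matters, again assuming Python's `lzma` module (`dict_size` is its name for this option). The input repeats a random 64 KiB block 576 KiB after its first occurrence, so only a dictionary larger than that distance can reference the earlier copy.

```python
import lzma
import os

def compressed_size(data, dict_size):
    """Size of data as a raw LZMA2 stream using the given dictionary size."""
    filters = [{"id": lzma.FILTER_LZMA2, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=filters))

block = os.urandom(64 * 1024)                  # incompressible on its own
data = block + os.urandom(512 * 1024) + block  # second copy is 576 KiB later

small = compressed_size(data, 1 << 16)  # 64 KiB: first copy is out of reach
large = compressed_size(data, 1 << 20)  # 1 MiB: first copy is still visible

print(f"dict 64 KiB: {small} bytes, dict 1 MiB: {large} bytes")
```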
lc
Description: The number of high bits of the previous byte to use as a context for literal encoding.
Range: [0;8].
Default: 3.
Sometimes lc = 4 gives a gain for big files.
lp
Description: The number of low bits of the dictionary position to include in literal_pos_state.
Range: [0;4].
Default: 0.
It is intended for periodic data whose period equals 2^value bytes (where lp=value). For example, for 32-bit (4-byte) periodic data you can use lp=2. It is often better to set lc=0 if you change lp.
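The lp=2 / lc=0 suggestion can be tried with a short experiment. This is a sketch using Python's `lzma` module (an assumption, not the tool this note was written for); the sample data is a made-up stream of 32-bit counters, and whether the periodic setting wins depends on the data.

```python
import lzma
import struct

# 20,000 little-endian 32-bit counters: the data has a 4-byte period.
data = b"".join(struct.pack("<I", 1_000_000 + 3 * i) for i in range(20_000))

def raw_lzma2(lc, lp):
    """Compress data as raw LZMA2 with the given literal context settings."""
    filters = [{"id": lzma.FILTER_LZMA2, "lc": lc, "lp": lp}]
    return lzma.compress(data, format=lzma.FORMAT_RAW, filters=filters)

default = raw_lzma2(lc=3, lp=0)   # default literal context
periodic = raw_lzma2(lc=0, lp=2)  # tuned for 2^2 = 4-byte records

print(f"lc=3 lp=0: {len(default)} bytes")
print(f"lc=0 lp=2: {len(periodic)} bytes")
```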
pb
Description: pb is the number of low bits of the dictionary position to include in pos_state.
Range: [0;4].
Default: 2.
It is intended for periodic data whose period equals 2^value bytes (where pb=value).
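For reference, lc, lp and pb end up packed into a single properties byte at the start of a .lzma file, computed as (pb * 5 + lp) * 9 + lc. That formula comes from the .lzma file format rather than from this note. The sketch below, assuming Python's `lzma` module, writes a file with chosen values and decodes the byte again.

```python
import lzma

def decode_props(props):
    """Split the .lzma properties byte into (lc, lp, pb)."""
    lc = props % 9
    lp = (props // 9) % 5
    pb = props // 45
    return lc, lp, pb

# FORMAT_ALONE is the legacy .lzma container; it takes one LZMA1 filter.
filters = [{"id": lzma.FILTER_LZMA1, "lc": 0, "lp": 2, "pb": 2}]
blob = lzma.compress(b"abcd" * 1000, format=lzma.FORMAT_ALONE, filters=filters)

print(decode_props(blob[0]))   # settings recovered from the file header
print(decode_props(0x5D))      # 0x5D is the byte for the defaults lc=3, lp=0, pb=2
```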
algo
Description: Sets compression mode.
Options: 0 = fast, 1 = normal.
Default: 1.
fb
Description: Sets the number of fast bytes for the LZMA encoder.
Range: [5;273].
Default: 32.
Usually, a larger value gives a slightly better compression ratio at the cost of slower compression. A large fast bytes value can significantly increase the compression ratio for files that contain long identical sequences of bytes.
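To see the effect on data with long identical runs, here is a minimal sketch assuming Python's `lzma` module, where the equivalent option is called `nice_len`. The input is synthetic, and the values tried are arbitrary picks.

```python
import lzma

# Long identical runs separated by short, varying records.
data = b"".join(bytes([i % 251]) * 4000 + b"|%d|" % i for i in range(200))

def raw_size(nice_len):
    """Size of data as raw LZMA2 with the given fast bytes (nice_len) value."""
    filters = [{"id": lzma.FILTER_LZMA2, "nice_len": nice_len}]
    return len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=filters))

sizes = {fb: raw_size(fb) for fb in (5, 32, 128, 273)}

for fb, size in sizes.items():
    print(f"fb={fb}: {size} bytes")
```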
btMode
Description: Sets Match Finder for LZMA.
Options: 0 = hashChain mode, 1 = binTree mode.
Default: 1.
The default match finder is bt4. Algorithms from the hc* group do not provide a good compression ratio, but they are often quite fast in combination with fast mode.
numHashBytes
Description: Number of hash bytes. See the mf={MF_ID} section of the 7-Zip method switch documentation (first link under More Info) for details.
Options: 2, 3 or 4.
Default: 4.
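btMode and numHashBytes together select one match finder. The sketch below maps the pairs onto the match finder constants of Python's `lzma` module and compares hc4 in fast mode against bt4 in normal mode. The mapping table is my own assumption about how the two option sets line up; liblzma offers only the five finders listed.

```python
import lzma

# (btMode, numHashBytes) -> liblzma match finder (assumed correspondence).
MATCH_FINDERS = {
    (0, 3): lzma.MF_HC3,
    (0, 4): lzma.MF_HC4,
    (1, 2): lzma.MF_BT2,
    (1, 3): lzma.MF_BT3,
    (1, 4): lzma.MF_BT4,
}

def compress(data, bt_mode, num_hash_bytes, fast):
    """Compress to .xz with the chosen match finder and compression mode."""
    filters = [{
        "id": lzma.FILTER_LZMA2,
        "mf": MATCH_FINDERS[(bt_mode, num_hash_bytes)],
        "mode": lzma.MODE_FAST if fast else lzma.MODE_NORMAL,
    }]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)

data = b"the quick brown fox jumps over the lazy dog\n" * 5000

fast_hc4 = compress(data, bt_mode=0, num_hash_bytes=4, fast=True)
normal_bt4 = compress(data, bt_mode=1, num_hash_bytes=4, fast=False)

print(f"hc4 + fast:   {len(fast_hc4)} bytes")
print(f"bt4 + normal: {len(normal_bt4)} bytes")
```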
mc
Description: Sets the number of cycles (passes) for the match finder.
Range: [1;1<<30].
Default: 32.
If you specify mc = 0, LZMA will use the default value. Usually, a larger value gives a slightly better compression ratio and slower compression. For example, mf=HC4 and mc=10000 can provide almost the same compression ratio as mf=BT4.
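The HC4 versus BT4 comparison can be reproduced roughly with Python's `lzma` module, where mc is called `depth` and 0 also means "pick a default". This is a sketch on synthetic word data; real results depend heavily on the input.

```python
import lzma
import random

# Pseudo-text built from a fixed 500-word vocabulary (seeded for repeatability).
rng = random.Random(0)
words = [bytes(rng.choices(range(97, 123), k=rng.randint(3, 9)))
         for _ in range(500)]
data = b" ".join(rng.choice(words) for _ in range(20_000))

def raw_size(mf, depth):
    """Size of data as raw LZMA2 with the given match finder and cycle count."""
    filters = [{"id": lzma.FILTER_LZMA2, "mf": mf, "depth": depth}]
    return len(lzma.compress(data, format=lzma.FORMAT_RAW, filters=filters))

hc4_default = raw_size(lzma.MF_HC4, 0)       # 0 = let the encoder choose
hc4_deep = raw_size(lzma.MF_HC4, 10_000)     # mc=10000 as in the text
bt4_default = raw_size(lzma.MF_BT4, 0)

print(f"hc4, default mc: {hc4_default} bytes")
print(f"hc4, mc=10000:   {hc4_deep} bytes")
print(f"bt4, default mc: {bt4_default} bytes")
```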
writeEndMark
Description: Whether to write the end mark (EOPM, the end-of-payload marker).
Options: 0 = do not write EOPM, 1 = write EOPM.
Default: 0.
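Whether a given .lzma file relies on the end mark can be read from its 13-byte header: when the uncompressed size field is all ones, the size is unknown and the decoder has to stop at the marker. The sketch below assumes Python's `lzma` module, which does not expose writeEndMark; as far as I know its .lzma writer always leaves the size unknown and writes the marker.

```python
import lzma
import struct

payload = b"hello, lzma " * 100
blob = lzma.compress(payload, format=lzma.FORMAT_ALONE)

# .lzma header: 1 properties byte, 4-byte dictionary size, 8-byte
# uncompressed size, all little-endian (13 bytes in total).
props, dict_size, uncompressed_size = struct.unpack("<BIQ", blob[:13])

# All ones means "size unknown", so the stream must end with the end mark.
uses_end_mark = uncompressed_size == 0xFFFFFFFFFFFFFFFF

print(f"dictionary size: {dict_size}")
print(f"relies on end mark: {uses_end_mark}")
```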
numThreads
Description: Number of threads.
Options: 1 or 2.
Default: 2.
LZMA2 Options:
LZMA2 is a modified version of LZMA. It provides the following advantages over LZMA:
Better compression ratio for data that can't be compressed. LZMA2 can store such blocks of data in uncompressed form. It also decompresses such data faster.
Better multithreading support. If you compress a big file, LZMA2 can split that file into chunks and compress these chunks in multiple threads.
Note: LZMA2 also supports all LZMA parameters, but lp + lc cannot be larger than 4.
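The lc + lp limit is easy to probe. A minimal sketch, assuming Python's `lzma` module: it simply asks the encoder whether it accepts a given pair. (liblzma happens to apply the same limit to its LZMA1 filter too, which is a property of that library and not of LZMA itself.)

```python
import lzma

def lzma2_accepts(lc, lp):
    """Return True if the LZMA2 encoder accepts this lc/lp combination."""
    filters = [{"id": lzma.FILTER_LZMA2, "lc": lc, "lp": lp}]
    try:
        lzma.compress(b"probe", format=lzma.FORMAT_RAW, filters=filters)
    except (lzma.LZMAError, ValueError):
        return False
    return True

for lc, lp in [(3, 0), (0, 2), (2, 2), (3, 2), (4, 1)]:
    status = "ok" if lzma2_accepts(lc, lp) else "rejected"
    print(f"lc={lc} lp={lp} (sum {lc + lp}): {status}")
```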
blockSize
Description: Sets the chunk size.
Default: dictSize * 4.
numBlockThreads
Description: Sets the number of threads per chunk (block).
numTotalThreads
Description: The maximum number of threads LZMA2 can use.
Note: LZMA2 uses 1 thread per chunk in x1 and x3 modes, and 2 threads per chunk in x5, x7 and x9 modes. If LZMA2 is given only as many threads as one chunk requires, it does not split the stream into chunks. As a result, you can get a different compression ratio for a different number of threads.
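The chunking idea, and why it can change the ratio, can be sketched in a few lines. This is not how the LZMA2 encoder is implemented: it is a stand-in that compresses fixed-size chunks as independent .xz streams on a thread pool, using Python's `lzma` module. The helper name `compress_chunked` and the chunk size are made up for the example.

```python
import lzma
from concurrent.futures import ThreadPoolExecutor

def compress_chunked(data, chunk_size, threads):
    """Compress fixed-size chunks independently and concatenate the streams."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Each chunk starts with an empty dictionary, so matches that would
        # cross a chunk boundary are lost; that is the cost of parallelism.
        return b"".join(pool.map(lzma.compress, chunks))

data = bytes(range(256)) * 8192   # 2 MiB of sample input

whole = lzma.compress(data)
chunked = compress_chunked(data, chunk_size=256 * 1024, threads=4)

print(f"single stream: {len(whole)} bytes")
print(f"8 chunks:      {len(chunked)} bytes")
```

`lzma.decompress` reads concatenated streams back as one, so the chunked output round-trips to the original data.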
I think that to learn more about this subject you have to study LZMA in more depth. There are very few examples on the internet, and the documentation is quite incomplete.
More Info Here:
http://sevenzip.sourceforge.jp/chm/cmdline/switches/method.htm
http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Markov_chain_algorithm
http://linux.die.net/man/1/lzma