Here is a comprehensive summary of the packJPG v2.5k source code, detailing all the essential steps of its compression and decompression processes.

Introduction: The Core Value of packJPG

packJPG is a highly specialized, lossless re-compression utility designed exclusively for JPEG files. The term "lossless" here is critical: it does not mean converting the JPEG to a pixel-based format and then compressing it (like PNG). Instead, it means the program can take a JPEG file, compress it into a .pjg file, and then decompress it back into a JPEG file that is bitwise identical to the original. On average, it achieves a file size reduction of around 20%.

The program's effectiveness stems from its deep understanding and exploitation of the inherent inefficiencies within the standard JPEG compression scheme. While JPEG is excellent for lossy image compression, its final entropy coding stage (using Huffman coding) is not optimally efficient. packJPG replaces this with more advanced predictive modeling and a superior entropy coder (Arithmetic coding) to achieve further compression without any loss of image quality or metadata (by default).

The source code, developed by Matthias Stirner under the guidance of Prof. Dr. Gerhard Seelmann at HTW Aalen University, serves as an excellent case study in data compression, JPEG format internals, and sophisticated algorithm design.

Core Architecture: The "Deconstruct-Analyze-Reconstruct" Philosophy

packJPG operates on a three-stage philosophy:

  1. Deconstruct: It parses the JPEG file, not as an opaque binary blob, but as a collection of distinct data streams with different statistical properties.
  2. Analyze & Model: It reverses the final stage of JPEG encoding to access the core DCT (Discrete Cosine Transform) coefficients. It then applies advanced predictive models to this data to remove spatial and spectral redundancies that the standard JPEG algorithm leaves behind.
  3. Reconstruct & Re-encode: It uses a more powerful entropy coder (Arithmetic coding) to compress the refined data streams into the final PJG format.

The source code clearly reflects this architecture through its primary data structures and functions.


The Compression Pipeline: From JPG to PJG (All Steps)

The process of converting a JPG file to a PJG file is a meticulous, multi-step pipeline.

Step 1: Parsing and Separation (read_jpeg)

The program first reads the input JPEG file and intelligently separates its contents into three distinct parts:

  • hdrdata (Header Data): This stream contains all JPEG marker segments (e.g., SOI, APPx for EXIF, DQT for quantization tables, DHT for Huffman tables, SOF for frame info, SOS for scan start).
  • huffdata (Huffman-coded Data): This is the main payload of the JPEG file, containing the Huffman-coded stream of the quantized DCT coefficients. This is the primary target for optimization.
  • grbgdata (Garbage Data): This captures any data that might exist after the End of Image (EOI) marker.

Step 2: Partial Decode to DCT Coefficients (decode_jpeg)

This is a crucial step. The function decode_jpeg does not decode the image to pixels. It performs a partial decode, only reversing the final entropy coding stage. It reads the huffdata stream and, using the Huffman tables from hdrdata, decodes it back into the raw, quantized DCT coefficients.

These coefficients are stored in the central data structure: colldata[4][64]. This multi-dimensional array holds the 64 DCT coefficients for every 8x8 block for up to 4 color components (e.g., Y, Cb, Cr). This colldata array is the canvas upon which all subsequent analysis and optimization are performed.
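
To make this concrete, here is a rough sketch of such a coefficient store. The layout and names below are an assumption for illustration (one per-block array for each combination of color component and frequency position), not code lifted from the source:

// Illustrative layout: one array of 16-bit coefficients per color component
// and frequency position, indexed by block number. Keeping a single frequency
// band contiguous across all blocks suits the per-band modeling that follows.
short* colldata_sketch[4][64];   // [component][frequency] -> per-block coefficient array

inline short coef_at( int cmp, int freq, int block ) {
    return colldata_sketch[cmp][freq][block];
}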

Step 3: Advanced DC Coefficient Prediction (predict_dc)

The first DCT coefficient (the DC coefficient) represents the average color/luminance of an 8x8 block. These values are highly correlated between adjacent blocks. JPEG uses a simple differential prediction (DPCM). packJPG uses far more sophisticated predictors to model this correlation more accurately:

  • 1D-DCT Predictor (dc_1ddct_predictor): This clever method uses the AC coefficients of a block to perform a partial inverse DCT, calculating the pixel values at the block's boundary. It then uses these boundary values to predict the DC coefficient of the next block.
  • LOCO-I Predictor (dc_coll_predictor): Borrowed from the JPEG-LS lossless standard, this predictor (plocoi function) uses the DC values from the top, left, and top-left neighboring blocks to form a non-linear median prediction.

Instead of storing the DC coefficient itself, packJPG stores the prediction error (the difference between the actual DC value and the predicted value). Because the prediction is more accurate, these errors are smaller and more tightly clustered around zero, making them highly compressible.
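
For reference, the LOCO-I median rule mentioned above fits in a few lines. This is the textbook JPEG-LS formulation; the plocoi function in the source follows the same idea but may differ in detail. Here a is the left, b the top, and c the top-left neighboring DC value:

#include <algorithm>

// Median edge-detection predictor from JPEG-LS (LOCO-I), as described above.
static int plocoi_sketch( int a, int b, int c )
{
    if ( c >= std::max( a, b ) ) return std::min( a, b );  // likely edge above the block
    if ( c <= std::min( a, b ) ) return std::max( a, b );  // likely edge to the left
    return a + b - c;                                      // smooth area: planar prediction
}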

Step 4: AC Coefficient Structural Analysis (calc_zdst_lists)

For the 63 AC coefficients (representing details), packJPG recognizes that most high-frequency coefficients are zero. To exploit this, it separates the problem of "where are the non-zero values?" from "what are those values?". This is done by creating Zero Distribution Lists:

  • zdstdata: For each 8x8 block, this stores the count of non-zero AC coefficients in the main 7x7 area (excluding the first row and column).
  • zdstxlow / zdstylow: These store the count of non-zero AC coefficients for the first row and first column, respectively.

These lists themselves form new data streams with strong spatial correlation (e.g., a detailed area will have many blocks with a high count of non-zero coefficients), which can be compressed very efficiently.
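
A minimal sketch of how these counts could be gathered for a single block, assuming the 64 quantized coefficients are available in row-major frequency order (the function and variable names are illustrative, not the source's):

// Split the non-zero count of one 8x8 coefficient block into the three
// streams described above: main 7x7 area, first row, first column.
void count_nonzero_sketch( const short block[64], int& zdst, int& zdst_xlow, int& zdst_ylow )
{
    zdst = zdst_xlow = zdst_ylow = 0;
    for ( int v = 0; v < 8; v++ ) {
        for ( int u = 0; u < 8; u++ ) {
            if ( u == 0 && v == 0 ) continue;      // DC coefficient handled separately
            if ( block[ v * 8 + u ] == 0 ) continue;
            if      ( v == 0 ) zdst_xlow++;        // first row of AC coefficients
            else if ( u == 0 ) zdst_ylow++;        // first column of AC coefficients
            else               zdst++;             // main 7x7 area
        }
    }
}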

Step 5: Arithmetic Encoding and Packing (pack_pjg)

This is the final stage where all the refined data streams are compressed using an ArithmeticEncoder with sophisticated Context Modeling. The following streams are encoded in sequence into the PJG file:

  1. Header (pjg_encode_generic): Before encoding, pjg_optimize_header is called. This function replaces standard Huffman or quantization tables with short identifiers, reducing header size.
  2. Optimized Scan Order (pjg_encode_zstscan): JPEG uses a fixed Zig-Zag scan order. packJPG calculates an image-specific optimal scan order (zsrtscan) that groups non-zero coefficients earlier in the stream. This new scan order is then compressed and stored.
  3. Zero Distribution Lists (pjg_encode_zdst_high / _low): These lists are compressed using a context model based on the values of neighboring blocks.
  4. AC Coefficients (pjg_encode_ac_high / _low): This is the most complex part. The coefficients are encoded using different predictive models based on their frequency and context:
    • High-frequency ACs: Use pjg_aavrg_context, which predicts a coefficient's magnitude based on a weighted average of its neighbors.
    • Low-frequency ACs (first row/column): Use the more complex pjg_lakh_context, a predictor inspired by academic research that leverages inter-coefficient relationships for more accurate prediction.
  5. DC Prediction Errors (pjg_encode_dc): The prediction errors calculated in Step 3 are compressed using a context model based on neighboring errors.
  6. Garbage Data: A flag is encoded to indicate if trailing garbage data exists, and if so, the data itself is compressed.

The Decompression Pipeline: From PJG to JPG (All Steps)

Decompression is the precise, mirrored inverse of the compression process, ensuring a bit-for-bit identical result.

Step 1: Unpacking and Arithmetic Decoding (unpack_pjg)

The program reads the PJG file and uses an ArithmeticDecoder to decompress the data streams in the exact order they were written. It reconstructs:

  • The optimized header data.
  • The custom scan order.
  • The Zero Distribution Lists.
  • The AC coefficients and DC prediction errors.

This process populates the hdrdata and colldata structures, effectively reversing Step 5 of compression. pjg_unoptimize_header is then called to restore the full, standard tables in the header if they were replaced by identifiers.

Step 2: Reversing DC Prediction (unpredict_dc)

Using the decoded DC prediction errors and the identical prediction models from the compression stage, this function iteratively reconstructs the original DC coefficients for every block.
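
Conceptually the inverse is a single pass. Here is a minimal sketch, assuming the decoder can invoke exactly the same predictor as the encoder (the callback, container types, and names are illustrative):

#include <vector>
#include <functional>

// The encoder stored err = dc - pred, so the decoder recovers dc = pred + err,
// always predicting from blocks that have already been reconstructed.
std::vector<int> unpredict_dc_sketch(
    const std::vector<int>& errors,
    const std::function<int( const std::vector<int>&, size_t )>& predict )
{
    std::vector<int> dc( errors.size(), 0 );
    for ( size_t b = 0; b < errors.size(); b++ ) {
        int pred = predict( dc, b );     // sees only dc[0..b-1]
        dc[b] = pred + errors[b];
    }
    return dc;
}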

Step 3: Re-encoding the Huffman Stream (recode_jpeg)

This is the key to achieving a bit-identical output. The function iterates through the fully reconstructed colldata (containing all 64 original DCT coefficients for every block). Using the original Huffman tables restored in hdrdata, it performs standard Huffman encoding on the coefficients. This process meticulously regenerates the original huffdata stream.
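
The rule being reproduced is standard JPEG category coding. A small sketch of the value-to-bits split, following the JPEG specification rather than the exact source code:

#include <cstdlib>

// A value v is coded as a Huffman symbol for its bit length s, followed by
// s raw "appended" bits (negative values use the one's-complement style
// convention). recode_jpeg emits these with the original Huffman tables,
// which regenerates the original bitstream exactly.
struct SizeBits { int size; int bits; };

static SizeBits jpeg_category_sketch( int v )
{
    int s = 0;
    for ( int a = std::abs( v ); a > 0; a >>= 1 ) s++;   // s = bit length of |v|
    int bits = ( v >= 0 ) ? v : v + ( 1 << s ) - 1;      // appended bits for the value
    return { s, bits };
}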

Step 4: Merging and Final Output (merge_jpeg)

Finally, the program assembles the final JPEG file. It writes the SOI marker, followed by the fully restored hdrdata, the newly re-created huffdata, and the EOI marker, plus any trailing grbgdata. The resulting output file is a perfect replica of the original JPG.
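
In outline, the final assembly is little more than concatenation. A hedged sketch, assuming the three streams are plain byte buffers (the real merge_jpeg handles details this sketch omits):

#include <cstdint>
#include <fstream>
#include <vector>

// Write SOI, the restored header, the re-encoded Huffman stream, EOI, and
// any trailing garbage, in that order.
void merge_jpeg_sketch( std::ofstream& out,
                        const std::vector<uint8_t>& hdrdata,
                        const std::vector<uint8_t>& huffdata,
                        const std::vector<uint8_t>& grbgdata )
{
    const char soi[2] = { (char)0xFF, (char)0xD8 };   // Start of Image marker
    const char eoi[2] = { (char)0xFF, (char)0xD9 };   // End of Image marker
    out.write( soi, 2 );
    out.write( reinterpret_cast<const char*>( hdrdata.data() ), (std::streamsize) hdrdata.size() );
    out.write( reinterpret_cast<const char*>( huffdata.data() ), (std::streamsize) huffdata.size() );
    out.write( eoi, 2 );
    out.write( reinterpret_cast<const char*>( grbgdata.data() ), (std::streamsize) grbgdata.size() );
}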

Engineering and Code Structure Highlights

  • Modularity: The code is well-structured with clear function groups for the main interface, JPEG-specific operations, PJG-specific operations, DCT math, and prediction.
  • Dual-Mode Compilation (BUILD_LIB): The code is designed to be compiled either as a standalone command-line executable or as a static/dynamic library for integration into other applications.
  • Developer Mode (DEV_BUILD): The source includes extensive debugging functions (dump_coll, dump_info, etc.) that allow developers to inspect the intermediate data at every stage of the pipeline, which is invaluable for understanding and verification.
  • Robustness and Options: The tool includes command-line switches for verification (-ver), handling corrupted files (-p), and discarding metadata for extra space savings (-d).

Conclusion

packJPG is a testament to the power of deep algorithmic analysis. Its success lies in its ability to:

  1. Identify Inefficiency: It correctly identifies the sub-optimal nature of JPEG's standard Huffman coding.
  2. Divide and Conquer: It deconstructs the JPEG data into multiple streams, each with unique statistical properties that can be modeled and compressed independently.
  3. Advanced Modeling: It moves beyond simple differential prediction to use sophisticated, context-aware predictive models for both DC and AC coefficients.
  4. Superior Tools: It replaces the less efficient Huffman coding with the more powerful Arithmetic coding, allowing it to capitalize fully on the statistical redundancies uncovered in the analysis phase.

By meticulously reversing the JPEG entropy coding, modeling the underlying data with greater precision, and then re-encoding it with a better tool, packJPG achieves significant, truly lossless compression.


Let's take a deep dive into pack_pjg, the final and most critical stage of the packJPG compression process. This function is where all the prior analysis and data modeling are converted into a compressed bitstream.

Step 5: Arithmetic Encoding and Packing (pack_pjg)

Overall Purpose

The pack_pjg function serves as the grand finale of the compression pipeline. Its primary job is to take all the pre-processed data streams—the optimized header, the custom scan order, the zero-distribution lists, and the predicted coefficient errors—and feed them into a high-performance Arithmetic Encoder. This process transforms the highly structured, statistically refined data into a compact, single binary stream, which is the final .pjg file.

The function's efficiency relies on two core principles:

  1. Data Separation: It encodes different types of data (e.g., DC coefficients, high-frequency ACs, zero counts) using separate, specialized models.
  2. Context Modeling: For each piece of data being encoded, it uses information from already-coded neighboring data to create a "context." This context dynamically adapts the probabilities used by the arithmetic coder, leading to extremely efficient compression (a simplified illustration follows the list).
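
To make the second principle concrete, here is a simplified illustration of an adaptive binary model of the kind a context-modeling coder relies on. It is not packJPG's actual model class, just the underlying idea:

// Each context owns one of these; the arithmetic coder is driven by p1()
// and the model adapts as symbols are coded.
struct BitModelSketch {
    unsigned c0 = 1, c1 = 1;                                 // smoothed symbol counts
    double p1() const { return double( c1 ) / ( c0 + c1 ); }
    void update( int bit ) { ( bit ? c1 : c0 )++; }
};

// e.g. selecting a model by the already-coded top and left neighbor values:
//   BitModelSketch& m = models[left_ctx][top_ctx];
//   encode_bit_with_probability( bit, m.p1() );   // illustrative call, not the real API
//   m.update( bit );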

Here is a detailed breakdown of every step within the function:


Detailed Step-by-Step Breakdown

Step 1: Write the PJG File Header

Before any arithmetic encoding begins, the function writes a small, uncompressed header to the output file. This header is essential for the decompressor to identify the file and set up its initial state.

  • Magic Bytes (pjg_magic): The first two bytes written are 'J' and 'S'. This is the file's "magic number," allowing any program (including packJPG itself) to quickly identify the file as a packJPG archive without reading its entire contents.
  • Custom Settings (Optional): If the user provided custom compression settings (e.g., -s or -t switches), the auto_set flag will be false. In this case, a special marker (0x00) is written, followed by the 8 bytes representing these custom settings. This ensures that a file compressed with non-default parameters can be perfectly decompressed, as the decompressor will know which settings to use.
  • Version Number (appversion): A single byte representing the version of the packJPG algorithm is written. This is crucial for compatibility. If a user tries to decompress a file with an older or newer, incompatible version of packJPG, this byte allows the program to fail gracefully with an informative error message.
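
Putting the three pieces together, here is a hedged sketch of this preamble written against std::ostream purely for illustration; the real code writes through its own stream object (str_out), so the function below and its signature are assumptions, not the actual API:

#include <ostream>

// Uncompressed PJG preamble: magic bytes, optional explicit settings, version.
void write_pjg_preamble_sketch( std::ostream& out, bool auto_set,
                                const unsigned char settings[8], unsigned char appversion )
{
    out.put( 'J' ).put( 'S' );                                   // magic bytes
    if ( !auto_set ) {                                           // custom settings were used
        out.put( '\0' );                                         // 0x00 marker: settings follow
        out.write( reinterpret_cast<const char*>( settings ), 8 );
    }
    out.put( static_cast<char>( appversion ) );                  // single version byte
}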

Step 2: Initialize the Arithmetic Encoder

auto encoder = new ArithmeticEncoder(*str_out);

An instance of the ArithmeticEncoder class is created. This object encapsulates the entire state of the arithmetic coding process (the low, high, and range values) and is linked to the output file stream. From this point forward, all data will be passed through this encoder object to be compressed.

Step 3: Compress the JPEG Header and Essential Metadata

This step handles the non-pixel data required to reconstruct the original JPEG.

  • Header Optimization (pjg_optimize_header): Before compression, a crucial optimization is performed. This function scans the JPEG header data (hdrdata) and replaces standard, well-known Huffman and Quantization tables with very short identifiers. This is a form of dictionary compression applied specifically to the header, significantly reducing its size.
  • Generic Header Encoding (pjg_encode_generic): The (now optimized) header data is compressed using a generic, adaptive model. This model isn't highly specialized but is effective for general-purpose byte streams. It essentially uses the value of the previously encoded byte as the context for the current byte (see the order-1 sketch after this list). An end-of-data symbol is encoded to tell the decompressor where the header stream ends.
  • Encoding Metadata Bits:
    • padbit: The value of the bits used to pad the last byte of the JPEG's Huffman stream to a byte boundary cannot be recovered from the decoded coefficients alone, so packJPG records it and encodes this single bit using a simple binary model (pjg_encode_bit).
    • RST Marker Errors: Some JPEGs have incorrectly placed Restart (RST) markers. A single bit is encoded to flag whether such errors exist. If they do, the list of error counts per scan (rst_err) is then compressed using the generic encoder. This metadata is vital for bit-perfect reconstruction.
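
The "previous byte as context" idea behind the generic model can be illustrated with a tiny order-1 frequency table. This shows only the principle, not packJPG's actual model classes:

#include <array>

// freq[prev][cur] counts how often byte 'cur' has followed byte 'prev';
// the coder uses the smoothed conditional probability P(cur | prev).
struct Order1ModelSketch {
    std::array<std::array<unsigned, 256>, 256> freq {};
    std::array<unsigned, 256> total {};

    double prob( unsigned char prev, unsigned char cur ) const {
        return double( freq[prev][cur] + 1 ) / ( total[prev] + 256 );   // Laplace smoothing
    }
    void update( unsigned char prev, unsigned char cur ) {
        freq[prev][cur]++;
        total[prev]++;
    }
};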

Step 4: Compress Per-Component Image Data (The Main Loop)

This is the heart of the function, where the actual image data (in its DCT coefficient form) is compressed. The code iterates through each color component (e.g., Y, Cb, Cr). For each component, it encodes a series of data streams in a specific, logical order.

  1. pjg_encode_zstscan (Optimized Scan Order):

    • What: Instead of using JPEG's fixed zig-zag scan, packJPG first analyzes the image to find a custom scan order that groups non-zero AC coefficients together.
    • Why: This creates longer, more predictable runs of zeros, which are highly compressible.
    • How: It encodes the new scan order by describing its difference from the standard order, which is more efficient than encoding the entire new order from scratch. This is done using an adaptive model that tracks which coefficients are still available to be chosen.
  2. pjg_encode_zdst_high / _low (Zero Distribution Lists):

    • What: Encodes the pre-calculated counts of non-zero AC coefficients. It handles the main 7x7 block (_high) separately from the first row/column (_low) because their statistics differ.
    • Why: This separates the problem of where the data is from what the data is. The structure (the counts) is encoded first.
    • How: It uses a powerful context model. The probability for the count in the current block is conditioned on the counts of its immediate top and left neighbors. This effectively exploits the spatial correlation of image detail.
  3. pjg_encode_ac_high (High-Frequency AC Coefficients):

    • What: Encodes the values of the non-zero AC coefficients in the main 7x7 area of the DCT block.
    • Why: These coefficients represent finer details and have different statistical properties than low-frequency ACs.
    • How: It uses a sophisticated context model (pjg_aavrg_context) that predicts the magnitude of the current coefficient based on a weighted average of its already-coded spatial neighbors. The model also uses the zero-distribution count as a context. The value is then encoded hierarchically: first its bit-length, then the residual bits, and finally its sign, each with its own adaptive model (a sketch of this split follows the list).
  4. pjg_encode_ac_low (Low-Frequency AC Coefficients):

    • What: Encodes the AC coefficients from the first row and first column of the DCT block.
    • Why: These coefficients are perceptually more important and have stronger correlations than the higher-frequency ones. They benefit from a more powerful, specialized predictor.
    • How: It uses the Lakhani predictor (pjg_lakh_context), which leverages the mathematical properties of the DCT to form a highly accurate prediction based on coefficients from the neighboring block. This prediction is then used as the primary context for the arithmetic encoder. Again, the value is encoded hierarchically (bit-length, residual, sign).
  5. pjg_encode_dc (DC Prediction Errors):

    • What: Encodes the prediction errors for the DC coefficients, which were calculated in the earlier predict_dc step.
    • Why: The prediction errors are small values tightly clustered around zero, making them far more compressible than the original DC values.
    • How: It uses a context model very similar to the one for high-frequency ACs (pjg_aavrg_context), predicting the magnitude of the current error based on its spatial neighbors.

Step 5: Compress Trailing "Garbage" Data

Some tools or cameras write extra data after the JPEG End-of-Image (EOI) marker. For perfect reconstruction, this data must be preserved.

  • A single bit is encoded to flag whether this "garbage" data exists.
  • If the flag is 1, the data is compressed using the same pjg_encode_generic model used for the header.

Step 6: Finalize the Arithmetic Coding Stream

delete( encoder );

This is a small but absolutely critical step. The destructor of the ArithmeticEncoder is called. This triggers a "flush" operation, which writes out the final few bits needed to uniquely resolve the last encoded symbol and terminate the bitstream correctly. Without this step, the resulting .pjg file would be truncated and undecodable.


In summary, pack_pjg is a masterclass in applied data compression. It doesn't just apply one algorithm; it orchestrates a suite of specialized models, each tailored to a specific type of data. By separating data streams and using intelligent, adaptive context modeling at every turn, it squeezes out the statistical redundancies that standard JPEG leaves behind, achieving its impressive lossless compression ratios.
