@stokito
Last active January 29, 2025 23:12
Deterministic reproducible zip and tar.gz archives

Deterministic archives are essential for reproducible builds, which are a foundation of secure software delivery.

An archive preserves each file's user and group (usually only the numeric ids, uid/gid) and its time of last modification (mtime). The exact time is almost never important, so you can set a fixed, reproducible date such as 1 Feb 1980. This date is used by many tools, such as Maven and Gradle. Alternatively, you can use the SOURCE_DATE_EPOCH environment variable with a date taken from git log. The owner uid/gid can simply be zeroed.
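The SOURCE_DATE_EPOCH variant could be sketched roughly as below. This is a minimal example, not a fixed recipe: the helper name pick_epoch and the fallback value are illustrative assumptions, and it presumes the script runs inside a git checkout when the variable is not set.

```shell
# Hypothetical helper: choose the timestamp to stamp on archive entries.
pick_epoch() {
  if [ -n "${SOURCE_DATE_EPOCH:-}" ]; then
    # Respect a value supplied by the build environment.
    printf '%s\n' "$SOURCE_DATE_EPOCH"
  elif epoch=$(git log -1 --pretty=%ct 2>/dev/null) && [ -n "$epoch" ]; then
    # Commit time of the latest commit, in seconds since the Unix epoch.
    printf '%s\n' "$epoch"
  else
    # Assumed fallback: 1980-02-01 00:00:00 UTC.
    printf '%s\n' 318211200
  fi
}
```

With GNU tar the result can then be passed as --mtime="@$(pick_epoch)" in place of the static date.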

Deterministically archive a folder to .tar.gz and remove the folder

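reproducible_tar() {
  # Strip a trailing slash so the archive name is well-formed
  src_folder=${1%/}
  # Fixed entry order, mtime and owner make the archive reproducible
  tar \
    --remove-files \
    --sort=name \
    --mtime='UTC 1980-02-01' \
    --owner=0 --group=0 --numeric-owner \
    --pax-option=exthdr.name=%d/PaxHeaders/%f,delete=atime,delete=ctime \
    -z \
    -cf "$src_folder.tar.gz" "$src_folder/" || return 1
  # Give the archive file itself the same fixed atime and mtime
  TZ=UTC touch -a -m -t 198002010000.00 "$src_folder.tar.gz"
}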

reproducible_tar test_targz

Here

  • --remove-files removes the source files once they have been added to the archive.
  • --sort=name sorts the entries by name so that their order in the tar is always the same.
  • --mtime='UTC 1980-02-01' sets the modification time of every entry to the fixed, reproducible date 1 Feb 1980 in UTC.
  • --owner=0 --group=0 --numeric-owner sets the owner uid and gid to 0 and stores numeric ids instead of user and group names.
  • --pax-option=exthdr.name=%d/PaxHeaders/%f,delete=atime,delete=ctime gives pax extended headers a fixed name (the GNU tar default includes the process id) and removes the access time (atime) and change time (ctime) headers.
  • -z compresses the tar with gzip. You may use --use-compress-program 'gzip -9' instead to set gzip options such as maximum compression.

It is also better to set the mtime of the resulting archive file itself. touch -a -m sets both the access and modification times.
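A quick way to check that the options really give a reproducible result is to build the archive twice and compare checksums. The sketch below assumes GNU tar, gzip and sha256sum are available. The helper names archive_digest and check_reproducible are made up for this example. It streams the archive to stdout instead of removing the folder, and pipes through gzip -n so that gzip stores no name or timestamp of its own, which is an extra precaution not present in the function above.

```shell
# Hypothetical helper: print a SHA-256 of a deterministic tar.gz of "$1".
# Nothing is written to disk and the source folder is left untouched.
archive_digest() {
  tar \
    --sort=name \
    --mtime='UTC 1980-02-01' \
    --owner=0 --group=0 --numeric-owner \
    -cf - "$1" \
    | gzip -n \
    | sha256sum \
    | cut -d ' ' -f 1
}

# Build the archive twice and succeed only if both digests match.
check_reproducible() {
  first=$(archive_digest "$1") || return 1
  second=$(archive_digest "$1") || return 1
  [ -n "$first" ] && [ "$first" = "$second" ]
}
```

For example, check_reproducible test_targz && echo reproducible. The same digest should also come out after the files' mtimes change, since tar overrides them.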

Now to uncompress use:

pushd ./folder_with_archive || exit 1
tar -xf "$src_folder.tar.gz"
rm -f "$src_folder.tar.gz"

tar has no option to delete the archive after extraction. In gzip this is the default behaviour, and to keep the compressed file you have to add the -k or --keep option. I'm not sure why tar doesn't work like that, so the last command removes the archive manually with rm. With a big archive, removing parts of it while extracting could sometimes save space, for example on a router with a small NAND flash. But extraction may fail, so this would be dangerous anyway: the archive should be deleted only after a successful extraction.
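The "delete only after success" rule can be sketched as a small wrapper. The function name extract_and_remove is hypothetical. It relies only on tar's exit status, so it assumes tar returns non-zero for a damaged or truncated archive.

```shell
# Hypothetical helper: extract "$1" into the current directory and delete
# the archive only if tar reported success.
extract_and_remove() {
  archive=$1
  if tar -xf "$archive"; then
    rm -f -- "$archive"
  else
    echo "extraction of $archive failed; archive kept" >&2
    return 1
  fi
}
```

If extraction fails, the archive stays in place so the operation can be retried.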

Deterministically archive a folder to .zip and remove the folder

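reproducible_zip() {
  # Strip a trailing slash so the archive name is well-formed
  src_folder=${1%/}
  # zip cannot override mtime, so normalize it first (only inside the folder being archived)
  TZ=UTC find "$src_folder" -exec touch --no-dereference -a -m -t 198002010000.00 {} +
  TZ=UTC zip -q --move --recurse-paths --symlinks -X "$src_folder.zip" "$src_folder" || return 1
  # Give the archive file itself the same fixed atime and mtime
  TZ=UTC touch -a -m -t 198002010000.00 "$src_folder.zip"
}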

reproducible_zip test_zip

zip has no option to set the mtime, so we have to change the mtime of all files and symlinks in the folder first and only then zip it.

Zip command options:

  • -q is quiet mode
  • --move or -m removes the source files once the zip is complete
  • --recurse-paths includes all subfolders
  • --symlinks stores symlinks as links; otherwise zip follows them and stores the files they point to
  • -X removes extra fields such as uid/gid (the long form --no-extra is not supported for some reason).

It is also better to set the mtime of the resulting archive file itself. touch -a -m sets both the access and modification times.
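Unlike tar, the zip command above has no counterpart to --sort=name, so the entry order may follow the directory order of the filesystem. One possible workaround, sketched here under the assumption that Info-ZIP zip is installed, is to feed zip a sorted file list on stdin with -@. The helper name sorted_zip is hypothetical. It does not remove the source folder, expects the mtimes to be normalized with touch beforehand, and will not handle file names that contain newlines.

```shell
# Hypothetical helper: zip folder "$1" into "$2" with entries in sorted order.
sorted_zip() {
  src_folder=$1
  out_zip=$2
  # LC_ALL=C makes the sort order independent of the user's locale.
  find "$src_folder" -print \
    | LC_ALL=C sort \
    | TZ=UTC zip -q --symlinks -X "$out_zip" -@
}
```

Running it twice on the same normalized folder and comparing the two files with cmp is a simple way to confirm the result on a given system.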

To extract a zip you also have to specify the UTC timezone, otherwise unzip sets the file times in the local timezone. If the archive contains symlinks, you additionally have to touch them to set their mtime.

pushd ./folder_with_archive || exit 1
TZ=UTC unzip -q "$src_folder.zip"
# unzip doesn't restore mtime of symlinks (bug?), so update it manually
TZ=UTC find "$src_folder" -exec touch --no-dereference -a -m -t 198002010000.00 {} +
rm -f "$src_folder.zip"

unzip has no option to delete the archive after extraction either, so the last command removes the archive manually with rm.
