@chabala
Last active April 4, 2025 01:36
Merge and extract tgz files from Google Takeout

I recently found that some clowny gist was the top result for 'google takeout multiple tgz'. It used two bash scripts to extract all the tgz files separately and then merge the results together. Don't do that. Use brace expansion to cat the TGZs in order, and extract the combined stream in one pass:

$ cat takeout-20201023T123551Z-{001..011}.tgz | tar xzivf -

You don't even need to use brace expansion. Globbing sorts the matches lexically, and because the part numbers are zero-padded, that is the same as numeric order:

$ cat takeout-20201023T123551Z-*.tgz | tar xzivf -
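If you want to convince yourself of the ordering before trusting the glob, here is a small sketch using empty placeholder files. The demo-* and raw-* names are made up for illustration, and it assumes bash 4 or newer (for zero-padded brace expansion) with the C locale pinned so the sort order is predictable:

```shell
#!/usr/bin/env bash
set -euo pipefail
export LC_ALL=C   # pin the collation so the glob order below is predictable

work=$(mktemp -d)   # throwaway directory holding empty placeholder files
cd "$work"

# Zero-padded names, like the Takeout parts above: lexical order == numeric order.
touch demo-{001..011}.tgz
padded=(demo-*.tgz)
echo "padded:   ${padded[0]} ... ${padded[10]}"

# Unpadded names (hypothetical): part 10 sorts before part 2.
touch raw-{1..11}.tgz
unpadded=(raw-*.tgz)
echo "unpadded: ${unpadded[*]:0:4}"
```

The second line of output starts raw-1.tgz raw-10.tgz raw-11.tgz raw-2.tgz, which is the failure mode you would hit if the part numbers were not padded.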

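tar has been around forever, and it wasn't designed to need custom scripts to deal with multipart archives. Each part is a complete tgz on its own, so the concatenated stream is just several archives back to back; the i flag (--ignore-zeros) tells tar to keep reading past the end-of-archive blocks between them. Since everything is extracted in one pass into the same tree, there's no 'mess of partial directories' to be merged. It just works, as intended.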
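To see the same behaviour on a small scale, the following sketch builds two tiny independent archives and extracts them with a single cat | tar pipeline. The file and directory names are invented stand-ins for Takeout parts, and it assumes GNU tar and gzip:

```shell
#!/usr/bin/env bash
set -euo pipefail

work=$(mktemp -d)   # throwaway directory so nothing real is touched
cd "$work"

# Build two independent .tgz files, standing in for two Takeout parts.
mkdir -p src/Takeout/Mail src/Takeout/Photos out
echo "mail"  > src/Takeout/Mail/a.txt
echo "photo" > src/Takeout/Photos/b.txt
tar -czf demo-001.tgz -C src Takeout/Mail
tar -czf demo-002.tgz -C src Takeout/Photos

# Concatenate the parts and extract them in one pass.
# -i (--ignore-zeros) lets tar read past the end-of-archive blocks between parts.
cat demo-*.tgz | tar -xzif - -C out

# Both parts end up under the same Takeout/ directory.
find out -type f | sort
```

Both files land under one out/Takeout tree, with nothing left to merge afterwards.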

An additional tip, courtesy of Dmitriy Otstavnov (@bvc3at): if you have pv available, you can track the progress of the extraction:

$ pv takeout-* | tar xzif -
 190GiB 2:37:54 [18.9MiB/s] [==============>                                   ] 30% ETA 5:03:49

sanjosanjo commented Feb 23, 2025

Thank you for the instructions on this. I'm curious if anyone can recommend a way to verify that the "cat takeout-*.tgz | tar xzivf -" command ran correctly before I delete all the .tgz files. Does tar have a verify feature that could work with this group of files from Takeout?

chabala commented Feb 23, 2025

You could pipe into gunzip -t to verify the gzip container:

$ cat takeout-*.tgz | gunzip -t

But I'd just perform the extraction and see if any errors are thrown. The underlying tar format only checksums its headers, so it has no inherent validation of the file data itself.
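If you'd rather check the parts before extracting or deleting anything, one possible sketch is below. verify_parts is a hypothetical helper, not part of tar or gzip: it runs gunzip -t on each part so a bad file can be named, then has tar list the combined stream without writing anything. The demo files are invented, and GNU tar is assumed:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: test each part's gzip stream individually, then
# make tar walk the combined stream end to end without extracting.
verify_parts() {
    local part
    for part in "$@"; do
        gunzip -t "$part" || { echo "corrupt gzip stream: $part" >&2; return 1; }
    done
    cat "$@" | tar -tzif - > /dev/null
}

# Demo on throwaway data.
work=$(mktemp -d)
cd "$work"
echo "hello" > file.txt
tar -czf part-001.tgz file.txt
tar -czf part-002.tgz file.txt
head -c 20 part-001.tgz > broken-001.tgz   # deliberately truncated copy

verify_parts part-*.tgz && echo "parts OK"
verify_parts broken-001.tgz 2>/dev/null || echo "broken part detected"
```

On the real download you would call it as verify_parts takeout-*.tgz. Passing only shows the gzip streams and tar headers are readable; it says nothing about what is already on disk.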

Or, there's tar --compare for validating that the extracted data matches the archive, if you're worried about that: https://unix.stackexchange.com/a/469985/19124
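A rough sketch of how --compare (short form d) might be combined with the same cat pipeline follows. This is my adaptation rather than what the linked answer shows verbatim; the part-* files and the out directory are invented for the demo, and it assumes GNU tar:

```shell
#!/usr/bin/env bash
set -euo pipefail

work=$(mktemp -d)   # throwaway directory with two tiny stand-in parts
cd "$work"
mkdir -p src out
echo "one" > src/a.txt
echo "two" > src/b.txt
tar -czf part-001.tgz -C src a.txt
tar -czf part-002.tgz -C src b.txt

# Extract as usual, into out/.
cat part-*.tgz | tar -xzif - -C out

# Compare the archive contents against what is on disk in out/.
cat part-*.tgz | tar -dzif - -C out && echo "extracted files match"

# Tamper with one extracted file: tar reports it and exits non-zero.
echo "changed" > out/b.txt
cat part-*.tgz | tar -dzif - -C out || echo "difference detected"
```

For the real thing, that would be cat takeout-*.tgz | tar -dzif - run from the directory you extracted into. It re-reads the whole archive, so expect it to take about as long as the extraction did.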
