@sekomer
Created January 30, 2023 12:51
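#!/bin/bash
# Extract every .tar.bz2 archive in the current directory into its own
# folder, running at most one tar process per CPU.

# function to extract each tar file into a folder named after it
extract_tar() {
    local file="$1"
    local folder
    folder="$(basename "$file" .tar.bz2)"
    mkdir -p "$folder"    # -p: do not fail if the folder already exists
    tar -xjf "$file" -C "$folder"
}

# get the number of CPUs
num_cpu=$(grep -c ^processor /proc/cpuinfo)

# create a semaphore to limit the number of parallel processes:
# a FIFO preloaded with one token (an empty line) per CPU
semaphore=$(mktemp -u)
mkfifo "$semaphore"
exec 3<>"$semaphore"
for ((i = 0; i < num_cpu; i++)); do
    echo >&3
done

# loop through all .tar.bz2 files in the current directory
shopt -s nullglob    # skip the loop if no archive matches
for file in *.tar.bz2; do
    read -r -u 3     # take a token; blocks while num_cpu jobs are running
    (
        extract_tar "$file"
        echo >&3     # return the token when the job is done
    ) &
done

# wait for all background jobs to finish
wait

# clean up the semaphore
exec 3>&-
rm "$semaphore"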
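
The FIFO-based limiter in the script above can be pulled out into a reusable function. The following is a minimal sketch of that idea, not the author's code: `run_limited`, its arguments, and the use of file descriptor 3 are assumptions made for illustration. It creates the FIFO inside a private `mktemp -d` directory instead of using `mktemp -u`, and it reads tokens with `read -r token <&3`, which does the same job as `read -u 3`.

```shell
#!/bin/bash
# run_limited MAX CMD ARG...
# Hypothetical helper: runs "CMD ARG" once per ARG, with at most MAX
# invocations in flight, using a FIFO preloaded with MAX tokens.
run_limited() {
    local max_jobs="$1" cmd="$2" tmpdir token arg i=0
    shift 2

    # A private directory avoids the file name race of `mktemp -u`.
    tmpdir=$(mktemp -d) || return 1
    mkfifo "$tmpdir/tokens" || return 1
    exec 3<>"$tmpdir/tokens"    # read-write open, so the open itself never blocks

    # Preload one token (an empty line) per allowed job.
    while [ "$i" -lt "$max_jobs" ]; do
        echo >&3
        i=$((i + 1))
    done

    for arg in "$@"; do
        read -r token <&3       # take a token; blocks while MAX jobs are running
        (
            "$cmd" "$arg"
            echo >&3            # hand the token back when the job ends
        ) &
    done

    wait                        # let the remaining jobs finish
    exec 3>&-
    rm -r "$tmpdir"
}
```

With `extract_tar` defined as in the script, `run_limited "$(nproc)" extract_tar *.tar.bz2` should behave like the original loop.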
@bobpaul

bobpaul commented Dec 5, 2024

I really like how you implemented the max thread count. That's clever.

BTW, if you have GNU parallel installed, I think this produces basically the same results as your script:

#make 1 directory per archive
for tarfile in *.tar *.tar.*; do [ -e "$tarfile" ] && mkdir -p "${tarfile/.tar*/}"; done

#extract archives to their own directory, 1 tar process per CPU core
parallel -P "$(nproc)" bsdtar -xf {} --directory '{=s/\.tar.*//=}' ::: *.tar*

Both the for loop that makes the directories and the extract command should work on any *.tar* archive, compressed or not, but will not work on archives named like .tgz instead of .tar.gz. I use bsdtar because it does single-threaded decompression (GNU tar may use multiple threads for xz, zstd, and other compression formats, which could get excessive as we're already running one process per CPU core).
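
One way to cover the .tgz case mentioned above is to derive the directory name with a small helper instead of a single substitution. This is only a sketch: `archive_dir_name` is a hypothetical function, and the list of short suffixes it handles (.tgz, .tbz2, .txz) is an assumption that can be extended as needed.

```shell
#!/bin/bash
# Hypothetical helper: map an archive file name to an output directory name.
archive_dir_name() {
    local name="${1##*/}"                   # drop any leading path
    case "$name" in
        *.tar.*) name="${name%.tar.*}" ;;   # foo.tar.gz, foo.tar.bz2, ...
        *.tar)   name="${name%.tar}" ;;     # plain foo.tar
        *.tgz|*.tbz2|*.txz) name="${name%.*}" ;;   # short suffixes
    esac
    printf '%s\n' "$name"
}
```

The directory loop could then call `mkdir -p "$(archive_dir_name "$tarfile")"` for each archive, with the glob widened to include the short suffixes.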

@sekomer
Author

sekomer commented Dec 9, 2024

Thanks for your feedback @bobpaul! I really appreciate it :)

@AndreiCherniaev

AndreiCherniaev commented Apr 13, 2026

Let me introduce my untar version
