Created January 30, 2023 12:51
#!/bin/bash

# Extract one archive into a folder named after it (without the .tar.bz2 suffix).
extract_tar() {
    local file="$1"
    local folder
    folder="$(basename "$file" .tar.bz2)"
    mkdir -p "$folder"    # -p: do not fail if the folder already exists
    tar -xjf "$file" -C "$folder"
}

# Get the number of CPUs (reads /proc/cpuinfo, so this is Linux-specific).
num_cpu=$(grep -c ^processor /proc/cpuinfo)

# Create a semaphore to limit the number of parallel processes:
# a FIFO that holds one token (an empty line) per allowed job.
semaphore=$(mktemp -u)
mkfifo "$semaphore"
exec 3<>"$semaphore"
for ((i = 0; i < num_cpu; i++)); do
    echo >&3
done

# Make the loop below run zero times if no .tar.bz2 files exist,
# instead of passing the literal pattern to tar.
shopt -s nullglob

# Loop through all .tar.bz2 files in the current directory.
for file in *.tar.bz2; do
    read -r -u 3    # take a token; blocks while num_cpu jobs are running
    (
        extract_tar "$file"
        echo >&3    # return the token when the extraction ends
    ) &
done

# Wait for all background jobs to finish.
wait

# Clean up the semaphore.
exec 3>&-
rm "$semaphore"
extract tar files in parallel
This is a bash script that extracts all .tar.bz2 files in the current directory into separate folders. The script runs multiple extractions in parallel to speed up the process, with a limit of num_cpu parallel processes to avoid overloading the CPU. It uses a semaphore mechanism to limit the number of parallel extractions and waits for all extractions to complete before exiting.
If the type of tar archive changes (e.g. from .tar.bz2 to .tar.gz), the tar command in the extract_tar function should be updated accordingly, along with the suffix passed to basename and the file pattern in the for loop.
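The semaphore mechanism described above can be tried in isolation. The following is a minimal sketch, not part of the original script: max_jobs, demo_job, run_limited and the log file are hypothetical names introduced for illustration, and sleep stands in for a real extraction. It assumes Linux, where a FIFO can be opened read/write without blocking.

```shell
#!/bin/bash
# Sketch of a FIFO-based counting semaphore (illustrative names throughout).
max_jobs=2                 # hypothetical limit; the script uses num_cpu here
log=$(mktemp)              # records when each demo job starts and ends

fifo=$(mktemp -u)          # unused path for the named pipe
mkfifo "$fifo"
exec 3<>"$fifo"            # open read/write on descriptor 3

# One token (an empty line) per job that may run at the same time.
for ((i = 0; i < max_jobs; i++)); do
    echo >&3
done

# Stand-in for real work: log start, sleep briefly, log end.
demo_job() {
    echo "start $1" >>"$log"
    sleep 0.2
    echo "end $1" >>"$log"
}

# Run a command in the background once a token is available.
run_limited() {
    read -r -u 3           # take a token; blocks if none are left
    (
        "$@"
        echo >&3           # hand the token back when the command ends
    ) &
}

for n in 1 2 3 4; do
    run_limited demo_job "$n"
done

wait                       # all background jobs are done after this
exec 3>&-                  # close the descriptor
rm "$fifo"
```

Because a job writes its end line before returning its token, the log never shows more than max_jobs jobs between a start and its matching end.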
Note: The options used with the tar command may vary depending on the type of tar archive being extracted. Refer to the tar manual for more information.
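One way to handle several archive types is to pick the tar option from the file extension. This is only a sketch under assumptions: tar_flag_for and extract_any are hypothetical helpers that do not appear in the original script, and the set of extensions covered is an illustrative choice. The -j, -z and -J options select bzip2, gzip and xz respectively.

```shell
#!/bin/bash
# Hypothetical helper: print the tar decompression option for a file name.
tar_flag_for() {
    case "$1" in
        *.tar.bz2)       echo "j" ;;   # bzip2
        *.tar.gz|*.tgz)  echo "z" ;;   # gzip
        *.tar.xz)        echo "J" ;;   # xz
        *.tar)           echo ""  ;;   # uncompressed
        *)               return 1 ;;   # not a type this sketch knows
    esac
}

# Hypothetical variant of extract_tar that accepts any of the types above.
extract_any() {
    local file="$1" flag folder
    if ! flag=$(tar_flag_for "$file"); then
        echo "unsupported archive: $file" >&2
        return 1
    fi
    # Folder name = file name without its archive suffix.
    folder=$(basename "$file")
    folder=${folder%.tgz}
    folder=${folder%.tar*}
    mkdir -p "$folder"
    tar -x"${flag}"f "$file" -C "$folder"
}
```

With this in place, the loop in the script would call extract_any "$file" and iterate over a wider pattern such as *.tar.*, which is a change to the original, not something it already does.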