How to parallelize work across temporary directories using full paths and a wrapper script. This assumes every processing script reads and writes in its current working directory.
The use case is software that does not itself take advantage of multiple cores but operates on independent data, so the data can be split and each piece processed separately.
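As a rough illustration of "split and process separately", here is a minimal sketch that splits a file into one directory per chunk. It assumes GNU split (the `--filter` option is a GNU extension); the file name `input.txt`, the `tmp/chunk.` prefix, and the sample records are made up for the example and are not taken from the gist's scripts.

```shell
set -eu
workdir="$(mktemp -d)"   # scratch area so the sketch does not touch your checkout
cd "$workdir"

# Six independent records, one per line (made-up sample data).
printf '%s\n' rec1 rec2 rec3 rec4 rec5 rec6 > input.txt

# GNU split passes the would-be output name to the filter command as $FILE.
# Instead of writing a plain file, each chunk gets its own directory
# containing a datafile.txt, so it can later be processed in isolation.
split -l 2 --filter='mkdir -p "$FILE" && cat > "$FILE/datafile.txt"' input.txt tmp/chunk.

find tmp -type f | sort
```

Each resulting directory holds two records and can be handed to a separate process.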
git clone https://gist.github.com/4e1bafb3c464879668d3.git xargs_example
cd xargs_example
./main.sh
find tmp
Clean up:
rm -rf tmp
Or alternatively:
git clean -xfd
- main.sh - the main script to launch all parallelized processing.
- setup.sh - creates the temporary directories and sets up arguments for wrapper.sh.
- filter.sh - used by the split command to output data into different directories.
- wrapper.sh - wraps all processing scripts. This script is designed to be executed by xargs with parallelism. It will cd to the temporary directory and then execute all processing scripts in the context of that working directory.
- process.sh - an example processing script that uses the current working directory for output.
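The way these scripts fit together can be sketched roughly as follows. This is a hypothetical stand-in, not the gist's actual code: the `setup` function, the generated `wrapper.sh` body, and the use of `md5sum` (GNU coreutils) are assumptions chosen to mimic the roles described above.

```shell
set -eu
workdir="$(mktemp -d)"   # scratch area so the sketch does not touch your checkout
cd "$workdir"

# Stand-in for setup.sh: create one temporary directory per job and print
# two lines per job: the full directory path, then one argument.
setup() {
  mkdir -p tmp
  for arg in somearg1 somearg2 somearg3; do
    mktemp -d "$PWD/tmp/tmp.XXXXXXXXXX"   # prints the directory it created
    echo "$arg"
  done
}

# Stand-in for wrapper.sh and process.sh combined: cd into the directory
# given as $1, then do all work relative to that working directory.
cat > wrapper.sh <<'EOF'
#!/bin/sh
set -eu
cd "$1"
echo "$2" > datafile.txt
md5sum datafile.txt > datafile.txt.md5
EOF
chmod +x wrapper.sh

# Two input lines become the two arguments of each wrapper.sh run.
setup | xargs -P0 -L2 ./wrapper.sh

find tmp -type f | sort
```

Because setup prints full paths, each wrapper.sh run can cd to its directory no matter where xargs started it.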
./setup.sh | xargs -P0 -l2 ./wrapper.sh
xargs options:
- -P0 launches as many parallel processes as possible. If you want to limit it to, say, 4 concurrent processes then change it to -P4.
- -l2 is the maximum number of input lines to use per run of the script. My setup.sh script echoes 2 lines per job, so -l2 tells xargs to read two lines and pass both in as arguments to the script. You can pass N arguments per run of wrapper.sh this way by echoing N lines and using -lN. Note that -l is GNU's deprecated spelling of the POSIX -L option.
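To see the line grouping in isolation, here is a small sketch that swaps wrapper.sh for `echo`. The input values (`dirA`, `arg1`, and so on) are placeholders, and `-L2` is used as the POSIX spelling of `-l2`.

```shell
# Each pair of input lines becomes the two arguments of one echo command.
pairs="$(printf '%s\n' dirA arg1 dirB arg2 | xargs -L2 echo)"
printf '%s\n' "$pairs"
# prints:
#   dirA arg1
#   dirB arg2

# With -P4, up to four commands run at once, so the order of the output
# lines is not guaranteed; sort before comparing.
parallel_out="$(printf '%s\n' dirA arg1 dirB arg2 dirC arg3 | xargs -P4 -L2 echo | sort)"
printf '%s\n' "$parallel_out"
```

The pairing of lines into arguments is the same with or without -P; only the order in which the commands finish can change.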
Here's the output of that command (run here through main.sh):
^_^[sam@autopsy:~/sandbox/xargs_example]$ ls
main.sh process.sh README.md setup.sh wrapper.sh
^_^[sam@autopsy:~/sandbox/xargs_example]$ ./main.sh
$1: tmp/tmp.kEnlmRbamS
$2: somearg1
$1: tmp/tmp.dRdpOpaJUS
$2: somearg2
$1: tmp/tmp.K2fI8YWwqP
$2: somearg3
^_^[sam@autopsy:~/sandbox/xargs_example]$ ls
main.sh process.sh README.md setup.sh tmp wrapper.sh
^_^[sam@autopsy:~/sandbox/xargs_example]$ find tmp
tmp
tmp/tmp.kEnlmRbamS
tmp/tmp.kEnlmRbamS/datafile.txt
tmp/tmp.kEnlmRbamS/datafile.txt.md5
tmp/tmp.dRdpOpaJUS
tmp/tmp.dRdpOpaJUS/datafile.txt
tmp/tmp.dRdpOpaJUS/datafile.txt.md5
tmp/tmp.K2fI8YWwqP
tmp/tmp.K2fI8YWwqP/datafile.txt
tmp/tmp.K2fI8YWwqP/datafile.txt.md5