@brendano
Created September 19, 2011 03:30
Condor file parallel job script
#!/bin/zsh
# vim:ft=zsh
# Say you want 10 parallel processes. Condor submit file should look like:
#
# Executable = ./par_run.sh
# Arguments = $(Process) 10 TheFileList /output/directory
# Queue 10
#
# Then each process handles about 1 in 10 of the files listed in TheFileList
# (one input path per line).
set -eu
myproc=$1
numproc=$2
filelist=$3
outdir=$4
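# Per-process sublist; host and pid keep concurrent jobs from clobbering each other.
myfilelist=$filelist.proc=$myproc.host=$(hostname -s).pid=$$
# Round-robin split: keep the lines whose number mod numproc equals this process id.
awk "NR % $numproc == $myproc" < $filelist > $myfilelist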
for infile in $(cat $myfilelist); {
  # :r drops the extension, :t drops the directory (e.g. /data/x.gz -> x)
  outfile=${infile:r:t}.smalltweet
  cmd="cat $infile | zcat | python hose_filter.py > $outdir/$outfile"
  date
  echo "$cmd"
  (time (eval $cmd) ) 2>&1 ## Comment out for testing
}
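
The awk line in the script splits the file list round-robin by line number. Here is a minimal sketch of that split, assuming a hypothetical 7-entry list and 3 processes. The `shard` helper and the file names are made up for illustration and are not part of the original script, and the sketch is written for plain `sh` rather than zsh.

```shell
#!/bin/sh
# Illustration only: show which lines of a file list each process would pick
# when using the same "NR % numproc == myproc" test as the script above.
set -eu

# shard MYPROC NUMPROC FILELIST: print the lines assigned to process MYPROC.
# (Hypothetical helper; the original script inlines this awk call.)
shard() {
    awk -v p="$1" -v n="$2" 'NR % n == p' "$3"
}

# Made-up file list with 7 entries, one path per line.
list=$(mktemp)
trap 'rm -f "$list"' EXIT
printf '%s\n' a.gz b.gz c.gz d.gz e.gz f.gz g.gz > "$list"

# Condor numbers processes from 0, so ids 0..2 cover every remainder mod 3.
for p in 0 1 2; do
    echo "process $p: $(shard "$p" 3 "$list" | paste -sd ' ' -)"
done
# process 0: c.gz f.gz
# process 1: a.gz d.gz g.gz
# process 2: b.gz e.gz
```

Every line lands in exactly one shard, and shard sizes differ by at most one file, so no coordination between the Condor processes is needed.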