Last active
September 7, 2025 11:52
Simple bash script parallelization using semaphores
#!/bin/bash
# Polling semaphore: every slot is a directory below $SEMPATH,
# and mkdir decides atomically which process gets it.
SEMPATH="/tmp"
SEMNAME=""

# semtake <name> [count]: block until one of <count> slots is free, then hold it.
semtake() {
    local name="$1"
    [ -z "$name" ] && echo "Missing semaphore name!" >&2 && return 1
    local j="$2"
    [ -z "$2" ] && j=$(nproc)
    [ -n "$SEMNAME" ] && echo "Already have $SEMNAME" >&2 && return 1
    local i
    while true; do
        for i in $(seq 1 "$j"); do
            SEMNAME=".semlock-$name-$j-$i"
            mkdir "$SEMPATH/$SEMNAME" 2>/dev/null && break 2
        done
        # All slots are taken: poll again in a second.
        sleep 1
    done
    # Release the slot automatically when this shell exits.
    trap semgive EXIT
}

# semgive: release the slot taken by semtake, if any.
semgive() {
    [ -z "$SEMNAME" ] && return
    rmdir "$SEMPATH/$SEMNAME" &>/dev/null || true
    SEMNAME=""
}
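The slot acquisition above relies on `mkdir` either creating the directory or failing because it already exists, so only one process can win a given slot. The following is a minimal sketch of that primitive on its own; the slot name `.semlock-demo-1-1`, the helper `try_slot` and the temporary directory are assumptions made for illustration and are not part of the script.

```shell
#!/bin/bash
# Sketch: mkdir used as an atomic "try-lock". Names are hypothetical.
demo_dir="$(mktemp -d)"
slot="$demo_dir/.semlock-demo-1-1"

try_slot() {
    # mkdir fails if the directory already exists, so at most one caller succeeds.
    if mkdir "$slot" 2>/dev/null; then
        echo "acquired"
    else
        echo "busy"
    fi
}

first="$(try_slot)"    # slot is free
second="$(try_slot)"   # slot is already held
rmdir "$slot"          # release the slot, as semgive does
third="$(try_slot)"    # slot is free again

echo "$first $second $third"
rm -rf "$demo_dir"
```

Using a directory rather than a regular file keeps the check and the creation in a single step, which is why no separate "does it exist?" test is needed.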
#!/bin/bash
# inotify-based semaphore: same interface as the polling variant,
# but waiting processes sleep in inotifywait instead of polling.
# Check for sourcing first, because `return` is only valid in a sourced script.
! (return 0 2>/dev/null) && echo "$0 can only be sourced, not executed" >&2 && exit 1
[ -z "$(command -v inotifywait)" ] && echo "inotify-tools need to be installed for $0 to work!" >&2 && return 1
SEMPATH="/tmp"
[ ! -d "$SEMPATH" ] && echo "$SEMPATH is not a valid directory" >&2 && return 1
#SEMNAME=""
#SEMNAMEID=""

# Try each of the j slots once; on success SEMNAMEID holds the slot number.
semtake_pool() {
    local SEMNAME="$1"
    local j="$2"
    local i
    for i in $(seq 1 "$j"); do
        SEMNAMEID="$i"
        mkdir "$SEMPATH/$SEMNAME-$SEMNAMEID" 2>/dev/null || continue
        return 0
    done
    unset SEMNAMEID
    return 1
}

# semtake <name> [count]: block until one of <count> slots is free, then hold it.
semtake() {
    local name="$1"
    [ -z "$name" ] && echo "Missing semaphore name!" >&2 && return 1
    local j="$2"
    [ -z "$2" ] && j=$(nproc)
    [ -n "$SEMNAMEID" ] && echo "Already have $SEMNAME" >&2 && return 1
    SEMNAME=".semlock-$name"
    local i SEMWAITNAME
    until semtake_pool "$SEMNAME" "$j"; do
        # No free slot: queue up behind the highest-numbered waiter.
        i="$(find "$SEMPATH" -maxdepth 1 -type d -name "$SEMNAME-wait-*" 2>/dev/null | sed 's/^.*-\([[:digit:]]*\)$/\1/' | sort -n | tail -1)"
        [ -z "$i" ] && i=0
        while true; do
            SEMWAITNAME="$SEMNAME-wait-$i"
            i=$((i+1))
            mkdir "$SEMPATH/$SEMWAITNAME" &>/dev/null || continue
            break
        done
        # Sleep until semgive removes our waiter directory, then retry the pool.
        inotifywait --quiet --quiet --event delete_self "$SEMPATH/$SEMWAITNAME"
        rmdir "$SEMPATH/$SEMWAITNAME" &>/dev/null || true
    done
    # Release the slot automatically when this shell exits.
    trap semgive EXIT
}

# semgive: release the held slot and wake the oldest waiter, if there is one.
semgive() {
    [ -z "$SEMNAME" ] && return
    [ -z "$SEMNAMEID" ] && return
    rmdir "$SEMPATH/$SEMNAME-$SEMNAMEID" &>/dev/null || true
    unset SEMNAMEID
    local i
    i="$(find "$SEMPATH" -maxdepth 1 -type d -name "$SEMNAME-wait-*" 2>/dev/null | sed 's/^.*-\([[:digit:]]*\)$/\1/' | sort -n | head -1)"
    [ -z "$i" ] && return
    local SEMWAITNAME
    local waiter
    # Starting at the lowest number, remove the first waiter directory that still exists.
    for waiter in "$SEMPATH"/"$SEMNAME"-wait-*; do
        SEMWAITNAME="$SEMNAME-wait-$i"
        i=$((i+1))
        rmdir "$SEMPATH/$SEMWAITNAME" &>/dev/null || continue
        break
    done
    unset SEMNAME
}
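Both scripts end `semtake` with `trap semgive EXIT`, which is what returns the slot when the shell that took it terminates. A minimal sketch of that mechanism in isolation follows; it uses a hypothetical lock directory and a plain `rmdir` in place of the real `semtake` and `semgive`.

```shell
#!/bin/bash
# Sketch: an EXIT trap set inside a subshell runs when that subshell ends.
# The lock directory name is hypothetical.
work_dir="$(mktemp -d)"
lock="$work_dir/.semlock-demo-1"

(
    mkdir "$lock"                 # take the slot
    trap 'rmdir "$lock"' EXIT     # give it back automatically on exit
    [ -d "$lock" ] && echo "held inside subshell"
)                                 # the subshell exits here and the trap fires

if [ -d "$lock" ]; then
    state="still held"
else
    state="released"
fi
echo "$state"
rm -rf "$work_dir"
```

Because the trap belongs to the subshell, each parallel iteration releases only its own slot, and the parent script is unaffected.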
Author
semnotify.sh
Another implementation of semlock.sh.
It features the exact same usage as semlock.sh, so the instructions and documentation from semlock.sh apply.
This variant uses inotify-tools to notify the next waiting process that the semaphore is available.
This way we achieve two additional benefits:
- No busy waiting is required, as the processes wait passively for filesystem changes.
- Ordered execution, because the waiting line is now numbered and semaphores are handed out "first come, first served".
This can be used as a drop-in replacement for semlock.sh.
If you have inotify-tools installed, simply download semnotify.sh, rename it to semlock.sh and replace the other implementation.
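The numbered waiting line can be illustrated without inotify-tools. The sketch below reuses the `find | sed | sort -n` pipeline from semnotify.sh on a throwaway directory; the semaphore name `demo`, the waiter numbers and the helper `waiter_numbers` are made up for illustration.

```shell
#!/bin/bash
# Sketch: how numbered waiter directories yield a first-come, first-served order.
# The semaphore name and the waiter numbers are hypothetical.
queue_dir="$(mktemp -d)"
sem=".semlock-demo"
mkdir "$queue_dir/$sem-wait-0" "$queue_dir/$sem-wait-2" "$queue_dir/$sem-wait-10"

# Print the numeric suffix of every waiter directory, sorted numerically.
waiter_numbers() {
    find "$queue_dir" -maxdepth 1 -type d -name "$sem-wait-*" \
        | sed 's/^.*-\([[:digit:]]*\)$/\1/' | sort -n
}

oldest="$(waiter_numbers | head -1)"   # semgive starts waking from this number
newest="$(waiter_numbers | tail -1)"   # a new waiter queues up behind this number

echo "oldest=$oldest newest=$newest"
rm -rf "$queue_dir"
```

Sorting with `sort -n` matters here: a plain lexical sort would place `10` before `2` and break the ordering once more than ten processes have queued.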
semlock.sh
This bash script contains two functions that make parallelizing bash scripts very easy.
Motivation
Many semaphore implementations for bash (such as `parallel --sem`) force the user to pass the executed task as arguments, because the task has to be wrapped: the semaphore is taken before the task and given back after it.
This has the major drawback that an existing script has to be rewritten completely to fit the semaphore interface.
I had a lot of scripts, though, with a `for` loop iterating over multiple files, where each iteration could run in parallel but multiple commands had to be executed for each file.
So I created my own implementation of semaphores that can be wrapped around an entire code block within a bash script.
Interface Specification
The script defines two methods:
- `semtake <name> [count]` takes a semaphore with the name `name` and allows up to `count` processes to hold this semaphore at the same time. Setting `count` to 1 allows only one process at a time. The default `count` is the number of available processor threads.
- `semgive` returns the previously taken semaphore.
`semtake` may only be called once per shell, and `semgive` may only be called after `semtake` has been called earlier within the same shell.
`semtake` sets up a trap that gives the semaphore back when the shell exits, so you don't have to call `semgive` explicitly.
Migration
Let's assume you have a shell script with a `for` loop that processes its files one after another. The iterations could run in parallel, but your computer does not have the resources to process every file simultaneously, while it does have multiple threads that could be used.
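As a minimal sketch of such a sequential loop, assuming hypothetical input files and two placeholder `echo` steps standing in for the real per-file commands:

```shell
#!/bin/bash
# Sketch of a sequential loop; the file names and processing steps are hypothetical.
input_dir="$(mktemp -d)"
touch "$input_dir/a.txt" "$input_dir/b.txt" "$input_dir/c.txt"

processed=0
for file in "$input_dir"/*.txt; do
    # Several commands per file, executed one file at a time.
    base="$(basename "$file" .txt)"
    echo "step 1: reading $base"
    echo "step 2: writing $base.out"
    : > "$input_dir/$base.out"
    processed=$((processed + 1))
done
echo "processed $processed files"
rm -rf "$input_dir"
```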
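With `semlock.sh` only minimal refactoring is required for parallelization:
- Import the `semlock.sh` functions with `source semlock.sh`.
- Wrap the loop body in `(` and `) &` so that each iteration runs in a background subshell.
- Call `semtake` right after `(`. In `semtake fileprocess 2`, `fileprocess` is this semaphore's name, and the semaphore limits execution to `2` threads.
- Each subshell then waits in `semtake` for an available semaphore.
- Add `wait` after the loop to wait for all threads to finish.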
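Put together, a migrated loop might look like the following sketch. So that it runs on its own, it inlines a condensed stand-in for the polling `semtake` and `semgive` (with a shorter poll interval and a temporary `SEMPATH`) where a real script would simply use `source semlock.sh`. The file names, the `sleep` that stands in for real work, and the concurrency log are assumptions for illustration.

```shell
#!/bin/bash
# Sketch of the migrated loop. In a real script the block below would be
# replaced by `source semlock.sh`; this is a condensed, hypothetical stand-in.
SEMPATH="$(mktemp -d)"
semtake() {
    local name="$1" j="${2:-$(nproc)}" i
    while true; do
        for i in $(seq 1 "$j"); do
            SEMNAME=".semlock-$name-$j-$i"
            mkdir "$SEMPATH/$SEMNAME" 2>/dev/null && break 2
        done
        sleep 0.1
    done
    trap semgive EXIT
}
semgive() {
    [ -z "$SEMNAME" ] && return
    rmdir "$SEMPATH/$SEMNAME" &>/dev/null || true
    SEMNAME=""
}

work_dir="$(mktemp -d)"
touch "$work_dir/a.txt" "$work_dir/b.txt" "$work_dir/c.txt" "$work_dir/d.txt"

for file in "$work_dir"/*.txt; do
(
    semtake fileprocess 2          # at most 2 subshells get past this line
    # Record how many slots are held right now.
    ls -A "$SEMPATH" | wc -l >> "$work_dir/concurrency.log"
    sleep 0.3                      # stands in for the real per-file work
    : > "${file%.txt}.out"
) &
done
wait                               # block until every background subshell is done

done_count="$(find "$work_dir" -name '*.out' | wc -l)"
max_slots="$(sort -n "$work_dir/concurrency.log" | tail -1)"
echo "finished $done_count files, at most $max_slots at once"
rm -rf "$work_dir" "$SEMPATH"
```

No explicit `semgive` is needed inside the loop body, since the trap installed by `semtake` releases the slot as each subshell exits.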