@ATpoint
Last active October 9, 2024 09:00
Compare the md5sums recorded in a file (output of md5sum) with the current md5sums of those files
#!/bin/bash
# Read existing md5sums and use them to confirm the integrity of fastq files
#SBATCH --nodes=1
#SBATCH --cpus-per-task=36
#SBATCH --partition=normal
#SBATCH --time=08:00:00
#SBATCH --mail-user=[email protected]
#SBATCH --job-name=md5checker
module load palma/2021a GCCcore/10.3.0 parallel/20210622
set -euo pipefail
DIR="/path/to/parent/dir/"
JOBS="$SLURM_CPUS_PER_TASK"
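# Compare the md5sums listed in an MD5 file with those of the files in the same directory.
# Prints to stdout whether the directory is ok or has errors.
function CompareMd5Sums {
  local MD5FILE="$1"
  local BASEDIR
  BASEDIR="$(dirname "$MD5FILE")"
  # Dated log path; this function does not write to it, results go to stdout.
  local OUTLOG="${BASEDIR}/$(date +%Y%m%d)_md5check.txt"
  # Prefix each filename with its directory so md5sum can find it from any working directory.
  # $2 is the second whitespace-separated field, so filenames containing spaces are not handled.
  awk -v basedir="$BASEDIR" '{print $1, basedir"/"$2}' < "$MD5FILE" \
    | md5sum -c --status /dev/stdin \
    && echo "$BASEDIR is ok" || echo "$BASEDIR has errors"
}; export -f CompareMd5Sums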
find "$DIR" -type f -name 'MD5.txt' | parallel -j "$JOBS" "CompareMd5Sums {}"
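
A minimal sketch of the same check run on throwaway data, which may help when trying the logic outside of SLURM. It assumes GNU coreutils (`md5sum`, `mktemp`) and does not need GNU parallel. The function name `check_md5_file` and the `good`/`bad` directories are hypothetical and used only for illustration; they are not part of the script above.

```shell
#!/bin/bash
# Illustrative demo only: build two small directories with an MD5.txt each,
# corrupt one file, and run the directory-prefixed md5sum check on both.
set -euo pipefail

# check_md5_file is a hypothetical helper name for this sketch.
check_md5_file() {
  local md5file="$1"
  local basedir
  basedir="$(dirname "$md5file")"
  # Rewrite "hash  name" to "hash  dir/name" so the check works from any cwd.
  if awk -v basedir="$basedir" '{print $1 "  " basedir "/" $2}' "$md5file" \
      | md5sum -c --status -; then
    echo "ok"
  else
    echo "errors"
  fi
}

demo="$(mktemp -d)"
mkdir -p "$demo/good" "$demo/bad"
printf 'ACGT\n' > "$demo/good/a.fastq"
printf 'ACGT\n' > "$demo/bad/a.fastq"

# Record checksums with relative names, as md5sum would inside each directory.
(cd "$demo/good" && md5sum a.fastq > MD5.txt)
(cd "$demo/bad" && md5sum a.fastq > MD5.txt)

# Simulate corruption after the checksum was recorded.
printf 'TTTT\n' >> "$demo/bad/a.fastq"

good_result="$(check_md5_file "$demo/good/MD5.txt")"
bad_result="$(check_md5_file "$demo/bad/MD5.txt")"
echo "good: $good_result"
echo "bad: $bad_result"

rm -rf "$demo"
```

The sketch uses an explicit `if ... else` rather than `&& ... ||`, so the "errors" branch is only taken when the check itself fails, not when the first `echo` does.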