Last active
August 14, 2024 13:56
A Bash script to compute ETag values for S3 multipart uploads on OS X.
#!/bin/bash
if [ $# -ne 2 ]; then
    echo "Usage: $0 file partSizeInMb";
    exit 0;
fi
file=$1
if [ ! -f "$file" ]; then
    echo "Error: $file not found."
    exit 1;
fi
partSizeInMb=$2
fileSizeInMb=$(du -m "$file" | cut -f 1)
parts=$((fileSizeInMb / partSizeInMb))
if [[ $((fileSizeInMb % partSizeInMb)) -gt 0 ]]; then
    parts=$((parts + 1));
fi
checksumFile=$(mktemp -t s3md5)
for (( part=0; part<$parts; part++ ))
do
    skip=$((partSizeInMb * part))
    $(dd bs=1m count=$partSizeInMb skip=$skip if="$file" 2>/dev/null | md5 >>$checksumFile)
done
echo $(xxd -r -p $checksumFile | md5)-$parts
rm $checksumFile
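A typical use for the value this script prints is comparing it with the ETag that S3 reports for the uploaded object. The following is a minimal sketch, not part of the gist: `etag_matches` is a hypothetical helper, and the bucket, key, and script file name in the commented wiring are placeholders. It assumes the AWS CLI is installed and that the part size passed to the script matches the one used for the upload. S3 returns the ETag wrapped in double quotes, so the helper strips them before comparing.

```bash
# Hypothetical helper: succeed if a locally computed ETag equals the one
# reported by S3. Surrounding double quotes are stripped from both values.
etag_matches() {
  local_etag=$(printf '%s' "$1" | tr -d '"')
  remote_etag=$(printf '%s' "$2" | tr -d '"')
  [ "$local_etag" = "$remote_etag" ]
}

# Example wiring (not run here; my-bucket, backup.tar and s3etag.sh are
# placeholders for your own bucket, object key and copy of the script):
#   remote=$(aws s3api head-object --bucket my-bucket --key backup.tar \
#              --query ETag --output text)
#   computed=$(./s3etag.sh backup.tar 8)
#   etag_matches "$computed" "$remote" && echo "match" || echo "MISMATCH"
```

If the values differ, the first thing to check is usually the part size, since a different part size changes both the hash and the `-N` suffix.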
Thank you @skchronicles.
I think there's an error in the parts calculation; it is now fixed below.
https://gist.github.com/emersonf/7413337?permalink_comment_id=3244707#gistcomment-3244707
#!/bin/bash
set -euo pipefail
if [ $# -ne 2 ]; then
    echo "Usage: $0 file partSizeInMb";
    exit 0;
fi
file=$1
if [ ! -f "$file" ]; then
    echo "Error: $file not found."
    exit 1;
fi
partSizeInMb=$2
partSizeInB=$((partSizeInMb * 1024 * 1024)) ### I added this
fileSizeInB=$(du -b "$file" | cut -f 1) ### I edited this
parts=$((fileSizeInB / partSizeInB)) ### I edited this and the next line
if [[ $((fileSizeInB % partSizeInB)) -gt 0 ]]; then
    parts=$((parts + 1));
fi
checksumFile=$(mktemp -t s3md5.XXXXXXXXXXXXX)
for (( part=0; part<$parts; part++ ))
do
    skip=$((partSizeInMb * part))
    $(dd bs=1M count=$partSizeInMb skip=$skip if="$file" 2> /dev/null | md5sum >> $checksumFile)
done
etag=$(echo $(xxd -r -p $checksumFile | md5sum)-$parts | sed 's/ --/-/')
echo -e "${1}\t${etag}"
rm $checksumFile
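The byte-based part count used above can be checked in isolation. The sketch below is an illustration rather than part of either script: `file_size_bytes` and `part_count` are hypothetical names. It reads the size with `wc -c` instead of `du -b`, because `du -b` is a GNU option that OS X's `du` does not have, and it folds the divide-then-add-one step into a single ceiling division.

```bash
# Hypothetical helper: apparent file size in bytes.
# wc -c pads its output with spaces on some systems, so strip them.
file_size_bytes() {
  wc -c < "$1" | tr -d ' '
}

# Hypothetical helper: number of parts for a given size in bytes and
# part size in MiB, rounding up so a trailing partial part is counted.
part_count() {
  size_bytes=$1
  part_bytes=$(($2 * 1024 * 1024))
  echo $(( (size_bytes + part_bytes - 1) / part_bytes ))
}

# Example: a file of exactly 8 MiB is one 8 MiB part,
# and one extra byte pushes it to two parts.
part_count 8388608 8
part_count 8388609 8
```

Working from the apparent size in bytes avoids depending on how `du -m` rounds or on how much disk space the file system allocated for the file.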
Thanks, this is quite useful.
I modified the script to speed up the hash computation and avoid generating temporary files. Link to script
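The linked script is not reproduced here. As a rough idea of how the temporary file can be avoided, the sketch below pipes the per-part digests straight into the final hash. All function names are hypothetical. It assumes `md5sum` or `md5` is available, uses `xxd` when present with a plain `printf` fallback otherwise, and passes the block size to `dd` in bytes so the same command works with both GNU and BSD `dd`. It makes no attempt at the speed-up mentioned above.

```bash
# Sketch only: md5_hex, hex_to_bin and s3_etag are hypothetical names.

# MD5 of stdin as 32 hex characters, using md5sum (Linux) or md5 (OS X).
md5_hex() {
  if command -v md5sum >/dev/null 2>&1; then
    md5sum | cut -d ' ' -f 1
  else
    md5
  fi
}

# Turn lines of hex digits on stdin into raw bytes on stdout.
hex_to_bin() {
  if command -v xxd >/dev/null 2>&1; then
    xxd -r -p
  else
    # Fallback without xxd: print each hex pair as an octal escape.
    while read -r line || [ -n "$line" ]; do
      while [ -n "$line" ]; do
        rest=${line#??}
        [ "$rest" = "$line" ] && break   # odd trailing digit: stop
        printf "\\$(printf '%03o' "0x${line%"$rest"}")"
        line=$rest
      done
    done
  fi
}

# Print "<md5 of concatenated part digests>-<part count>" for a file.
# Arguments: file path, part size in MiB. Runs in a subshell so its
# variables do not leak into the caller.
s3_etag() (
  file=$1
  part_mb=$2
  size=$(wc -c < "$file" | tr -d ' ')
  part_bytes=$((part_mb * 1048576))
  parts=$(( (size + part_bytes - 1) / part_bytes ))
  part=0
  digest=$(
    while [ "$part" -lt "$parts" ]; do
      dd bs=1048576 count="$part_mb" skip=$((part * part_mb)) \
         if="$file" 2>/dev/null | md5_hex
      part=$((part + 1))
    done | hex_to_bin | md5_hex
  )
  echo "$digest-$parts"
)
```

With these definitions loaded, `s3_etag backup.tar 8` would print the value for a hypothetical `backup.tar` uploaded in 8 MiB parts.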
Linux users
Here is an equivalent script if you are not using OS X. I hope this helps!