@hrwgc
Last active June 19, 2023 15:32
aws-cli get total size of all objects within s3 prefix. (mimic behavior of `s3cmd du` with aws-cli)
#!/bin/bash
# Print total size in bytes and object count under an S3 URI (s3://bucket/prefix).
function s3du(){
  local bucket prefix
  # "s3://bucket/a/b" split on "/": field 3 is the bucket, fields 4+ the prefix.
  bucket=$(cut -d/ -f3 <<< "$1")
  prefix=$(cut -d/ -f4- <<< "$1")
  aws s3api list-objects --bucket "$bucket" --prefix="$prefix" --output json \
    --query '[sum(Contents[].Size), length(Contents[])]' \
    | jq '{size: .[0], num_objects: .[1]}'
}
s3du "$1"
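For readers who want to see the URI handling in isolation, here is a minimal sketch of the same bucket/prefix split done with bash parameter expansion rather than external tools. The helper name `split_s3_uri` and the example URI are made up for illustration and are not part of the gist.

```shell
#!/bin/bash
# Hypothetical helper (not part of the gist): split s3://bucket/prefix
# into its bucket and prefix using only bash parameter expansion.
split_s3_uri() {
  local rest="${1#s3://}"      # drop the scheme
  local bucket="${rest%%/*}"   # everything before the first "/"
  local prefix=""
  # Only take a prefix if there is a "/" after the bucket name.
  [[ "$rest" == */* ]] && prefix="${rest#*/}"
  printf '%s\n%s\n' "$bucket" "$prefix"
}

# Prints the bucket on the first line and the prefix on the second.
split_s3_uri "s3://my-bucket/logs/2016/"
```

A bare bucket URI such as `s3://my-bucket` yields an empty prefix, which makes the listing cover the whole bucket.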
@joech4n

joech4n commented Jan 29, 2016

If you want to sum by top level prefixes within a bucket, you can also try something like this
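The linked snippet is not reproduced on this page. As an independent sketch (not joech4n's code), one way to get per-prefix totals might be to ask S3 for the top-level common prefixes with `--delimiter /` and then sum object sizes under each one. The function name `s3du_by_prefix` is an assumption made for illustration.

```shell
#!/bin/bash
# Independent sketch, not the snippet joech4n linked to.
# Prints "<bytes><TAB><prefix>" for each top-level prefix in a bucket.
s3du_by_prefix() {
  local bucket="$1" p size
  # --delimiter / makes S3 group keys into top-level "folders" (CommonPrefixes).
  aws s3api list-objects-v2 --bucket "$bucket" --delimiter / \
      --output text --query 'CommonPrefixes[].Prefix' \
    | tr '\t' '\n' \
    | while IFS= read -r p; do
        # Text output prints "None" when there are no common prefixes.
        if [[ -z "$p" || "$p" == "None" ]]; then continue; fi
        # JSON output lets the CLI apply the query to the combined listing.
        size=$(aws s3api list-objects-v2 --bucket "$bucket" --prefix "$p" \
                 --output json --query 'sum(Contents[].Size)' < /dev/null)
        printf '%s\t%s\n' "$size" "$p"
      done
}
```

Called as `s3du_by_prefix my-bucket` (a placeholder bucket name), this makes one listing call per prefix, so it can be slow on buckets with many top-level prefixes. Objects stored at the bucket root, outside any prefix, are not counted.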

@sasikanumuri

I tried running the script as is, but I couldn't get it to work. Can you help me with this? Thanks.

@clintval

clintval commented Dec 21, 2017

My not-so-terse adaptation for printing human-readable sizes:

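function aws3du(){
  local bucket prefix
  # "s3://bucket/a/b" split on "/": field 3 is the bucket, fields 4+ the prefix.
  bucket=$(cut -d/ -f3 <<< "$1")
  prefix=$(cut -d/ -f4- <<< "$1")
  # Caveat: with --output text the CLI applies --query to each page of results,
  # so a prefix spanning several pages yields one size/count line per page.
  aws s3api \
    list-objects \
    --bucket "$bucket" \
    --prefix="$prefix" \
    --output text \
    --query '[sum(Contents[].Size), length(Contents[])]' \
    | while read -r size num_objects; do
      # numfmt (GNU coreutils) renders the byte count with an SI suffix.
      jq -n --arg size "$(numfmt --to=si "$size")" --argjson num_objects "$num_objects" \
        '{size: $size, num_objects: $num_objects}'
     done
}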

Usage:

❯ aws3du ${s3path}
{
  "size": "328K",
  "num_objects": 1
}
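One caveat with `--output text`: the AWS CLI documentation notes that the `--query` expression is then evaluated once per page of results, so a prefix holding more than one page of objects would produce several partial size/count lines instead of one. A minimal sketch of folding those lines back into a single total follows. The helper name `sum_pages` and the sample numbers are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: add up "size count" lines, one per result page,
# as produced by --output text with the [sum(...), length(...)] query.
sum_pages() {
  awk '{ size += $1; count += $2 } END { printf "%.0f %.0f\n", size, count }'
}

# Simulated two-page output (made-up numbers) standing in for the aws call.
printf '1000\t2\n500\t1\n' | sum_pages
```

Placing `sum_pages` between the `aws s3api` call and the `while read` loop would presumably give the loop a single combined line to format.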

This would be simpler if something like jqlang/jq#147 were implemented.
