@JProffitt71
Created February 17, 2014 04:29
Uses s3cmd to manually remove files older than a specified age (##d|m|y) in a specified bucket
#!/bin/bash
# Usage: ./s3cmdclearfiles "bucketname" "30d"
# Note: uses BSD date flags (-j, -f, -v), i.e. macOS/BSD systems
s3cmd ls "s3://$1" | grep -v " DIR " | while read -r line; do
  # First two columns of the listing are the object's date and time
  createDate=$(echo "$line" | awk '{print $1" "$2}')
  createDate=$(date -j -f "%Y-%m-%d %H:%M" "$createDate" +%s)
  # Cutoff: now minus the given period, e.g. -v-30d
  olderThan=$(date -j -v-"$2" +%s)
  if [[ $createDate -lt $olderThan ]]; then
    fileName=$(echo "$line" | awk '{print $4}')
    if [[ $fileName != "" ]]; then
      printf 'Deleting "%s"\n' "$fileName"
      s3cmd del "$fileName"
    fi
  fi
done
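For context, the awk field numbers above come from the s3cmd ls listing, which looks roughly like this (columns: date, time, size, object URI; the grep -v drops the DIR rows):

2014-02-17 04:29      1024   s3://bucketname/backups/somefile.txt
                       DIR   s3://bucketname/backups/archive/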
@samjco commented Oct 16, 2018

Where do we edit the date and the bucket within your code?

@NitsPatel1

How do we remove files from a subfolder of the bucket?
e.g. "bucket/subfolder/filename". I want to delete files from "subfolder" that are older than 30 days.

@KeyCoreKH

Where do we edit the date and the bucket within your code?

You don't ;)
You use the script by firing the command ./s3cmdclearfiles "bucketname" "30d".
So if you want a different cutoff, you change the "30d" part.
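For reference, the "##d|m|y" period from the gist description maps directly onto BSD date's -v offset, so (assuming the script is saved as s3cmdclearfiles and made executable) invocations look like:

./s3cmdclearfiles "mybucket" "30d"   # delete objects older than 30 days
./s3cmdclearfiles "mybucket" "6m"    # delete objects older than 6 months
./s3cmdclearfiles "mybucket" "1y"    # delete objects older than 1 year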

@pat-s commented Nov 25, 2020

Thanks for this script!
For others arriving here, note that the date used above is BSD date.

Here is a GNU date approach:

#!/bin/bash

# Usage: ./s3cmdclearfiles "bucketname" "7 days"

s3cmd ls "s3://$1" | grep -v " DIR " | while read -r line; do
  createDate=$(echo "$line" | awk '{print $1" "$2}')
  createDate=$(date -d "$createDate" "+%s")
  olderThan=$(date -d "$2 ago" "+%s")
  if [[ $createDate -le $olderThan ]]; then
    fileName=$(echo "$line" | awk '{print $4}')
    # [[ ]] is safe with an empty variable, unlike the single-bracket test
    if [[ $fileName != "" ]]; then
      printf 'Deleting "%s"\n' "$fileName"
      s3cmd del "$fileName"
    fi
  fi
done
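The flag difference in a nutshell, for anyone switching between systems (the timestamp below is just an example):

# BSD date (macOS): -j -f parses a date, -v applies an offset
date -j -f "%Y-%m-%d %H:%M" "2014-02-17 04:29" +%s
date -j -v-30d +%s
# GNU date (Linux coreutils): -d handles both parsing and relative phrases
date -d "2014-02-17 04:29" +%s
date -d "30 days ago" +%s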

@tobiasboyd

How do we remove files from a subfolder of the bucket?
e.g. "bucket/subfolder/filename". I want to delete files from "subfolder" that are older than 30 days.

@NitsPatel1 I confirmed you can pass in a path with subdirectories (by commenting out the actual s3cmd del to see what was going to be deleted), so no need to worry; you can call it as ./s3cmdclearfiles "bucket/path/to/somewhere" "7d".
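If you want the same dry run, a minimal sketch is to disable the delete inside the loop and print instead:

# s3cmd del "$fileName"   # disabled while testing
printf 'Would delete "%s"\n' "$fileName"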

@srikoushik

Example:
BUCKET_NAME/home/1.jpg
BUCKET_NAME/home/2.jpg
BUCKET_NAME/world/3.jpg
BUCKET_NAME/hello/4.jpg

I want to delete the files under these paths recursively, but exclude certain paths; in this case, I want to exclude hello/. How do I achieve that with this script?
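One possible approach (an untested sketch, not from the original author): list with --recursive and skip excluded prefixes inside the loop before the age check; "hello/" below stands in for any prefix you want to keep:

s3cmd ls "s3://$1" --recursive | grep -v " DIR " | while read -r line; do
  fileName=$(echo "$line" | awk '{print $4}')
  case "$fileName" in
    "s3://$1/hello/"*) continue ;;   # skip anything under the excluded prefix
  esac
  # ... date comparison and s3cmd del as in the scripts above ...
done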

@kp-emagine

This worked better for me. Change remote and keep below for the bucket and the number of days to keep:

remote="s3://thebucket"
keep=14

s3cmd ls "$remote" --recursive | while read -r line; do
  createDate=$(echo "$line" | awk '{print $1}')
  createDate=$(date -d "$createDate" +%s)
  olderThan=$(date --date "$keep days ago" +%s)
  if [[ $createDate -lt $olderThan ]]; then
    fileName=$(echo "$line" | awk '{print $4}')
    if [[ $fileName != "" ]]; then
      printf 'Deleting "%s"\n' "$fileName"
      s3cmd del "$fileName"
    fi
  fi
done

@eazylaykzy

I'm running it in a container as a cron job (using the crazymax/swarm-cronjob image). I had a problem with date inside the container; below is my setup.

Service

version: '3.9'

services:
  clean_spaces:
    image: d3fk/s3cmd
    entrypoint: ash /root/clean_spaces.sh $do_spaces_bucket/$backup_prefix 2d
    volumes:
      - ./:/s3
      - ./:/root
    deploy:
      mode: replicated
      replicas: 0
      labels:
        swarm.cronjob.replicas: 1
        swarm.cronjob.enable: "true"
        swarm.cronjob.skip-running: "false"
        swarm.cronjob.schedule: "0 */40 * * * *"
      restart_policy:
        condition: none
    labels:
      logging: "promtail"
      logging_jobname: "swarm_clean_spaces_logs"

Script

#!/bin/ash

# remove old DO-spaces files to free up space
# Usage: ./clean_spaces.sh "bucketname" "30d"

# install coreutils for GNU date
apk add --update coreutils

days=${2%d} # strip the trailing "d" so "2d" becomes "2", which GNU date expects

s3cmd ls s3://"$1" --recursive | while read -r line; do
  createDate=$(echo "$line" | awk '{print $1}')
  olderThan=$(date -d "$days days ago" '+%Y-%m-%d')

  createDate=${createDate//-/} # strip "-" so dates compare as YYYYMMDD integers
  olderThan=${olderThan//-/}

  fileName=$(echo "$line" | awk '{print $4}')

  if [[ $createDate -lt $olderThan ]] && [[ $fileName != *".pbm.init"* ]]; then
    echo "File Name is: $fileName"
    if [[ $fileName != "" ]]; then
      printf 'Deleting "%s"\n' "$fileName"
      s3cmd del "$fileName"
    fi
  fi
done

@ajdevtechgithub

Hey Team,

I want to do the same thing, but my S3 is on Wasabi, not AWS. I want to delete data older than 60 days under a path like "bucketnameA/B/C/ files *", but it is not working well; I get an error converting Wasabi's last-modified time into a Unix timestamp.
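For what it's worth, s3cmd can be pointed at Wasabi with its --host options (a sketch, assuming the us-east-1 endpoint; adjust for your region), and the listing should keep the same "YYYY-MM-DD HH:MM" leading columns the scripts above parse:

s3cmd ls "s3://bucketnameA/B/C/" --recursive \
  --host=s3.wasabisys.com \
  --host-bucket="%(bucket)s.s3.wasabisys.com"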
