AWS documentation
- https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/get-bucket-policy.html
- https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/put-bucket-policy.html
get-bucket-policy returns the policy as a JSON-encoded string inside a "Policy" field, so a second json.loads is needed to pretty-print it. A Python script to parse the policy out of a get-bucket-policy response:
#!/usr/bin/env python
# This is bucket_policy_doc.py
# Read a get-bucket-policy response on stdin and pretty-print the
# embedded policy document on stdout.
import json
from json.decoder import JSONDecodeError
import sys

try:
    policy = json.loads(sys.stdin.read())
    policy_dict = json.loads(policy["Policy"])
    sys.stdout.write(json.dumps(policy_dict, indent=2))
    sys.stdout.write("\n")
except (JSONDecodeError, KeyError) as err:
    # buckets without a policy yield empty or error input; report and exit nonzero
    sys.stderr.write("no policy parsed: {}\n".format(err))
    sys.exit(1)
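For a single bucket the script works in a pipeline, e.g. (the bucket name here is illustrative):
aws s3api get-bucket-policy --bucket org-us-east-1-project-app-dev | ./bucket_policy_doc.py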
Iterate over buckets to get the bucket policy documents.
#!/usr/bin/env bash
for bucket in $(aws s3 ls | awk '{ print $3 }'); do
  echo "$bucket"
  # buckets without a policy return an error (NoSuchBucketPolicy) and an empty file
  aws s3api get-bucket-policy --bucket "$bucket" > "bucket_policy_${bucket}.json"
  ./bucket_policy_doc.py < "bucket_policy_${bucket}.json" > "bucket_policy_doc_${bucket}.json"
done
Edit the bucket_policy_doc_*.json content and then try to PUT it back to the bucket, e.g.
aws s3api put-bucket-policy --bucket XYZ --policy file://bucket_policy_doc_XYZ.json
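The policy document itself has the usual IAM policy shape; a minimal illustrative example (the account ID, role, and bucket names are placeholders):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadFromAppRole",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/app-role"
      },
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::org-us-east-1-project-app-dev",
        "arn:aws:s3:::org-us-east-1-project-app-dev/*"
      ]
    }
  ]
}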
A script to dump a set of bucket metadata documents for a list of buckets:
#!/usr/bin/env bash
SCRIPT_PATH=$(dirname "$0")
SCRIPT_PATH=$(readlink -f "$SCRIPT_PATH")
export FORMAT=json
export AWS_DEFAULT_OUTPUT=$FORMAT
# an aws-profile must be active to access buckets
# shellcheck disable=SC2034
s3_buckets=(
  org-us-east-1-project-app-dev
  org-us-east-1-project-app-prod
)
s3_commands=(
  get-bucket-accelerate-configuration
  get-bucket-acl
  get-bucket-cors
  get-bucket-encryption
  get-bucket-lifecycle
  get-bucket-lifecycle-configuration
  get-bucket-location
  get-bucket-logging
  get-bucket-notification
  get-bucket-notification-configuration
  get-bucket-ownership-controls
  get-bucket-policy
  get-bucket-policy-status
  get-bucket-replication
  get-bucket-request-payment
  get-bucket-tagging
  get-bucket-versioning
  get-bucket-website
)
# shellcheck disable=SC1090
source ~/bin/aws_profile.sh
aws-profile org-main-e1
for s3_bucket in "${s3_buckets[@]}"; do
  for s3_command in "${s3_commands[@]}"; do
    output_path="${SCRIPT_PATH}/s3-bucket-metadata/${s3_command}"
    output_file="${output_path}/${s3_bucket}.${FORMAT}"
    mkdir -p "${output_path}"
    echo "${output_file}"
    aws s3api "$s3_command" --bucket "$s3_bucket" > "${output_file}"
  done
  # TODO: these requests all require an --id option (obtained by parsing a list-bucket-* response)
  # TODO: try to get bucket intelligent tiering information
  #aws s3api list-bucket-intelligent-tiering-configurations --bucket "$s3_bucket" > "/tmp/${s3_bucket}/list-bucket-intelligent-tiering-configurations.${FORMAT}"
  #for id in $(jq -r '.IntelligentTieringConfigurationList[].Id' "/tmp/${s3_bucket}/list-bucket-intelligent-tiering-configurations.${FORMAT}"); do
  #  aws s3api get-bucket-intelligent-tiering-configuration --bucket "$s3_bucket" --id "$id" > "/tmp/${s3_bucket}/get-bucket-intelligent-tiering-configuration-${id}.${FORMAT}"
  #done
  # TODO: get-bucket-analytics-configuration
  # TODO: get-bucket-inventory-configuration
  # TODO: get-bucket-metrics-configuration
done
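The same list-then-get pattern covers the analytics, inventory, and metrics TODOs; a sketch for the analytics case, which would run inside the bucket loop above (the temporary file path is illustrative):
# list analytics configurations, then fetch each configuration by --id
aws s3api list-bucket-analytics-configurations --bucket "$s3_bucket" > /tmp/analytics-list.json
for id in $(jq -r '.AnalyticsConfigurationList[].Id' /tmp/analytics-list.json); do
  aws s3api get-bucket-analytics-configuration --bucket "$s3_bucket" --id "$id"
done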
The following is a summary of S3 bucket tagging with the awscli.
To update bucket tags:
- first GET the existing bucket tags as a JSON document (see the tools above to get bucket metadata)
- edit the JSON document for a bucket to update the TagSet data
- then PUT the new tags from the JSON document (see the script below to update bucket tagging)
#!/usr/bin/env bash
SCRIPT_PATH=$(dirname "$0")
SCRIPT_PATH=$(readlink -f "$SCRIPT_PATH")
export FORMAT=json
export AWS_DEFAULT_OUTPUT=$FORMAT
# an aws-profile must be active to access buckets
# shellcheck disable=SC2034
s3_buckets=(
  org-us-east-1-project-app-dev
  org-us-east-1-project-app-prod
)
# shellcheck disable=SC1090
source ~/bin/aws_profile.sh
aws-profile org-main-e1
for s3_bucket in "${s3_buckets[@]}"; do
  input_path="${SCRIPT_PATH}/s3-bucket-tagging"
  input_file="${input_path}/${s3_bucket}.${FORMAT}"
  ls "$input_file"  # fails visibly if the tagging file is missing
  aws s3api put-bucket-tagging --bucket "$s3_bucket" --tagging file://"${input_file}"
done
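After the PUT, verify the result with get-bucket-tagging, e.g.
aws s3api get-bucket-tagging --bucket org-us-east-1-project-app-dev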
An example tagging document (e.g. s3-bucket-tagging/org-us-east-1-project-app-dev.json):
{
  "TagSet": [
    {
      "Key": "Name",
      "Value": "org-us-east-1-project-app-env"
    },
    {
      "Key": "Owner",
      "Value": "owner-iam-name"
    },
    {
      "Key": "Project",
      "Value": "project-name"
    },
    {
      "Key": "Department",
      "Value": "department-name"
    },
    {
      "Key": "Tier",
      "Value": "service"
    },
    {
      "Key": "ServiceType",
      "Value": "service-type"
    },
    {
      "Key": "ServiceName",
      "Value": "service-name"
    },
    {
      "Key": "Environment",
      "Value": "env"
    }
  ]
}
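When editing, it can help to view the TagSet as a flat key/value object; a small jq one-liner (the flattened shape is a viewing convenience, not an s3api format):
jq '.TagSet | map({(.Key): .Value}) | add' s3-bucket-tagging/org-us-east-1-project-app-dev.json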
See this blog post (https://lucasbru.medium.com/comparison-of-json-files-9b8d2fc320ca) for an explanation of using jq and diff.
For example:
# after downloading the existing bucket tags
jq -S -f walk.filter /tmp/s3-bucket-tagging/org-us-east-1-project-app-dev.json > 1.json
# a previously saved local file may exist if the bucket was tagged before
jq -S -f walk.filter s3-bucket-tagging/org-us-east-1-project-app-dev.json > 2.json
# compare the normalized tag documents
meld 1.json 2.json
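If meld is not available, plain diff works on the normalized files:
diff -u 1.json 2.json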
The walk.filter is this file:
# This is related to comparing JSON files using jq, from
# https://lucasbru.medium.com/comparison-of-json-files-9b8d2fc320ca
# Apply f to composite entities recursively, and to atoms
def walk(f):
  . as $in
  | if type == "object" then
      reduce keys[] as $key
        ( {}; . + { ($key): ($in[$key] | walk(f)) } ) | f
    elif type == "array" then map( walk(f) ) | f
    else f
    end;
walk(if type == "array" then sort else . end)
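Note that jq 1.5 and later ship walk as a builtin, so on a recent jq the same normalization runs without the def, e.g.
jq -S 'walk(if type == "array" then sort else . end)' s3-bucket-tagging/org-us-east-1-project-app-dev.json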
For example, to extract the tag keys from each file using jq:
for f in s3-bucket-tagging/org-*.json; do echo "$f"; jq -S '.TagSet[].Key' "$f" > "${f/.json/.keys}"; done
To compare files for the list of keys they contain, try these Linux text processing tools:
# list sequential pairs of the *.keys files (the sed hold-space trick pairs each file with the next one) and diff each pair:
ls -1 s3-bucket-tagging/*.keys | sed '1{h;$p;d;};H;x;s/\n/ /;' > tmp.txt
while read -r line; do echo "$line"; diff $line; done < tmp.txt  # $line is intentionally unquoted: it splits into the two filenames
# To compare files in random order, shuffle the list with shuf, e.g.
ls -1 s3-bucket-tagging/*.keys | shuf | sed '1{h;$p;d;};H;x;s/\n/ /;' > tmp.txt
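An easier-to-read alternative is to diff every keys file against one chosen reference (the reference path here is illustrative):
#!/usr/bin/env bash
# sketch: compare each bucket's tag keys against a reference bucket's keys
ref=s3-bucket-tagging/org-us-east-1-project-app-dev.keys
for f in s3-bucket-tagging/*.keys; do
  echo "$f"
  diff "$ref" "$f"
done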
To convert the JSON tagging documents to YAML, use the yq command line tool:
for f in s3-bucket-tagging/org-*.json; do yq -y --indentless '.' "$f" > "${f/json/yaml}"; done
To convert the YAML files back to JSON, use yq again:
for f in s3-bucket-tagging/org-*.yaml; do yq '.' "$f" > "${f/yaml/json}"; done
To generate Python data models from the AWS CloudFormation resource schemas:
#!/usr/bin/env bash
# install the code generator with HTTP support
pip install 'datamodel-code-generator[http]'
# download and unpack the CloudFormation resource schemas for us-east-1
mkdir cfn_schemas && cd cfn_schemas || exit 1
wget https://schema.cloudformation.us-east-1.amazonaws.com/CloudformationSchema.zip
unzip CloudformationSchema.zip
# generate a Python model module for the AWS::S3::Bucket resource schema
datamodel-codegen --input aws-s3-bucket.json --input-file-type jsonschema --output aws_s3_bucket.py
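A sketch to generate models for every schema in the bundle, converting file names to valid module names (whether every schema converts cleanly is untested):
for schema in aws-*.json; do
  module="${schema%.json}"
  datamodel-codegen --input "$schema" --input-file-type jsonschema --output "${module//-/_}.py"
done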