@aerostitch
Last active August 29, 2015 14:07
Get the AWS ETag/checksum of a file from an s3:// URI, with the required checks.
#!/usr/bin/env ruby
require 'aws/s3'
##
# This little gist returns the checksum (ETag) of an AWS S3 file.
# This is either the checksum of the file itself or, if the file was uploaded
# as a multipart upload, the checksum of the binary concatenation of the
# checksums of the individual parts, followed by a dash and the number of
# parts of the multipart object.
#
# Tested with aws-sdk-v1 gem.
#
# To calculate the same value for the local file (when you know the size of the
# chunks) you can use the following gist:
# https://gist.github.com/aerostitch/538eda0b2d1d8dd914bf
#
# == Parameters
# aws_access_key_id::
# The AWS access key id to use for accessing the file.
#
# aws_secret_access_key::
# The AWS secret access key to use for accessing the file.
#
# s3_uri::
# The s3://<bucket>/file/path -formatted uri of the file you want the checksum
# of.
#
# == Returns:
# The checksum of the file if the s3 uri is correct.
#
# == Examples:
# Let's say that your AWS key id is: ABCDEFGHIJKLMNOPQRS
# and that your secret AWS key is: abcdefghijklmnopqrstuvwxyz/1234567890za
# and that you want to print the checksum of the file located at:
# s3://my_bucket/documents/myfile.tar.gz
# Then you just need to do:
#   puts get_AWS_file_checksum(
#     'ABCDEFGHIJKLMNOPQRS',
#     'abcdefghijklmnopqrstuvwxyz/1234567890za',
#     's3://my_bucket/documents/myfile.tar.gz',
#   )
#
#
# Author:: Joseph Herlant ([email protected])
# Copyright:: Copyright (c) 2014 Joseph Herlant
# License:: Distributed under the terms of the Apache 2 license
#
def get_AWS_file_checksum(aws_access_key_id, aws_secret_access_key, s3_uri)
  s3aws = AWS::S3.new(
    :access_key_id => aws_access_key_id,
    :secret_access_key => aws_secret_access_key,
  )
  # \A and \z anchor the whole string (^ and $ would only anchor a single line)
  raise("Incorrect format for s3_uri parameter #{s3_uri}.") unless /\As3:\/\/.*\/.*\z/ =~ s3_uri
  # Extracting s3 bucket and file path from URI
  aws_split = s3_uri.split("/")
  aws_split.shift(2)             # drop "s3:" and the empty string after it
  aws_bucket = aws_split.shift   # first remaining element is the bucket name
  aws_key = aws_split.join('/')  # everything else is the object key
  bucket = s3aws.buckets[aws_bucket]
  raise("Bucket #{aws_bucket} does not exist!") unless bucket.exists?
  obj = bucket.objects[aws_key]
  raise("No file found at uri: s3://#{aws_bucket}/#{aws_key}") unless obj.exists?
  # Cleaning up the annoying quotes from the etag
  obj.etag.tr('"','')
end
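
To compare the value returned above with a local copy of the file, the header comment's description of the multipart checksum can be sketched as follows. This is a minimal illustration, not the author's linked gist: the helper names `local_s3_etag` and `multipart_etag?` are hypothetical, it assumes the ETag is MD5-based (which may not hold for some server-side-encrypted objects), and it assumes you know the part size used for the upload.

```ruby
require 'digest/md5'
require 'stringio'

# Hypothetical helper (not part of the original gist).
# io::        any object responding to #read (File, StringIO, ...)
# part_size:: size in bytes of the upload parts, or nil when the file was
#             uploaded in a single (non-multipart) request.
def local_s3_etag(io, part_size = nil)
  # Single upload: the ETag is assumed to be the plain MD5 of the content.
  return Digest::MD5.hexdigest(io.read) if part_size.nil?
  raise ArgumentError, 'part_size must be positive' unless part_size > 0

  # Multipart upload: collect the binary MD5 digest of each part...
  part_digests = []
  while (chunk = io.read(part_size))
    part_digests << Digest::MD5.digest(chunk)
  end
  # ...then hash the concatenated digests and append "-<number of parts>".
  "#{Digest::MD5.hexdigest(part_digests.join)}-#{part_digests.size}"
end

# Hypothetical helper: true when an ETag has the "<32 hex chars>-<count>"
# shape described above for multipart objects.
def multipart_etag?(etag)
  etag.match?(/\A\h{32}-\d+\z/)
end
```

For example, `File.open('myfile.tar.gz', 'rb') { |f| local_s3_etag(f, 8 * 1024 * 1024) }` would give the value to compare, assuming 8 MiB parts; the part size is whatever the uploading tool actually used, and a mismatch there yields a different result even when the content is identical.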