@SergeyMell
Last active January 18, 2017 19:35
Inspired by https://gist.github.com/bantic/4080793. A Ruby class that copies files from one AWS S3 bucket to another, with support for aws-sdk v2.
require 'aws-sdk'
require 'logger'

class S3SyncService
  attr_reader :from_bucket, :to_bucket, :logger
  attr_accessor :debug

  # from_credentials and to_credentials are both hashes with these keys
  # (strings or symbols):
  # * :access_key_id
  # * :secret_access_key
  # * :bucket
  # * :region
  def initialize(from_credentials, to_credentials)
    # Hash#transform_keys (Ruby 2.5+) stands in for ActiveSupport's
    # symbolize_keys, so the class also works outside Rails.
    @from_bucket = init_bucket(from_credentials.transform_keys(&:to_sym))
    @to_bucket = init_bucket(to_credentials.transform_keys(&:to_sym))
    @object_counts = { sync: 0, skip: 0 }
  end
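
  # Copies every source object that is missing from the target bucket or
  # whose ETag differs. Yields (total, progress) after each object when a
  # block is given.
  def perform(output = STDOUT)
    create_logger(output)
    @logger.info 'Synchronization started.'
    # Counting lists the whole source bucket once before the copy pass.
    total = @from_bucket.objects.count
    progress = 0
    @from_bucket.objects.each do |object|
      if object_requires_sync?(object)
        sync object
      else
        skip object
      end
      progress += 1
      yield total, progress if block_given?
    end
    @logger.info 'Synchronization finished.'
    @logger.info "Synced #{@object_counts[:sync]}, skipped #{@object_counts[:skip]}."
  end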

  private

  def init_bucket(creds)
    credentials = Aws::Credentials.new(creds[:access_key_id], creds[:secret_access_key])
    s3 = Aws::S3::Resource.new(region: creds[:region], credentials: credentials)
    bucket = s3.bucket(creds[:bucket])
    # Aws::S3::Bucket has no create_bucket method; create the bucket through
    # the resource instead.
    bucket = s3.create_bucket(bucket: creds[:bucket]) unless bucket.exists?
    bucket
  end

  def sync(object)
    @logger.debug "Syncing #{stats object}"
    object.copy_to(bucket: @to_bucket.name, key: object.key)
    @object_counts[:sync] += 1
  end

  def skip(object)
    @logger.debug "Skipped #{stats object}"
    @object_counts[:skip] += 1
  end
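
  # An object needs syncing when the target copy is missing or its ETag
  # differs from the source's.
  def object_requires_sync?(object)
    to_object = @to_bucket.object(object.key)
    !to_object.exists? || to_object.etag != object.etag
  rescue StandardError => e
    # If the check itself fails, log the error and sync anyway.
    @logger.debug(e.message || e.inspect)
    true
  end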

  def create_logger(output)
    @logger = Logger.new(output).tap do |l|
      l.level = debug ? Logger::DEBUG : Logger::INFO
    end
  end

  # One-line summary of an object: key, size in KB, last-modified time.
  def stats(object)
    content_length_in_kb = object.content_length / 1024
    "#{object.key} #{content_length_in_kb}k " \
      "#{object.last_modified.strftime('%b %d %Y %H:%M')}"
  end
end
=begin
Example usage:
from_creds = { access_key_id: "AAA", secret_access_key: "BBB", bucket: "first-bucket", region: "us-east-1" }
to_creds = { access_key_id: "CCC", secret_access_key: "DDD", bucket: "second-bucket", region: "us-east-1" }
syncer = S3SyncService.new(from_creds, to_creds)
syncer.debug = true # log each object
syncer.perform do |total_number_of_objects, progress|
  # process synchronization progress
end
=end
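
The sync decision in `object_requires_sync?` can be tried out without AWS access. The following is a minimal sketch, assuming a hypothetical `FakeObject` struct that stands in for the S3 objects (it is not part of aws-sdk) and a standalone `requires_sync?` helper that mirrors the check made in the class:

```ruby
# Hypothetical stand-in for an S3 object; not part of aws-sdk.
FakeObject = Struct.new(:key, :etag, :present) do
  def exists?
    present
  end
end

# Mirrors the class's check: copy when the target object is missing
# or when its ETag differs from the source object's ETag.
def requires_sync?(source, target)
  !target.exists? || target.etag != source.etag
end

source    = FakeObject.new('a.txt', '"abc"', true)
identical = FakeObject.new('a.txt', '"abc"', true)
changed   = FakeObject.new('a.txt', '"def"', true)
missing   = FakeObject.new('a.txt', nil, false)

puts requires_sync?(source, identical) # false: same ETag, skipped
puts requires_sync?(source, changed)   # true: ETag differs, copied
puts requires_sync?(source, missing)   # true: target absent, copied
```

One caveat with this approach: ETags of objects uploaded via multipart upload are not plain MD5 digests of the content, so two byte-identical objects may carry different ETags and be copied again on each run.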