Last active September 10, 2024 06:53
Use boto to upload a directory into S3
import boto
import boto.s3
import boto.s3.connection
import boto.s3.key
import os.path
import sys

# Fill these in - you get them when you sign up for S3
AWS_ACCESS_KEY_ID = ''
AWS_ACCESS_KEY_SECRET = ''

# Fill in info on data to upload
# destination bucket name
bucket_name = 'jwu-testbucket'
# source directory
sourceDir = 'testdata/'
# destination directory name (on s3)
destDir = ''

# max size in bytes before uploading in parts. between 1 and 5 GB recommended
MAX_SIZE = 20 * 1000 * 1000
# size of parts when uploading in parts
PART_SIZE = 6 * 1000 * 1000

conn = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_ACCESS_KEY_SECRET)
bucket = conn.create_bucket(bucket_name,
        location=boto.s3.connection.Location.DEFAULT)

# collect the files directly inside sourceDir (sub-directories are not walked)
uploadFileNames = []
for (root, dirnames, filenames) in os.walk(sourceDir):
    uploadFileNames.extend(filenames)
    break

def percent_cb(complete, total):
    # progress callback: print a dot each time boto reports progress
    sys.stdout.write('.')
    sys.stdout.flush()

for filename in uploadFileNames:
    sourcepath = os.path.join(sourceDir, filename)
    destpath = os.path.join(destDir, filename)
    print('Uploading %s to Amazon S3 bucket %s' % (sourcepath, bucket_name))
    filesize = os.path.getsize(sourcepath)
    if filesize > MAX_SIZE:
        print("multipart upload")
        mp = bucket.initiate_multipart_upload(destpath)
        fp = open(sourcepath, 'rb')
        fp_num = 0
        while (fp.tell() < filesize):
            fp_num += 1
            print("uploading part %i" % fp_num)
            mp.upload_part_from_file(fp, fp_num, cb=percent_cb, num_cb=10, size=PART_SIZE)
        # S3 only assembles the object once the multipart upload is completed
        mp.complete_upload()
        fp.close()
    else:
        print("singlepart upload")
        k = boto.s3.key.Key(bucket)
        k.key = destpath
        k.set_contents_from_filename(sourcepath,
                cb=percent_cb, num_cb=10)
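
The multipart branch of the script reads the file in PART_SIZE chunks until the file pointer reaches the end of the file. As a rough sketch of that arithmetic, the part numbers and byte ranges can be worked out up front. The helper name `plan_parts` is hypothetical and not part of boto; it only illustrates how many parts a given file size produces.

```python
def plan_parts(filesize, part_size):
    """Return (part_number, offset, length) tuples that cover filesize bytes."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    parts = []
    offset = 0
    part_number = 0
    while offset < filesize:
        part_number += 1
        # the last part may be shorter than part_size
        length = min(part_size, filesize - offset)
        parts.append((part_number, offset, length))
        offset += length
    return parts


if __name__ == "__main__":
    # a file one byte over MAX_SIZE, split with the script's PART_SIZE
    for part in plan_parts(20 * 1000 * 1000 + 1, 6 * 1000 * 1000):
        print(part)
```

Note that S3 rejects multipart parts smaller than 5 MiB (except the last one), so a PART_SIZE of 6 * 1000 * 1000 bytes sits just above that limit.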

freewayz commented Aug 11, 2016

Using this technique can you upload files within a sub directory inside a directory?

This is great, just what I'm after. But how can I extract the AWS_ACCESS_KEY_ID and AWS_ACCESS_KEY_SECRET in my Python script running within Bitbucket Pipelines? Other scripts I have seen make no reference to these two properties, but connect like this:

import boto3
client = boto3.client('s3')

Although the methods are different on boto3 so it's a bit tricky...advice?

cdsimpkins commented Oct 28, 2016

@mark-norgate The example from Amazon for Bitbucket pipelines says to set environment variables which are automatically picked up by the boto3 library. linky
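
As a minimal sketch of the environment-variable approach: boto3 looks up `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` on its own, so no keys need to appear in the script. The helper names `credentials_from_env` and `make_client` below are hypothetical; the explicit check is only there to fail early with a readable message when the pipeline variables are missing.

```python
import os


def credentials_from_env(environ=None):
    """Return (key_id, secret) from the standard AWS variables, or raise if unset."""
    environ = os.environ if environ is None else environ
    key_id = environ.get("AWS_ACCESS_KEY_ID")
    secret = environ.get("AWS_SECRET_ACCESS_KEY")
    if not key_id or not secret:
        raise RuntimeError(
            "AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY must be set")
    return key_id, secret


def make_client():
    """Create an S3 client; boto3 reads the same variables by itself."""
    import boto3  # imported lazily so the check above works without boto3
    credentials_from_env()
    return boto3.client("s3")
```

In Bitbucket Pipelines the two variables would be defined in the repository settings, assuming the setup described in the linked example.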

Also, thanks for the script!

haimari commented Oct 31, 2016

You also need to add '/'

sourcepath = os.path.join(sourceDir + '/' + filename)
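
A small illustration of why the separator matters: concatenating the directory and file name before calling `os.path.join` only works when the directory already ends in a slash. Passing the two pieces as separate arguments is an alternative that handles both cases. The file name `a.txt` is made up for the example.

```python
import os.path

filename = 'a.txt'

# concatenation: breaks when the directory has no trailing slash
broken = os.path.join('testdata' + filename)

# separate arguments: os.path.join inserts the separator when needed
fixed = os.path.join('testdata', filename)
with_slash = os.path.join('testdata/', filename)

print(broken)
print(fixed)
print(with_slash)
```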

Won't do it recursively for sub-directories
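
One possible way to cover sub-directories, sketched under the assumption that the S3 key should mirror the path relative to the source directory: walk the whole tree with `os.walk` and build a (local path, key) pair for every file. The function name `collect_uploads` is hypothetical and not part of the gist or of boto.

```python
import os


def collect_uploads(source_dir, dest_prefix=''):
    """Return sorted (local_path, s3_key) pairs for every file under source_dir."""
    uploads = []
    for root, dirnames, filenames in os.walk(source_dir):
        for filename in filenames:
            local_path = os.path.join(root, filename)
            relative = os.path.relpath(local_path, source_dir)
            # S3 keys use forward slashes regardless of the local separator
            pieces = [dest_prefix.strip('/')] + relative.split(os.sep)
            key = '/'.join(piece for piece in pieces if piece)
            uploads.append((local_path, key))
    return sorted(uploads)
```

Each pair could then go through the same single-part or multipart branch as in the script, with the key taking the place of `destpath`.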

While syncing a directory to AWS with this code, only one file is uploaded, even though the directory contains 3 files. Please help me solve this problem.

jckail commented May 27, 2017

Thanks Man!

This code is amazing! Thank you @SavvyGuard !

ghost commented Jun 8, 2021

bucket = conn.create_bucket(bucket_name, location='')

S3ResponseError: S3ResponseError: 403 Forbidden

InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records.
NOCC4UFL6U659XNJHGFME3M762MSK2KPQJMElWA0cWuCCEAOw9ObIyTn8GGe1ErsEdJeTw8aHfPX5T09QSDYT3jElLqAsGv/LPcIJhH+5ncuBdU=

what's wrong?
