Created
June 3, 2016 21:49
An example of accessing Synapse from AWS Lambda
"""
=======================================================
How to access Synapse from Amazon Lambda with Python
=======================================================

Here we show an example of adding a row to a Synapse table
through an AWS Lambda script.

Caveat: Any operation that requires chunked file upload
fails on AWS Lambda. The execution environment for
Lambda scripts seems not to allow access to the OS
resources required by multiprocessing.dummy, which
the Synapse Python client uses to parallelize
chunked upload.

see: https://forums.aws.amazon.com/thread.jspa?threadID=232868
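
One way to check whether a given environment hits this limitation is
to try building a thread pool directly. The sketch below is only an
illustration: thread_pool_available is a hypothetical helper, not part
of the Synapse client, and it assumes the failure shows up as an
OSError or ImportError when the pool is created.

```python
def thread_pool_available(workers=2):
    # Hypothetical probe: True if a multiprocessing.dummy pool can be
    # created and used in this environment, False otherwise.
    try:
        from multiprocessing.dummy import Pool
        pool = Pool(workers)
    except (OSError, ImportError):
        # Assumed failure mode in restricted environments
        return False
    try:
        return pool.map(abs, [-1, 2]) == [1, 2]
    finally:
        pool.close()
        pool.join()


print(thread_pool_available())
```

If the probe reports False, operations that rely on chunked upload are
likely to fail for the reason described in the caveat.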

-------
Notes
-------

* In the AWS console, create a Lambda function whose "Handler"
  has the form spam_synapse.lambda_handler, where spam_synapse.py
  is the filename and lambda_handler(event, context) is a
  function in that file.
* Create a Scheduled Event to trigger the function
* Create a key with AWS KMS, in the same region as the Lambda function
* Use that key to encrypt a Synapse API key
* Set the Synapse cache to live inside the /tmp directory
* Package the script with its dependencies and upload the zip file
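
To make the "Handler" naming convention from the first note concrete,
the sketch below splits a handler string into a module name and a
function name and looks the function up with importlib. It only
illustrates the convention: resolve_handler is a hypothetical helper
and is not a description of how Lambda itself loads code.

```python
import importlib


def resolve_handler(handler):
    # Split "module.function" at the last dot, import the module,
    # and return the named function from it.
    module_name, function_name = handler.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, function_name)


# Demonstrated with a standard-library function, because spam_synapse
# is only importable from inside the deployment package.
dumps = resolve_handler("json.dumps")
print(dumps({"ok": True}))
```

With the deployment package on the path, the same call with
"spam_synapse.lambda_handler" would return the handler defined in
this file.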

---------------------------------------
Packaging the script and dependencies
---------------------------------------

Note that we have to package dependencies (except for
boto3, which the Lambda runtime provides) along with our
app. For complete instructions, see the AWS Lambda docs
for the topic "Creating a Deployment Package (Python)", here:

http://docs.aws.amazon.com/lambda/latest/dg/lambda-python-how-to-create-deployment-package.html

I created an empty directory called spam_synapse and
installed my script and its dependencies there. Lambda
wants a zip file with the contents of the directory,
not the directory itself:

    $ pip2 install synapseclient -t /path/to/spam_synapse
    $ pip2 install setuptools -t /path/to/spam_synapse
    $ pip2 install backports.csv -t /path/to/spam_synapse
    $ pip2 install future -t /path/to/spam_synapse
    $ cd /path/to/spam_synapse
    $ zip -r ../spam_synapse.zip *
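
The same "contents, not the directory" layout can be produced from
Python with the standard zipfile module. This is a minimal sketch, not
part of the original workflow: zip_directory_contents is a hypothetical
helper, and it assumes the zip file is written outside the directory
being archived so that the archive does not include itself.

```python
import os
import zipfile


def zip_directory_contents(source_dir, zip_path):
    # Store every file under source_dir with a path relative to
    # source_dir, so the archive has no enclosing top-level directory.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(source_dir):
            for name in files:
                full_path = os.path.join(root, name)
                archive.write(full_path, os.path.relpath(full_path, source_dir))
    return zip_path


# Example call (paths are placeholders):
# zip_directory_contents("/path/to/spam_synapse", "/path/to/spam_synapse.zip")
```

Listing the archive should then show spam_synapse.py at the top level
rather than under a spam_synapse/ prefix.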
""" | |

import base64
import boto3
from datetime import datetime, timedelta
import synapseclient
from synapseclient import Table, RowSet, Row
import synapseclient.utils as utils

## The file system your Lambda script has access to is
## read-only except for the /tmp directory, so we'll
## have to put the Synapse cache there.
synapseclient.cache.CACHE_ROOT_DIR = "/tmp/synapseCache"

## Synapse ID for the table we'll be writing to
TABLEID = "syn6120245"

## The recommended way to include a secret in a Lambda script
## is to encrypt that secret with the AWS CLI, like so:
##   aws kms encrypt --key-id some_key_id \
##       --plaintext "This is the secret you want to encrypt" \
##       --query CiphertextBlob --output text
## So, we've included a Synapse API key encrypted with an AWS KMS key:
ENCRYPTED_APIKEY = "CiABEQY/uH5qFAKESECRETYACANTBETOOPARANOIDov/KH+emMCM3"

print('Loading spam synapse function')


def format_datetime(dt):
    """
    Format dates in the way that Synapse tables prefer
    """
    fmt = "{time.year:04}-{time.month:02}-{time.day:02} {time.hour:02}:{time.minute:02}:{time.second:02}.{millisecond:03}"
    ## Roll over to the next whole second when the milliseconds
    ## would otherwise round up to 1000
    if dt.microsecond >= 999500:
        dt -= timedelta(microseconds=dt.microsecond)
        dt += timedelta(seconds=1)
    return fmt.format(time=dt, millisecond=int(round(dt.microsecond/1000.0)))


def lambda_handler(event, context):
    """
    This is the function that Lambda calls.
    """
    print(event)

    ## decrypt Synapse API key via AWS Key Management Service;
    ## the ciphertext is base64-encoded, so decode it first
    kms = boto3.client('kms')
    decryption = kms.decrypt(CiphertextBlob=base64.b64decode(ENCRYPTED_APIKEY))
    apikey = decryption['Plaintext']

    ## log in with the decrypted API key
    syn = synapseclient.Synapse()
    syn.login('your_synapse_user_name_here', apiKey=apikey)

    ## Add a row to a table using the RowSet method. Using RowSet
    ## is important because it sends the row data to Synapse
    ## encoded as JSON via the REST API and doesn't use chunked
    ## upload, which doesn't work on Lambda.
    schema = syn.get(TABLEID)
    cols = syn.getColumns(schema)
    new_rows = RowSet(columns=cols, schema=schema,
                      rows=[Row([datetime.now().strftime("%Y-%m-%d"),
                                 format_datetime(datetime.now()),
                                 "AWS lambda",
                                 "fubar"])])

    return syn.store(new_rows)