Adding parameters to a PySpark job in AWS Glue
import sys

from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.sql.types import *
from awsglue.dynamicframe import DynamicFrame

## @params: [JOB_NAME, CUSTOM1, CUSTOM2, CUSTOM3]
# getResolvedOptions parses sys.argv and returns a dict keyed by parameter
# name (without the leading dashes).
args = getResolvedOptions(sys.argv, ['JOB_NAME', 'CUSTOM1', 'CUSTOM2', 'CUSTOM3'])

# use args as usual:
print("Custom 1 is {}".format(args['CUSTOM1']))
NOTE that for this to work, the key-value pairs you trigger your job with must be of the form:

--CUSTOM1 value1
--CUSTOM2 value2

Note the double-dash option prefix: getResolvedOptions requires it when resolving the arguments, but strips it from the keys of the dict it returns (hence args['CUSTOM1'] above, with no dashes). This is not obvious from the docs.
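The same rule applies when starting the job programmatically: the keys of the Arguments map must carry the leading double dash. Here is a minimal sketch using boto3's start_job_run; the job name and values are placeholders for illustration.

import boto3

glue = boto3.client("glue")

# Keys must include the leading "--" so getResolvedOptions
# can find them on the job side.
response = glue.start_job_run(
    JobName="my-glue-job",  # placeholder job name
    Arguments={
        "--CUSTOM1": "value1",
        "--CUSTOM2": "value2",
        "--CUSTOM3": "value3",
    },
)
print(response["JobRunId"])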