@akj009
Last active January 12, 2020 11:07
Command to submit a Beam job with the Spark runner on YARN
spark2-submit --class com.mptyminds.dataflow.Main \
--master yarn --deploy-mode client \
--driver-memory 2g --executor-memory 1g --executor-cores 1 \
--conf spark.yarn.appMasterEnv.GOOGLE_APPLICATION_CREDENTIALS=<credential_file_path> \
--conf spark.yarn.executorEnv.GOOGLE_APPLICATION_CREDENTIALS=<credential_file_path> \
--conf spark.executorEnv.GOOGLE_APPLICATION_CREDENTIALS=<credential_file_path> \
/path/to/final-shaded.jar \
--hdfsConfiguration=[{\"fs.default.name\":\"hdfs://host:port\"}] \
--sparkMaster=yarn --streaming=false \
--project=<gcp-project> \
--bigqueryDataset=my_dataset --bigqueryTableName=<my_table_name> \
--tempLocation=gs://temp_bucket --outputFilePath=hdfs://host:port/user/test/ \
--sqlFilePath=<path-to-sql> --runner=SparkRunner
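
The JSON value passed to --hdfsConfiguration is the part of this command most easily broken by shell quoting. A minimal sketch of one way to build it, assuming a hypothetical helper named hdfs_conf_json and a placeholder NameNode host and port (neither is part of the original command):

```shell
# Hypothetical helper (not part of the original command): prints the JSON
# value for --hdfsConfiguration given a NameNode host ($1) and port ($2).
hdfs_conf_json() {
  printf '[{"fs.default.name":"hdfs://%s:%s"}]' "$1" "$2"
}

# Placeholder host and port; substitute the values for your cluster.
HDFS_CONF="$(hdfs_conf_json "namenode.example.com" 8020)"

# Show the argument as it would be appended to the submit command.
echo "--hdfsConfiguration=${HDFS_CONF}"
```

Passing the value as "--hdfsConfiguration=${HDFS_CONF}" (double-quoted) keeps the brackets, braces, and inner double quotes intact without backslash escapes.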