@ottomata
Created February 20, 2018 16:53
spark-submit \
  --driver-java-options='-Djsonrefine.log.level=DEBUG' \
  --class org.wikimedia.analytics.refinery.job.refine.JsonRefine \
  ./refinery-job/target/refinery-job-0.0.58-SNAPSHOT.jar \
  --input-base-path /wmf/data/raw/eventlogging \
  --database otto \
  --output-base-path /user/otto/external/event05 \
  --input-regex '.*eventlogging_(.+)/hourly/(\d+)/(\d+)/(\d+)/(\d+)' \
  --input-capture 'table,year,month,day,hour' \
  --table-blacklist '^Edit|ChangesListHighlights$' \
  --ignore-failure-flag \
  --since 2018-02-13T00:00:00 \
  --limit 1 \
  --transform-functions 'org.wikimedia.analytics.refinery.job.refine.deduplicate_eventlogging,org.wikimedia.analytics.refinery.job.refine.geocode_ip'
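
The --input-regex and --input-capture options appear to work as a pair: each capture group in the regex is matched, in order, to a name in the comma-separated capture list. The bash sketch below only illustrates that pairing and is not how JsonRefine itself is implemented. The function name capture_partitions and the sample path are hypothetical, and \d is replaced with [0-9] because bash's regex matching does not support \d.

```shell
#!/usr/bin/env bash

# Hypothetical helper: pair each regex capture group with a name,
# in the same order as the --input-capture list in the command above.
capture_partitions() {
  local path=$1
  # Same pattern as --input-regex, with \d rewritten as [0-9] for bash.
  local regex='.*eventlogging_(.+)/hourly/([0-9]+)/([0-9]+)/([0-9]+)/([0-9]+)'
  local names=(table year month day hour)
  local i

  # Return non-zero when the path does not match the pattern.
  [[ $path =~ $regex ]] || return 1

  # BASH_REMATCH[0] is the whole match; groups start at index 1.
  for i in "${!names[@]}"; do
    printf '%s=%s\n' "${names[i]}" "${BASH_REMATCH[i+1]}"
  done
}

# Assumed example of an hourly input directory (not taken from the gist).
capture_partitions '/wmf/data/raw/eventlogging/eventlogging_NavigationTiming/hourly/2018/02/13/00'
```

For the sample path this prints one name=value line per capture group, starting with table=NavigationTiming. A path that does not fit the pattern makes the function return a non-zero status and print nothing.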