@philsch
Last active February 3, 2019 18:50
Blog post: How to update row keys in Google Bigtable (main function)
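import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# AvroTransformOptions, CellTransformDoFn and BIG_TABLE_SCHEMA are defined
# in the other snippets of the blog post and must be in scope here.
def run():
    # Parse the command-line flags; AvroTransformOptions exposes input/output.
    pipeline_options = PipelineOptions()
    pipeline = beam.Pipeline(options=pipeline_options)
    options = pipeline_options.view_as(AvroTransformOptions)
    steps = (
        pipeline
        | 'ReadData' >> beam.io.ReadFromAvro(options.input, use_fastavro=True)
        | 'Transform rowkey' >> beam.ParDo(CellTransformDoFn())
        | 'WriteData' >> beam.io.WriteToAvro(options.output, BIG_TABLE_SCHEMA, use_fastavro=True))
    # Start the pipeline and block until it has finished.
    result = pipeline.run()
    result.wait_until_finish()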
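The main function relies on CellTransformDoFn, which is not shown in this snippet. As a rough sketch only, the per-record logic such a DoFn might wrap could look like the code below. It assumes each Avro record arrives as a dict with a bytes `key` field and a `cells` list. The helper name `transform_row_key`, the `#` separator, and the rule of reversing the key parts are all hypothetical; they stand in for whatever key change is actually needed.

```python
def transform_row_key(record, separator=b'#'):
    """Return a copy of an Avro record dict with a rewritten row key.

    Hypothetical rule, for illustration only: reverse the order of the
    key parts, so b'user#2019' becomes b'2019#user'. Cells are untouched.
    """
    parts = record['key'].split(separator)
    new_record = dict(record)  # shallow copy so the input record is not mutated
    new_record['key'] = separator.join(reversed(parts))
    return new_record


# Assumed record layout: a bytes row key plus a list of cells.
example = {'key': b'user#2019', 'cells': []}
print(transform_row_key(example)['key'])  # prints b'2019#user'
```

Inside a Beam DoFn, the `process` method would then yield `transform_row_key(element)` for each record it receives.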