@ksindi
Created August 20, 2017 18:42
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq

# Explicit Arrow schema, so column types do not depend on pandas inference.
fields = [
    pa.field('column1', pa.string()),
    pa.field('column2', pa.int64()),
    pa.field('column3', pa.string()),
]
schema = pa.schema(fields)

rows = [
    {'column1': 'val1', 'column2': 123, 'column3': ''},
    {'column1': 'val2', 'column2': 234, 'column3': ''},
    {'column1': 'val3', 'column2': 345, 'column3': ''},
]

# Open a writer bound to the schema, convert the rows, and write one table.
writer = pq.ParquetWriter('table.parquet', schema)
df = pd.DataFrame(rows)
pa_table = pa.Table.from_pandas(df, schema=schema, preserve_index=False)
writer.write_table(pa_table)
# close() finalizes the file by writing the Parquet footer.
writer.close()