Created January 15, 2017 11:55
Read Avro file from Pandas
import fastavro
import pandas


def avro_df(filepath, mode="rb"):
    # Open file stream. The second argument to open() is the file mode,
    # not a text encoding; Avro container files must be read as binary.
    with open(filepath, mode) as fp:
        # Configure Avro reader
        reader = fastavro.reader(fp)
        # Load records in memory
        records = [r for r in reader]
        # Populate pandas.DataFrame with records
        df = pandas.DataFrame.from_records(records)
    # Return created DataFrame
    return df
It looks like you can save a line of code and avoid temporarily duplicating the data in memory by passing the `reader` iterable directly to `from_records` rather than loading it into a list first.