@chucknado
Last active November 23, 2020 07:41
Sample script for "Write large data sets in Excel with Python and pandas" at https://support.zendesk.com/hc/en-us/articles/212227138
import dateutil.parser
import pandas as pd

# Load the serialized topic data; it is indexed by 'posts' and 'users'
topic = pd.read_pickle('my_serialized_data')

# Build one DataFrame per record type, keeping only the columns needed
posts_df = pd.DataFrame(topic['posts'], columns=['id', 'title', 'created_at', 'author_id'])
users_df = pd.DataFrame(topic['users'], columns=['id', 'name']).drop_duplicates(subset=['id'])

# Convert the timestamp strings to plain dates
posts_df['created_at'] = posts_df['created_at'].apply(lambda x: dateutil.parser.parse(x).date())

# Attach the author name to each post, then tidy up the id columns
merged_df = pd.merge(posts_df, users_df, how='left', left_on='author_id', right_on='id')
merged_df.rename(columns={'id_x': 'post_id'}, inplace=True)
merged_df.drop(['id_y', 'author_id'], axis=1, inplace=True)

# Write the result to an Excel file without the DataFrame index
merged_df.to_excel('topic_posts.xlsx', index=False)
print('Spreadsheet saved.')
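
The script above needs the pickled file 'my_serialized_data', which is not included here. To try the merge step without that file, the following is a minimal sketch that runs the same transformations on a small in-memory sample. The sample records, their values, and the helper name build_posts_table are hypothetical and only assume the shape the script implies: a dict with 'posts' and 'users' lists of records.

```python
import dateutil.parser
import pandas as pd

# Hypothetical sample data standing in for the unpickled 'topic' object.
topic = {
    'posts': [
        {'id': 1, 'title': 'First post', 'created_at': '2015-08-01T10:15:00Z', 'author_id': 10},
        {'id': 2, 'title': 'Second post', 'created_at': '2015-08-02T11:30:00Z', 'author_id': 11},
        {'id': 3, 'title': 'Third post', 'created_at': '2015-08-03T09:00:00Z', 'author_id': 10},
    ],
    'users': [
        {'id': 10, 'name': 'Alice'},
        {'id': 11, 'name': 'Bob'},
        {'id': 10, 'name': 'Alice'},  # duplicate user record, dropped below
    ],
}


def build_posts_table(topic):
    """Return one row per post with the author's name attached (illustrative helper)."""
    posts_df = pd.DataFrame(topic['posts'], columns=['id', 'title', 'created_at', 'author_id'])
    users_df = pd.DataFrame(topic['users'], columns=['id', 'name']).drop_duplicates(subset=['id'])

    # Convert the timestamp strings to plain dates
    posts_df['created_at'] = posts_df['created_at'].apply(lambda x: dateutil.parser.parse(x).date())

    # Both frames have an 'id' column, so the merge suffixes them as id_x (posts) and id_y (users)
    merged_df = pd.merge(posts_df, users_df, how='left', left_on='author_id', right_on='id')
    return merged_df.rename(columns={'id_x': 'post_id'}).drop(columns=['id_y', 'author_id'])


merged_df = build_posts_table(topic)
print(merged_df)
```

The resulting DataFrame can be passed to to_excel exactly as in the script. The left join keeps every post even when no matching user record exists, in which case the name column is empty for that row.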

nimicent commented Jul 8, 2017

Love this, thank you for that tutorial!
