@capooti
Last active August 29, 2015 14:01
Find the most frequent value of each column, per group, in a CSV file with pandas
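from pandas import read_csv

# Pass the path directly so pandas opens and closes the file itself.
df = read_csv('zimbabwe.csv')
# Columns to summarise, named exactly as in the CSV header.
columns = ('apr_09','jul_09','sep_09','jan_10','apr_10','jul_10','dec_10','apr_11','jul_11','sep_11','dec_11','mar_12','jul_12','Sep-12','Dec-12','feb_mar13','Jul-13','aug_sep13')
group_field = 'DISTRICT'
# One row per (DISTRICTPC, DISTRICT) pair; each column's result is joined onto it.
joined = df[['DISTRICTPC', group_field]].drop_duplicates()
for c in columns:
    sdf = df[[group_field, c]]
    # value_counts() sorts by descending count, so index[0] is the most frequent value.
    # Note: this raises IndexError for a group whose values in column c are all missing.
    result = sdf.groupby([group_field]).agg(lambda x: x.value_counts().index[0])
    joined = joined.join(result, on=group_field)
print(joined)
joined.to_csv('result.csv')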
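
The same idea can be sketched without the per-column loop and without the CSV file, which makes it easier to try out. This is only an illustration under stated assumptions: the helper names `most_frequent` and `most_frequent_per_group` and the sample data are hypothetical, not part of the original script, and returning NaN for a group with no non-null values is a choice made here, not something the script above does.

```python
import pandas as pd


def most_frequent(series):
    """Return the most frequent non-null value, or NaN if the group has none."""
    counts = series.value_counts()  # sorted by descending count, NaN excluded
    return counts.index[0] if len(counts) else float('nan')


def most_frequent_per_group(df, group_field, columns):
    """Return one row per group and one column per entry in `columns`."""
    return df.groupby(group_field)[list(columns)].agg(most_frequent)


# Hypothetical sample data standing in for zimbabwe.csv.
sample = pd.DataFrame({
    'DISTRICT': ['A', 'A', 'A', 'B', 'B'],
    'apr_09': [1, 1, 2, 3, 3],
    'jul_09': [2, 2, 2, None, None],  # district B has no values in this column
})

result = most_frequent_per_group(sample, 'DISTRICT', ['apr_09', 'jul_09'])
print(result)
```

When two values tie for the highest count within a group, which one comes first depends on how `value_counts()` orders them, so the result for tied groups should not be relied on.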