@AayushSameerShah
Created June 21, 2021 12:13
When you have multiple columns to group by and are tempted to pass all the column names to groupby at once, don't. Do this instead.
# Define a function that receives one group (a sub-DataFrame) at a time
def get_nlargest(group):
    # Group again inside that group by the second key and sum the values
    totals = group.groupby('key2').value_col.sum()
    # Keep only the 5 largest sums for this outer group
    return totals.nlargest(5)

# X Instead of doing this ↓
df.groupby(['key1', 'key2']).apply(lambda group: group...)

# Do this ↓
df.groupby('key1').apply(get_nlargest)

'''Just pass one key, then group again inside the function'''
@AayushSameerShah
Got this from the pandas book: 5. Putting together, 4th book