Created
July 8, 2021 06:23
Get the row(s) which have the max value in groups using groupby
# References:
# [1] https://stackoverflow.com/questions/15705630/get-the-rows-which-have-the-max-value-in-groups-using-groupby

# Get data
import pandas as pd

df = pd.DataFrame({'category': ['banana', 'eggs', 'eggs', 'full cream milk', 'full cream milk', 'full cream milk'],
                   'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'],
                   'Count': [5, 22, 1, 5, 1, 38]},
                  index=[0, 1, 2, 3, 4, 5])

# Build a boolean mask that is True where `Count` equals the maximum of its category
idx = df.groupby(['category'])['Count'].transform('max') == df['Count']

# Apply the mask to show the matching rows (rows tied for the maximum are all kept)
df.loc[idx, :]

# Join the per-category maximum back onto every row as a new column
df['count_max'] = df.groupby('category')['Count'].transform('max')
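
The mask above keeps every row tied for its category's maximum. When exactly one row per category is wanted, one possible alternative (not part of the original snippet) is `idxmax`, which returns the index label of the first maximum in each group. A minimal sketch, assuming the same data as above; the names `row_labels` and `top_rows` are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    'category': ['banana', 'eggs', 'eggs', 'full cream milk', 'full cream milk', 'full cream milk'],
    'unit_quantity': ['1EA', '100G', '100ML', '100G', '100ML', '1L'],
    'Count': [5, 22, 1, 5, 1, 38],
})

# For each category, get the index label of the first row holding the maximum `Count`
row_labels = df.groupby('category')['Count'].idxmax()

# Select those rows from the original frame: one row per category
top_rows = df.loc[row_labels]
```

Because `idxmax` reports only the first occurrence, tied rows are dropped silently; the boolean mask is the safer choice when ties matter.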