@jorgerance
Last active February 4, 2020 01:05
[pandas - group small values] #python #pandas #data #datamanipulation #dataframe #jupyter #notebook
def group_small_values(data, idx, column, percentage):
    """
    Groups small dataframe values into a single group
    - data = source dataframe
    - idx = column with qualitative values
    - column = column to evaluate
    - percentage = threshold: rows whose value is below this percentage of sum(column) are grouped together
    """
    data = data.copy()  # avoid adding a 'group' column to the caller's dataframe
    total_column = data[column].sum()
    other_percentage = float(total_column) / 100 * float(percentage)
    # str() so that a numeric percentage does not raise TypeError on concatenation
    small_label = '< ' + str(percentage) + '%'
    # rows at or above the threshold keep their idx value, the rest share one label
    data['group'] = data[idx].where(data[column] >= other_percentage, small_label)
    return data.groupby('group', as_index=False).agg({column: "sum"}).sort_values([column])
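
A minimal usage sketch, assuming a made-up dataframe of fruit sales. The `sales` frame, its column names, and its numbers are hypothetical and not part of the original gist. The function is restated in compact form so the block runs on its own, with the small-value label written as `'< 5%'`, which is an assumption about the intended label.

```python
import pandas as pd


def group_small_values(data, idx, column, percentage):
    """Sum `column` per `idx` value, merging rows below `percentage` % of the total."""
    data = data.copy()  # leave the caller's dataframe untouched
    threshold = data[column].sum() / 100.0 * float(percentage)
    label = '< ' + str(percentage) + '%'
    # Large rows keep their idx value, small rows share one label
    data['group'] = data[idx].where(data[column] >= threshold, label)
    return data.groupby('group', as_index=False).agg({column: 'sum'}).sort_values([column])


# Hypothetical data: units sum to 100, so a 5 % threshold equals 5 units
sales = pd.DataFrame({
    'fruit': ['apple', 'banana', 'cherry', 'date', 'elderberry'],
    'units': [50, 30, 15, 3, 2],
})

result = group_small_values(sales, 'fruit', 'units', 5)
print(result)
```

Here `date` and `elderberry` fall below the threshold and are summed into a single row, which is handy before drawing a pie chart where many tiny slices would otherwise clutter the plot.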