@vikramsoni2
Created June 26, 2020 07:57
Describe a pandas DataFrame
from typing import Iterable

import pandas as pd

PERCENTILES = [.005, .01, .025, .05, .10, .20, .25, .50, .75, .80, .90, .95, .975, .99]

def super_describe(_d: pd.DataFrame, percentiles: Iterable[float] = PERCENTILES, missing_only: bool = False) -> pd.DataFrame:
    """
    Include counts of missing and of unique values.
    """
    # Accept a Series by promoting it to a one-column DataFrame.
    if isinstance(_d, pd.Series):
        _d = _d.to_frame()
    # One row per column: missing count, unique count, then the describe() statistics.
    _dd = pd.concat((_d.isnull().sum().rename('missing'), _d.nunique(axis=0).rename('unique'), _d.describe(percentiles=percentiles).T), axis=1, sort=False)
    return _dd.loc[_dd['missing'] > 0] if missing_only else _dd
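
To make the shape of the result easier to see, here is a minimal sketch that builds the same three pieces (missing counts, unique counts, transposed `describe()` output) by hand and concatenates them. The toy DataFrame, its column names `a` and `b`, and the shortened percentile list are assumptions made up for illustration; they are not part of the gist.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column "a" has one missing value, column "b" has none.
df = pd.DataFrame({"a": [1.0, 2.0, np.nan, 4.0], "b": [10, 20, 20, 40]})

missing = df.isnull().sum().rename("missing")        # NaN count per column
unique = df.nunique(axis=0).rename("unique")         # distinct non-null values per column
stats = df.describe(percentiles=[.25, .50, .75]).T   # transposed: one row per column

# Same layout as the function's result: one row per input column.
summary = pd.concat((missing, unique, stats), axis=1, sort=False)
print(summary)

# Equivalent of missing_only=True: keep only columns with at least one NaN.
only_missing = summary.loc[summary["missing"] > 0]
print(only_missing.index.tolist())  # → ['a']
```

Note that `describe()` only summarizes numeric columns by default, so in a mixed DataFrame the non-numeric columns would still get `missing` and `unique` counts but NaN for the statistics.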