ryan-williams/pandas.md

Last active December 3, 2019 00:56

Pandas snippet for a 2-D histogram of a dataframe: one column's values become the row index, another column's values become the columns, and each cell holds the count of rows with that {row, column} pair.

Col1 is the field that will be the "rows" index
Col2 is the column whose values will become the new columns
Col3 is any other column, assumed to be always filled: .count() only counts cells where Col3 has a non-null value

df \
.groupby(['Col1', 'Col2']) \
[['Col3']] \
.count() \
.reset_index() \
.pivot(index='Col1', columns='Col2', values='Col3')
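
A minimal runnable sketch of the chain above, on made-up sample data (the values in `df` are hypothetical; only the column names come from the description). It also fills pairs that never occur with 0, and compares the result against `pd.crosstab`, which is one alternative way to get the same counts when Col3 has no missing values.

```python
import pandas as pd

# Hypothetical sample data; Col1/Col2/Col3 follow the naming used above.
df = pd.DataFrame({
    "Col1": ["a", "a", "a", "b", "b"],
    "Col2": ["x", "x", "y", "y", "y"],
    "Col3": [1, 2, 3, 4, 5],
})

hist = (
    df
    .groupby(["Col1", "Col2"])[["Col3"]]
    .count()          # non-null Col3 cells per (Col1, Col2) pair
    .reset_index()
    .pivot(index="Col1", columns="Col2", values="Col3")
)

# Pairs that never occur (here: b/x) come out of the pivot as NaN;
# fill them with 0 so the table holds integer counts.
hist = hist.fillna(0).astype(int)

# Alternative: crosstab counts rows per (Col1, Col2) pair directly.
xtab = pd.crosstab(df["Col1"], df["Col2"])

print(hist)
```

Note that the two approaches differ when Col3 contains nulls: `.count()` skips those rows, while `pd.crosstab` counts every row regardless of Col3.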