Assume you have a DataFrame as below:
import pandas as pd
import numpy as np
np.random.seed(42)
N = 10
df = pd.DataFrame(
    {
        "val": np.random.random(size=N),
        "ts": np.random.choice(['2017-07-01', '2017-07-02', '2017-07-03'], size=N)
    }
)
df['ts'] = pd.to_datetime(df.ts)
which looks like so:
          ts       val
0 2017-07-02  0.374540
1 2017-07-01  0.950714
2 2017-07-02  0.731994
3 2017-07-02  0.598658
4 2017-07-02  0.156019
5 2017-07-02  0.155995
6 2017-07-01  0.058084
7 2017-07-01  0.866176
8 2017-07-02  0.601115
9 2017-07-02  0.708073
We would like to count how many events occurred on each day:
df.resample('86400s', on='ts').agg('size')
# Or, equivalently:
# df.resample('1d', on='ts').agg('size')
This yields the following series:
ts
2017-07-01    3
2017-07-02    7
Freq: 86400S, dtype: int64
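
If the end goal is an average number of events per day, one possible next step is to take the mean of the daily counts. The sketch below is only an illustration: it uses a small hand-written `events` frame (hypothetical data, not the random frame above) so the numbers are easy to check, and it assumes that days with no events inside the observed range should count as zero.

```python
import pandas as pd

# Hypothetical, hand-written timestamps (not the random data above),
# chosen so the counts are easy to verify by eye.
events = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            ["2017-07-01", "2017-07-01", "2017-07-02", "2017-07-04"]
        ),
        "val": [0.1, 0.2, 0.3, 0.4],
    }
)

# Count rows per calendar day. resample creates one bin per day between
# the first and last timestamp, so 2017-07-03 appears with a count of 0.
daily_counts = events.resample("D", on="ts").size()

# Average number of events per day over that span (empty days included).
mean_per_day = daily_counts.mean()

print(daily_counts)
print(mean_per_day)
```

Note that `resample` only spans the range between the earliest and latest timestamp in the data, so days before the first event or after the last one are not represented in the counts or in the mean.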