@UranusSeven
Created April 13, 2023 08:40
Scale out groupby rolling using Xorbits

Given a dataframe,

            stock_id   returns
date
2015-01-03  bac_i.us  0.000000
2015-01-03    glt.us  0.000000
2015-01-03    lpx.us  0.000000
2015-01-03    rbc.us  0.000000
2015-01-03   clnt.us  0.000000
...              ...       ...
2015-12-31    atu.us -0.014771
2015-12-31    jhy.us -0.008246
2015-12-31    xco.us  0.148148
2015-12-31    mik.us  0.000452
2015-12-31    apf.us  0.009147
[277359 rows x 2 columns]

In pandas, to calculate the 30-row rolling mean of each stock's returns,

df.to_pandas().groupby('stock_id').rolling(window=30).mean()
                      returns
stock_id date
aaap.us  2015-11-12       NaN
         2015-11-13       NaN
         2015-11-14       NaN
         2015-11-15       NaN
         2015-11-16       NaN
...                       ...
zixi.us  2015-12-27 -0.002902
         2015-12-28 -0.002965
         2015-12-29 -0.002838
         2015-12-30 -0.002584
         2015-12-31 -0.003644

[277359 rows x 1 columns]
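
To make the NaN rows above concrete, here is a dependency-free sketch of what a per-group rolling mean computes. It is not how pandas implements it; the function name grouped_rolling_mean and the toy rows are made up for illustration. It mirrors the pandas default that an integer window needs a full window of observations before it emits a value, which is why the first 29 rows of each stock are NaN with window=30.

```python
import math
from collections import defaultdict, deque


def grouped_rolling_mean(rows, window):
    """Rolling mean per group (hypothetical helper, for illustration only).

    rows: iterable of (group_key, value) pairs, already in time order.
    Returns (group_key, mean) pairs in input order; the mean is NaN until
    the group has seen `window` values.
    """
    # One fixed-size buffer per group; old values fall off automatically.
    buffers = defaultdict(lambda: deque(maxlen=window))
    out = []
    for key, value in rows:
        buf = buffers[key]
        buf.append(value)
        if len(buf) < window:
            out.append((key, math.nan))  # window not full yet
        else:
            out.append((key, sum(buf) / window))
    return out


# Toy data, not taken from the gist: two interleaved groups.
rows = [("a", 1.0), ("b", 10.0), ("a", 2.0), ("b", 20.0), ("a", 3.0), ("a", 4.0)]
result = grouped_rolling_mean(rows, window=2)
for key, mean in result:
    print(key, mean)
```

Unlike pandas, this sketch keeps the input row order instead of sorting the output by group key. The point is only that each group keeps its own window and never mixes values with another group.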

In Xorbits, groupby.rolling is currently under development, so as a workaround you can apply the rolling window to each group yourself,

df.groupby('stock_id', group_keys=True).apply(lambda x: x.rolling(window=30).mean())

The result is exactly the same,

                      returns
stock_id date
aaap.us  2015-11-12       NaN
         2015-11-13       NaN
         2015-11-14       NaN
         2015-11-15       NaN
         2015-11-16       NaN
...                       ...
zixi.us  2015-12-27 -0.002902
         2015-12-28 -0.002965
         2015-12-29 -0.002838
         2015-12-30 -0.002584
         2015-12-31 -0.003644

[277359 rows x 1 columns]
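
If you want to check the equivalence yourself on a small input, the following sketch compares the two approaches in plain pandas. It assumes a recent pandas (2.x); the toy tickers, dates, and the window of 3 are invented for the example. It also selects the returns column before rolling, which is my choice rather than what the gist does, so that the lambda never sees the non-numeric stock_id column (how that column is handled varies between pandas versions).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two tickers, five days each (not from the gist).
dates = pd.date_range("2015-01-01", periods=5, freq="D")
df = pd.DataFrame(
    {
        "stock_id": ["aaa.us"] * 5 + ["bbb.us"] * 5,
        "returns": [0.01, 0.02, -0.01, 0.03, 0.00,
                    0.05, -0.02, 0.01, 0.04, -0.03],
    },
    index=dates.append(dates).rename("date"),
)

window = 3  # small window so the toy data yields some non-NaN values

# Built-in pandas groupby-rolling on the numeric column.
builtin = df.groupby("stock_id")["returns"].rolling(window=window).mean()

# Apply-based workaround: roll inside each group separately.
workaround = df.groupby("stock_id", group_keys=True)["returns"].apply(
    lambda s: s.rolling(window=window).mean()
)

# Compare the values, treating NaN == NaN as equal.
same_values = np.allclose(
    builtin.to_numpy(), workaround.to_numpy(), equal_nan=True
)
print("values match:", same_values)
print(workaround)
```

To try the same pattern on Xorbits, the import would be xorbits.pandas instead of pandas; this sketch was written against pandas only, so treat the Xorbits behaviour as something to verify on your own cluster.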