Last active
September 12, 2019 10:19
Pandas cross-join
from functools import reduce

import pandas as pd


def crossjoin(*dfs, **kwargs):
    """Calculate a cartesian product of given dataframes.

    Successively join each dataframe using a temporary constant key and then remove it.
    Also set a MultiIndex - cartesian product of the indices of the input dataframes.

    See: https://github.com/pydata/pandas/issues/5401

    Args:
        *dfs (pandas.DataFrame): dataframes to be merged
        **kwargs: merge arguments that will be passed to pd.merge()

    Returns:
        pandas.DataFrame: cartesian product of given dataframes
    """
    # Every row gets the same key value, so merging on it pairs each left row
    # with each right row. The rows come out in left-major order, which is the
    # order MultiIndex.from_product generates.
    return (
        reduce(
            lambda df1, df2: pd.merge(df1.assign(_tmpkey=1), df2.assign(_tmpkey=1), on='_tmpkey', **kwargs),
            dfs,
        )
        .drop(columns='_tmpkey')
        .set_index(pd.MultiIndex.from_product([df.index for df in dfs]))
    )
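On pandas 1.2 and later, pd.merge() accepts how="cross", so the temporary key is not strictly needed. The following is a minimal sketch of the same idea under that assumption; the name crossjoin_native and the sample frames colors and sizes are hypothetical and not part of the original gist.

```python
from functools import reduce

import pandas as pd


def crossjoin_native(*dfs, **kwargs):
    """Cartesian product of dataframes using merge(how="cross").

    Hypothetical variant of crossjoin(); assumes pandas >= 1.2.
    """
    product = reduce(
        lambda left, right: pd.merge(left, right, how="cross", **kwargs),
        dfs,
    )
    # A cross merge emits rows in left-major order, the same order that
    # MultiIndex.from_product uses, so the index lines up with the rows.
    return product.set_index(pd.MultiIndex.from_product([df.index for df in dfs]))


# Hypothetical sample data: 2 rows x 3 rows gives 6 combinations.
colors = pd.DataFrame({"color": ["red", "blue"]}, index=["a", "b"])
sizes = pd.DataFrame({"size": [1, 2, 3]}, index=[10, 20, 30])

result = crossjoin_native(colors, sizes)
print(result)
```

Each row of the result is addressed by a tuple of the original index labels, for example result.loc[("b", 30)]. Extra keyword arguments such as suffixes are forwarded to pd.merge() unchanged, as in the original function.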