Created
August 29, 2022 04:33
Set difference of rows of two data frames
def df_diff(d1, d2):
    """
    Create a DataFrame containing the rows of d1 that are not in d2.

    Arguments:
    d1 -- A pandas DataFrame
    d2 -- Another DataFrame which is a subset of d1

    Returns:
    A pandas DataFrame containing rows of d1 that are not in d2
    """
    # Left-merge on all shared columns; indicator=True adds a "_merge" column
    # recording whether each row of d1 found a match in d2.
    df_all = d1.merge(d2.drop_duplicates(), how="left", indicator=True)
    # Keep only the rows with no match in d2, then drop the helper column.
    return df_all[df_all._merge == "left_only"].drop(columns="_merge")
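A short usage sketch may help show what the function returns. The sample frames `d1` and `d2` below are hypothetical, invented purely for illustration, and the function body is repeated so the snippet runs on its own. It assumes both frames share the same column names, since `merge` with no `on` argument joins on all columns the two frames have in common.

```python
import pandas as pd


def df_diff(d1, d2):
    """Rows of d1 that do not appear in d2 (same logic as the gist)."""
    df_all = d1.merge(d2.drop_duplicates(), how="left", indicator=True)
    return df_all[df_all._merge == "left_only"].drop(columns="_merge")


# Hypothetical sample data: d2 holds a subset of the rows of d1.
d1 = pd.DataFrame({"id": [1, 2, 3, 4], "name": ["a", "b", "c", "d"]})
d2 = pd.DataFrame({"id": [2, 4], "name": ["b", "d"]})

# Only the rows of d1 with no exact match in d2 should remain.
result = df_diff(d1, d2)
print(result)
```

Note that `merge` builds a fresh index for its output, so the result does not carry over a custom index from `d1`; call `reset_index(drop=True)` on the result if a contiguous index is needed.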