Last active
January 13, 2021 14:22
import numpy as np
import pandas as pd

gitcsv = 'https://raw.githubusercontent.com/justinhchae/medium/main/bools.csv'
df = pd.read_csv(gitcsv)

# columns that are supposed to be bool
cols = ['flag1', 'flag2', 'flag3']

# use np.where to find and match, then replace
# this says: where the dataframe is null, replace with pd.NA,
# else, where equal to 1, replace with True, else keep the original value
df[cols] = np.where(df[cols].isnull(), pd.NA,
                    np.where(df[cols] == 1., True, df[cols]))

# lastly, cast to the nullable 'boolean' dtype instead of 'bool'
# 'bool' cannot hold missing values; 'boolean' (BooleanArray) keeps them as <NA>
df[cols] = df[cols].astype('boolean')

print(df.head())
print(df['flag1'].unique())

""" boolean with nullables
  category  flag1  flag2  flag3
0        d  False   <NA>   <NA>
1        d   True   <NA>   <NA>
2        c  False   <NA>   <NA>
3        b  False   <NA>   <NA>
4        b  False   <NA>   <NA>
<BooleanArray>
[False, True]
Length: 2, dtype: boolean
"""
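
To see why the final cast uses 'boolean' rather than 'bool', here is a minimal sketch that compares the two dtypes. It uses a small in-memory frame as a hypothetical stand-in for bools.csv (the values are made up for illustration, not taken from the file), and it assumes pandas 1.0 or later, where the nullable 'boolean' dtype is available.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for bools.csv: float flags with missing values.
df = pd.DataFrame({
    "category": ["d", "d", "c", "b"],
    "flag1": [0.0, 1.0, 0.0, np.nan],
    "flag2": [np.nan, np.nan, 1.0, np.nan],
})
cols = ["flag1", "flag2"]

# Plain 'bool' has no slot for missing values, so NaN is coerced to True.
as_bool = df[cols].astype("bool")

# Nullable 'boolean' keeps missing values as pd.NA.
as_boolean = df[cols].astype("boolean")

print(as_bool)
print(as_boolean)
```

With plain 'bool', missing entries silently become True, which is usually not what a flag column should mean; the 'boolean' dtype keeps them distinguishable as <NA>.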