@Abhayparashar31
Created October 25, 2020 22:28
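import pandas as pd

def impute_nan(df, variable):
    """Add a `<variable>_random` column with NaNs filled by random sample imputation."""
    col = variable + "_random"
    df[col] = df[variable]
    missing = df[variable].isnull()
    # Draw one observed (non-NaN) value per missing entry; random_state=0 makes it reproducible.
    # Note: sample() raises ValueError if the NaNs outnumber the observed values.
    random_sample = df[variable].dropna().sample(missing.sum(), random_state=0)
    # pandas aligns on the index when assigning, so give the sample the index of the NaN rows.
    random_sample.index = df[missing].index
    # Fill only the rows where the original column is NaN.
    df.loc[missing, col] = random_sample
    # Return the DataFrame with the imputed column attached.
    return df

# `df` is assumed to be an already-loaded DataFrame with an "Age" column.
df = impute_nan(df, "Age")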
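
A variation on the same idea, sketched under a few assumptions: the helper name `random_sample_fill`, its `seed` parameter, and the toy `ages` data are hypothetical and not part of the gist. It works on a single Series, returns a copy instead of modifying the input, and samples with replacement so the draw still succeeds when missing values outnumber observed ones.

```python
import numpy as np
import pandas as pd


def random_sample_fill(series: pd.Series, seed: int = 0) -> pd.Series:
    """Return a copy of `series` with NaNs replaced by randomly drawn observed values."""
    missing = series.isnull()
    n_missing = int(missing.sum())
    if n_missing == 0:
        return series.copy()
    observed = series.dropna()
    if observed.empty:
        raise ValueError("cannot impute: the column has no observed values")
    # replace=True lets the draw succeed even when NaNs outnumber observed values.
    drawn = observed.sample(n_missing, replace=True, random_state=seed)
    filled = series.copy()
    # Assign by position (.to_numpy()) so index alignment cannot reintroduce NaNs.
    filled.loc[missing] = drawn.to_numpy()
    return filled


# Toy data, made up for illustration only: four NaNs, two observed values.
ages = pd.Series([22.0, np.nan, np.nan, 35.0, np.nan, np.nan], name="Age")
filled = random_sample_fill(ages)
print(filled.isnull().sum())  # 0
```

Sampling with replacement means the same observed value can be drawn more than once, which is unavoidable when there are fewer observed values than gaps. The input Series is left unchanged, so the result can be assigned to a new column or back over the original as needed.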