@tcvieira
Forked from GeorgeSeif/missing.py
Created January 21, 2019 14:09
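import pandas as pd

# Assumes `data` is an existing pandas DataFrame with a numeric "height" column.
# Fill NaN values of one feature column with a fixed guess
avg_height = 67  # hand-picked placeholder; only sensible if it matches the column's units and range
data["height"] = data["height"].fillna(avg_height)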
# Alternative: fill NaN values with a value computed from the column itself
median_height = data["height"].median()  # median() skips NaN by default and is robust to outliers
data["height"] = data["height"].fillna(median_height)
# Alternative: drop rows with missing values
# Keep only the rows where "height" is not null
data = data.dropna(subset=["height"])
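
The three strategies above are alternatives, so it can help to see them side by side on the same input. The following is a minimal sketch on a made-up DataFrame: the `name` column, the sample heights, and the result variable names are assumptions for illustration and are not part of the original snippet.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: five rows, two of them missing "height".
data = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "height": [60.0, np.nan, 70.0, 74.0, np.nan],
})

# Strategy 1: fill missing values with a fixed guess.
filled_constant = data["height"].fillna(67)

# Strategy 2: fill missing values with the median of the observed values.
# Series.median() ignores NaN by default (skipna=True).
median_height = data["height"].median()
filled_median = data["height"].fillna(median_height)

# Strategy 3: drop the rows where "height" is missing.
dropped = data.dropna(subset=["height"])

print(filled_constant.tolist())
print(filled_median.tolist())
print(len(dropped), "rows kept out of", len(data))
```

Each strategy here writes to a new variable instead of overwriting `data["height"]`, so the original column keeps its NaN values and the results can be compared directly.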