@meddulla
Last active May 2, 2018 23:26
Impute missing data with statsmodels.imputation.mice
import pandas as pd
import numpy as np
import statsmodels.api as sm
from statsmodels.imputation import mice
# also see https://pypi.org/project/fancyimpute/
data = pd.read_csv("Howell1.csv")
data.head()
"""
    height     weight   age  male
0  151.765  47.825606  63.0   NaN
1  139.700  36.485807   NaN   0.0
2  136.525  31.864838  65.0   0.0
3  156.845  53.041915  41.0   1.0
4  145.415  41.276872  51.0   0.0
"""
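Before imputing, it can help to see how much is actually missing. The sketch below rebuilds only the five rows printed above (so it runs without Howell1.csv) and counts the gaps per column; the names `sample`, `missing_per_column`, and `incomplete_rows` are illustrative and not part of the original gist.

```python
import numpy as np
import pandas as pd

# The five rows printed above, rebuilt by hand so the snippet runs without Howell1.csv.
sample = pd.DataFrame({
    "height": [151.765, 139.700, 136.525, 156.845, 145.415],
    "weight": [47.825606, 36.485807, 31.864838, 53.041915, 41.276872],
    "age": [63.0, np.nan, 65.0, 41.0, 51.0],
    "male": [np.nan, 0.0, 0.0, 1.0, 0.0],
})

# Number of missing entries in each column.
missing_per_column = sample.isna().sum()
print(missing_per_column)

# Index labels of rows that contain at least one missing value.
incomplete_rows = sample[sample.isna().any(axis=1)]
print(incomplete_rows.index.tolist())  # → [0, 1]
```

The same two calls work on the full `data` frame loaded from the CSV.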
imp = mice.MICEData(data)
# One imputation pass over every column that has missing values;
# use imp.update(colname) to impute a single column instead.
imp.update_all()
imp.data.head()
"""
    height     weight   age  male
0  151.765  47.825606  63.0   1.0
1  139.700  36.485807  73.3   0.0
2  136.525  31.864838  65.0   0.0
3  156.845  53.041915  41.0   1.0
4  145.415  41.276872  51.0   0.0
"""
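A quick way to check an imputation run is to confirm that no gaps remain and that the observed values were left alone. The sketch below does this on a hypothetical synthetic table (made-up numbers that only reuse the column names above, not the real Howell1 data) and runs several passes through the `n_iter` argument of `update_all`. The names `synthetic`, `original`, `imp_demo`, and `completed` are illustrative.

```python
import numpy as np
import pandas as pd
from statsmodels.imputation import mice

rng = np.random.default_rng(0)
n = 200

# Hypothetical stand-in for Howell1.csv: invented values, same column names.
age = rng.uniform(18, 80, size=n)
male = rng.integers(0, 2, size=n).astype(float)
height = 150 + 10 * male + rng.normal(0, 5, size=n)
weight = 0.5 * height - 30 + rng.normal(0, 3, size=n)
synthetic = pd.DataFrame(
    {"height": height, "weight": weight, "age": age, "male": male}
)

# Blank out roughly 10% of three columns; height stays complete,
# so no row ends up entirely missing.
for col in ["weight", "age", "male"]:
    synthetic.loc[rng.random(n) < 0.1, col] = np.nan

# Keep an untouched copy and a mask of the observed cells for the checks below.
original = synthetic.copy()
observed = original.notna().to_numpy()

imp_demo = mice.MICEData(synthetic)
imp_demo.update_all(n_iter=5)  # five passes over every column with missing values

completed = imp_demo.data

# Total number of missing cells left after imputation.
print(completed.isna().sum().sum())

# Observed cells should be identical before and after imputation.
print(np.allclose(completed.to_numpy()[observed], original.to_numpy()[observed]))
```

Running more than one pass is a common choice because each column's imputation model depends on the current fill-ins of the other columns; the value 5 here is arbitrary.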