Created
June 20, 2015 17:38
Lifelines categorical variables.
from patsy import dmatrix
from lifelines import CoxPHFitter
import pandas as pd
# Load the prostate data set (the path is local to the author's machine).
df = pd.read_csv('/Users/camerondavidson-pilon/Downloads/prostate1.csv')
# Build the design matrix; patsy dummy-codes the categorical columns.
X = dmatrix('age + hg + sz + sg + rx + pf + status1 + dtime', df, return_type='dataframe')
print(X.head())
"""
Notice that patsy has removed the redundant dummy columns: `0.2 mg estrogen` and `in bed < 50% daytime`. R does this too.
Patsy has also introduced an Intercept column, though. We don't want this: the Cox model absorbs any intercept into the baseline hazard.
"""
del X['Intercept']
# normalize=False matches the lifelines release this was written against; if your version rejects the argument, drop it.
cp = CoxPHFitter(normalize=False)
cp.fit(X, 'dtime', event_col='status1')
cp.print_summary()  # values close to R.
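To make the "redundant variables" remark concrete, here is a minimal pure-Python sketch of treatment (dummy) coding, the scheme that leaves out one reference level per categorical variable. It is an illustration only, not patsy's implementation. The helper `treatment_code` is hypothetical, and the sample values are toy data: only the level `0.2 mg estrogen` is taken from the script above. The sketch assumes the reference level is the first one in sorted order and labels columns in the `name[T.level]` style.

```python
def treatment_code(values, name):
    """Dummy-code a categorical column, omitting the reference level.

    Hypothetical helper for illustration; it assumes the reference level
    is the first level in sorted order.
    """
    levels = sorted(set(values))
    reference, kept = levels[0], levels[1:]
    # One 0/1 indicator column per non-reference level.
    columns = ["%s[T.%s]" % (name, level) for level in kept]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return reference, columns, rows


# Toy treatment column (made-up rows, not the prostate data set).
rx = ["placebo", "0.2 mg estrogen", "5.0 mg estrogen", "placebo", "1.0 mg estrogen"]
reference, columns, rows = treatment_code(rx, "rx")

print("reference level:", reference)
print(columns)
for row in rows:
    print(row)
```

A row of all zeros stands for the reference level, so a separate indicator for it would carry no extra information. That is why `0.2 mg estrogen` does not show up as its own column.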