Last active
June 15, 2017 01:36
-
-
Save markdanese/d1ddfdb2f618373c719bdb444ca34be9 to your computer and use it in GitHub Desktop.
Simple approach to generating adjusted survival curves
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# load libraries -------------------------------------------------------------------- | |
library(survival) | |
library(data.table) | |
library(magrittr) | |
library(ggplot2) | |
options(stringsAsFactors = FALSE, scipen = 10) | |
# load data (from package) ---------------------------------------------------------- | |
cancer2 <- | |
copy(cancer) %>% | |
setDT() # dataset stored in the survival package for testing | |
cancer3 <- cancer2[complete.cases(cancer2)] # taking complete cases to avoid NA observations on fitted model | |
# run cox model --------------------------------------------------------------------- | |
x <- coxph(Surv(time, status) ~ age + factor(sex) + factor(ph.ecog), data = cancer3) # cox model | |
# function to generate mean survival by factor -------------------------------------- | |
adj_curves <- function(dt, obj, fac){ | |
if(!is.factor(dt[[fac]])) { | |
dt[[fac]] <- factor(dt[[fac]]) | |
} | |
groups <- levels(dt[[fac]]) | |
pred <- survfit(obj, dt) | |
timepoints <- c(0, pred$time) | |
timenames <- paste0("time", timepoints) | |
adj_surv <- as.data.frame(t(pred$surv)) | |
adj_surv <- cbind(time0 = 1, adj_surv) | |
names(adj_surv) <- timenames | |
dt <- cbind(dt, adj_surv) | |
dt2 <- dt[, lapply(.SD, mean), .SDcols = timenames, keyby = fac] | |
graph <- as.data.table(cbind(timepoints, t(dt2[, -1]))) | |
names(graph) <- c("Time", groups) | |
graph <- melt.data.table(data = graph, id.vars = "Time", variable.name = fac, value.name = "Survival") | |
return(graph) | |
} | |
# run function and plot data -------------------------------------------------------- | |
gr <- adj_curves(cancer3, x, "sex") | |
ggplot(gr, aes(x = Time, y = Survival, color = sex)) + | |
geom_step() + | |
scale_y_continuous(limits = c(0, 1)) + | |
theme_bw() |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi Mark,
First of great work on implementing this function. I am a little confused as to what type of adjusted survival curve is being generated by your code (and the survminer version which I believe is similar to yours). From my understanding, the above code predicts survival probabilities based on a cox model and then averages the probabilities for groups of interest. So we'd have two curves, say for sex, representing males and females and their respective survival probabilities. However, using this method would still create different covariate distributions for each sex and not truly "adjusted" for each covariate in the model. I think what Terry Therneau's code on this matter in his "Adjusted Survival Curves" vignette was trying to accomplish was different than what you present. I found a good article that outlined the model-based estimation of adjusted survival curves as:
the difference between the averaged predicted probabilities between experimental and control group is an estimate of the adjusted R D(t) for the experimental treatment at the specified times.
The key difference in the above method compared to your code is ensuring that every subject is the same for each covariate distribution except for the "exposure" of interest (i.e. sex in our case). I am not an expert on this issue, so I would appreciate your response and clarifications.
Thanks