@hermidalc
Created September 6, 2024 16:06
Testing imputation methods on GDC TCGA clinical data
library(missForest)
library(mice)
library(ggplot2)
library(ggmice)
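# Select the clinical variables to impute. gdc_case_meta is not defined in
# this snippet; it is assumed to be a data frame of GDC case metadata that
# was loaded beforehand.
input_df <- gdc_case_meta[
    c("project_id", "gender", "age_at_diagnosis", "tumor_stage")
]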
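# Encode the categorical variables as factors; tumor stage is ordinal.
# Any tumor_stage value not listed in levels becomes NA and is then imputed.
input_df$project_id <- factor(input_df$project_id)
input_df$gender <- factor(input_df$gender)
input_df$tumor_stage <- ordered(
    input_df$tumor_stage,
    levels = c("i", "ii", "iii", "iv")
)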
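# A minimal sketch (not part of the original gist) for checking how much is
# missing in each column before imputing. na_summary is a hypothetical helper
# name; it uses base R only.
na_summary <- function(df) {
    n_missing <- colSums(is.na(df))
    data.frame(
        column = names(df),
        n_missing = as.integer(n_missing),
        frac_missing = as.numeric(n_missing) / nrow(df),
        stringsAsFactors = FALSE
    )
}
# Usage, given the input_df built above: print(na_summary(input_df))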
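# Impute with missForest (random-forest based, handles mixed-type data).
# mf$ximp is the imputed data frame; mf$OOBerror holds the out-of-bag
# imputation error estimate.
mf <- missForest(input_df, maxiter = 100, verbose = TRUE)
imp_df <- mf$ximp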
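# Impute with mice, generating a single imputed data set (m = 1), then plot
# observed and imputed values of age at diagnosis by project.
me <- mice(input_df, m = 1)
ggmice(me, aes(age_at_diagnosis, project_id)) + geom_point()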
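# A hedged sketch (not part of the original gist) for comparing the two
# methods side by side: for one column, it collects the values each method
# filled in at the positions that were missing in the input.
# compare_imputed is a hypothetical helper name; it uses base R only.
compare_imputed <- function(original, imp_a, imp_b, column) {
    was_missing <- is.na(original[[column]])
    data.frame(
        row = which(was_missing),
        imp_a = imp_a[[column]][was_missing],
        imp_b = imp_b[[column]][was_missing]
    )
}
# Usage, assuming the objects created above (mice::complete() extracts the
# completed data set from the mice result):
# cmp <- compare_imputed(input_df, imp_df, complete(me, 1), "age_at_diagnosis")
# summary(cmp$imp_a - cmp$imp_b)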