@kaz-yos
Created July 19, 2018 13:34
Lightning talk at Partners R User Group Meeting on 2018-07-19
---
title: "tableone (Lightning talk at Partners R User Group Meeting)"
author: "Kazuki Yoshida"
date: "`r format(Sys.time(), '%Y-%m-%d')`"
output: html_document
---
```{r, message = FALSE, tidy = FALSE, echo = F}
## knitr configuration: http://yihui.name/knitr/options#chunk_options
library(knitr)
showMessage <- FALSE
showWarning <- TRUE
set_alias(w = "fig.width", h = "fig.height", res = "results")
opts_chunk$set(comment = "##", error= TRUE, warning = showWarning, message = showMessage,
tidy = FALSE, cache = F, echo = T,
fig.width = 7, fig.height = 7, dev.args = list(family = "sans"))
## for rgl
## knit_hooks$set(rgl = hook_rgl, webgl = hook_webgl)
## for animation
opts_knit$set(animation.fun = hook_ffmpeg_html)
## R configuration
options(width = 116, scipen = 5)
```
## What is this?
This is the material for a lightning talk at the [Partners R User Group](https://rc.partners.org/support-training/training/partners-r-user-group) meeting on 2018-07-19.
## References
- CRAN: https://cran.r-project.org/web/packages/tableone/index.html
- [Introduction](https://cran.r-project.org/web/packages/tableone/vignettes/introduction.html)
- [Using SMD](https://cran.r-project.org/web/packages/tableone/vignettes/smd.html)
## Introduction
tableone is an R package that assists in creating "Table 1", the table of patient baseline characteristics, in a format often seen in biomedical journals.
## Load packages
```{r}
library(tidyverse)
library(tableone)
```
## Load data
We load the pbc (primary biliary cirrhosis) dataset from Mayo Clinic.
```{r}
data(pbc, package = "survival")
## as_data_frame() is deprecated in tibble; as_tibble() is its replacement
pbc <- as_tibble(pbc)
pbc
```
## Overall tables
Calling CreateTableOne() with only the data argument summarizes every variable in the dataset.
```{r}
CreateTableOne(data = pbc)
```
Some variables are not appropriate as patient baseline characteristics, so let's specify variables via the vars argument. Here we remove patient ID and outcome variables (time and status).
```{r}
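## Print the variable names in a copy-and-paste friendly form
dput(names(pbc))
## Baseline variables to summarize (id, time, and status are left out)
vars <- c("trt", "age", "sex", "ascites", "hepato",
          "spiders", "edema", "bili", "chol", "albumin", "copper", "alk.phos",
          "ast", "trig", "platelet", "protime", "stage")
CreateTableOne(vars = vars, data = pbc)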
```
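As a small sketch of the same selection done by exclusion rather than by typing every name, the variable list can be built with `setdiff()`. The object names `pbc_raw` and `vars_alt` are introduced here only for illustration, and the data are read with `survival::pbc` so that the `pbc` object used in the rest of this document is left untouched.

```r
# Assumption: read the raw data directly from the survival package so this
# sketch does not modify the `pbc` object used elsewhere in the document.
pbc_raw <- survival::pbc

# Drop the patient ID and the outcome variables; keep everything else.
# setdiff() preserves the original column order.
vars_alt <- setdiff(names(pbc_raw), c("id", "time", "status"))
vars_alt
```

This gives the same set of names as the hand-written vector, and it keeps working if the excluded columns are the only ones you need to list.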
See ?pbc to better understand the dataset.
```
pbc package:survival R Documentation
Mayo Clinic Primary Biliary Cirrhosis Data
Description:
This data is from the Mayo Clinic trial in primary biliary
cirrhosis (PBC) of the liver conducted between 1974 and 1984. A
total of 424 PBC patients, referred to Mayo Clinic during that
ten-year interval, met eligibility criteria for the randomized
placebo controlled trial of the drug D-penicillamine. The first
312 cases in the data set participated in the randomized trial and
contain largely complete data. The additional 112 cases did not
participate in the clinical trial, but consented to have basic
measurements recorded and to be followed for survival. Six of
those cases were lost to follow-up shortly after diagnosis, so the
data here are on an additional 106 cases as well as the 312
randomized participants.
A nearly identical data set found in appendix D of Fleming and
Harrington; this version has fewer missing values.
Usage:
pbc
Format:
age: in years
albumin: serum albumin (g/dl)
alk.phos: alkaline phosphotase (U/liter)
ascites: presence of ascites
ast: aspartate aminotransferase, once called SGOT (U/ml)
bili: serum bilirunbin (mg/dl)
chol: serum cholesterol (mg/dl)
copper: urine copper (ug/day)
edema: 0 no edema, 0.5 untreated or successfully treated
1 edema despite diuretic therapy
hepato: presence of hepatomegaly or enlarged liver
id: case number
platelet: platelet count
protime: standardised blood clotting time
sex: m/f
spiders: blood vessel malformations in the skin
stage: histologic stage of disease (needs biopsy)
status: status at endpoint, 0/1/2 for censored, transplant, dead
time: number of days between registration and the earlier of death,
transplantion, or study analysis in July, 1986
trt: 1/2/NA for D-penicillmain, placebo, not randomised
trig: triglycerides (mg/dl)
Source:
T Therneau and P Grambsch (2000), _Modeling Survival Data:
Extending the Cox Model_, Springer-Verlag, New York. ISBN:
0-387-98784-3.
```
We can see that some variables are numerically coded categorical variables (ascites, edema, hepato, spiders, stage, trt). Here we convert these to factors for correct handling. For binary variables, make the second level the one you want to show the percentage for.
```{r}
pbc <- pbc %>%
    mutate(ascites = factor(ascites, levels = c(0,1), labels = c("Absent","Present")),
           edema = factor(edema, levels = c(0, 0.5, 1), labels = c("No edema","Untreated or successfully treated","Edema despite diuretic therapy")),
           hepato = factor(hepato, levels = c(0,1), labels = c("Absent","Present")),
           ## spiders is also coded 0/1 and would otherwise be summarized as a mean
           spiders = factor(spiders, levels = c(0,1), labels = c("Absent","Present")),
           stage = factor(stage),
           trt = factor(trt, levels = c(1,2), labels = c("D-penicillamine", "Placebo")))
```
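The point about level order can be illustrated with a toy vector. This is a minimal sketch with made-up data (`x` is not taken from pbc); it only shows how the order given to `levels`/`labels` decides which level comes second, and how `relevel()` changes that order.

```r
# Toy binary variable coded 0/1 (hypothetical data, not from pbc).
x <- c(0, 1, 1, 0, 1)

# "Present" is listed second, so it becomes the second level.
f <- factor(x, levels = c(0, 1), labels = c("Absent", "Present"))
levels(f)

# relevel() moves the chosen level to the front, making "Absent" the second level.
f_flipped <- relevel(f, ref = "Present")
levels(f_flipped)
```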
Now these variables are handled better.
```{r}
CreateTableOne(vars = vars, data = pbc)
```
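If you would rather not modify the data, CreateTableOne() also has a `factorVars` argument that treats the named numeric variables as categorical. The sketch below assumes the raw data read via `survival::pbc`; the names `pbc_raw`, `vars_small`, `cat_vars`, and `tab_fv` are illustrative only. Note that this route keeps the numeric codes (0, 1, ...) as level names, so converting to labelled factors as above usually gives a more readable table.

```r
library(tableone)

# Assumption: raw, unconverted data so that factorVars has something to do.
pbc_raw <- survival::pbc

# A shortened variable list to keep the example small.
vars_small <- c("age", "sex", "ascites", "hepato", "spiders", "edema", "stage")

# Numerically coded variables that should be summarized as counts (%).
cat_vars <- c("ascites", "hepato", "spiders", "edema", "stage")

tab_fv <- CreateTableOne(vars = vars_small, data = pbc_raw, factorVars = cat_vars)
tab_fv
```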
Show missing proportions with the missing option to the print method.
```{r}
print(CreateTableOne(vars = vars, data = pbc), missing = TRUE)
```
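As a rough cross-check of the missing column, the percentage of missing values per variable can be computed directly in base R. This is only a sketch: `pbc_raw`, `vars_chk`, and `miss_pct` are names made up for this example, and only a handful of variables are checked.

```r
# Assumption: raw data from the survival package, left unmodified.
pbc_raw <- survival::pbc

# A few variables to check (any subset of column names works here).
vars_chk <- c("trt", "ascites", "chol", "trig", "platelet")

# is.na() gives a logical matrix; column means are the missing proportions.
miss_pct <- round(colMeans(is.na(pbc_raw[vars_chk])) * 100, 1)
miss_pct
```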
## Group-stratified tables
trt is the treatment assignment variable, so we stratify the table by it. P-values are added using the default test functions: chisq.test() for categorical variables and oneway.test() for continuous variables.
```{r}
vars <- setdiff(vars, "trt")
CreateTableOne(vars = vars, strata = "trt", data = pbc)
```
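To see where such p-values come from, the tests can be run by hand for one continuous and one categorical variable. This sketch assumes the defaults are `oneway.test()` with equal variances and `chisq.test()` with continuity correction, as I understand the CreateTableOne() help page; check `?CreateTableOne` for your installed version. The names `pbc_raw` and `pbc_rand` are illustrative.

```r
# Assumption: raw data, restricted to randomized patients (trt not missing).
pbc_raw <- survival::pbc
pbc_rand <- pbc_raw[!is.na(pbc_raw$trt), ]

# Continuous variable: one-way ANOVA assuming equal variances.
# With two groups this is equivalent to the pooled-variance t-test.
p_age <- oneway.test(age ~ trt, data = pbc_rand, var.equal = TRUE)$p.value

# Categorical variable: chi-squared test with continuity correction.
p_sex <- chisq.test(table(pbc_rand$sex, pbc_rand$trt), correct = TRUE)$p.value

c(age = p_age, sex = p_sex)
```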
Some continuous variables are quite skewed, as most biomarkers are. Median [IQR] may be a preferred format for these. Note that the test column marks these variables as nonnormal: their p-values come from a different function, kruskal.test(), which with two groups is equivalent to the Wilcoxon rank-sum test.
```{r}
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"))
```
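The nonnormal summary and test can also be reproduced by hand for a single variable. This is a sketch under the assumption that the nonparametric default is `kruskal.test()`; `pbc_raw`, `pbc_rand`, `bili_q`, `p_kw`, and `p_wx` are names invented for this example.

```r
# Assumption: raw data, restricted to randomized patients (trt not missing).
pbc_raw <- survival::pbc
pbc_rand <- pbc_raw[!is.na(pbc_raw$trt), ]

# Quartiles of bilirubin within each treatment arm (median and IQR bounds).
bili_q <- tapply(pbc_rand$bili, pbc_rand$trt, quantile,
                 probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
bili_q

# Kruskal-Wallis test across arms.
p_kw <- kruskal.test(bili ~ trt, data = pbc_rand)$p.value

# With two groups, the Wilcoxon rank-sum test using the normal approximation
# without continuity correction gives the same p-value.
p_wx <- wilcox.test(bili ~ trt, data = pbc_rand,
                    exact = FALSE, correct = FALSE)$p.value

c(kruskal = p_kw, wilcoxon = p_wx)
```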
In propensity score analyses, standardized mean differences (SMDs) are often preferred over p-values. Use the smd argument of the print method to show them; test = FALSE hides the p-values.
```{r}
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE)
```
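For a continuous variable and two groups, a commonly used definition of the SMD is the absolute difference in means divided by the square root of the average of the two group variances. The helper `smd_continuous()` below is a hypothetical function written for this sketch, not part of tableone; see the "Using SMD" vignette for the definitions the package itself uses, including the categorical case.

```r
# Hypothetical helper: SMD for a continuous variable x and a two-level group g.
smd_continuous <- function(x, g) {
  m <- tapply(x, g, mean, na.rm = TRUE)  # group means
  v <- tapply(x, g, var, na.rm = TRUE)   # group variances
  abs(m[[1]] - m[[2]]) / sqrt((v[[1]] + v[[2]]) / 2)
}

# Assumption: raw data, restricted to randomized patients (trt not missing).
pbc_raw <- survival::pbc
pbc_rand <- pbc_raw[!is.na(pbc_raw$trt), ]

smd_age <- smd_continuous(pbc_rand$age, pbc_rand$trt)
smd_age
```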
## Variable labels
Variable names are typically short and not appropriate for the final version of the table. Use the labelled package to assign variable labels.
```{r}
var_label_list <- list(age = "Age in years",
                       sex = "Female",
                       ascites = "Ascites",
                       hepato = "Hepatomegaly",
                       spiders = "Spider angioma",
                       edema = "Edema",
                       bili = "Serum bilirubin, mg/dl",
                       chol = "Serum cholesterol, mg/dl",
                       copper = "Urine copper, ug/day",
                       stage = "Histologic stage of disease",
                       trig = "Triglycerides, mg/dl",
                       albumin = "Serum albumin, g/dl",
                       alk.phos = "Alkaline phosphatase, U/liter",
                       ast = "Aspartate aminotransferase, U/ml",
                       platelet = "Platelet count",
                       protime = "Prothrombin time in seconds")
labelled::var_label(pbc) <- var_label_list
labelled::var_label(pbc)
```
Let's see the table with variable labels.
```{r}
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE)
```
Once the binary categories look right, we can drop the level name from two-level variables with the dropEqual option.
```{r}
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE)
```
## Export to a CSV file
The print method invisibly returns a matrix object, which we can export to a file. In the console, the formatting is done via spaces, but we don't need them when exporting. The noSpaces option removes them. If assigning the matrix is all you need, you can turn off printing with the printToggle option.
```{r}
tab1mat <- print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE, noSpaces = TRUE, printToggle = FALSE)
```
Now this is just a matrix of text.
```{r}
tab1mat
```
You can write to a CSV file easily.
```{r}
write.csv(tab1mat, file = "./tab1.csv")
```
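When reading such a file back into R, a few `read.csv()` options help keep the table as text. The sketch below uses a small stand-in matrix with made-up values (`toy_mat` is not real output) and a temporary file, so it does not touch `tab1mat` or `./tab1.csv`.

```r
# Stand-in for tab1mat: a small character matrix with row and column names.
# The values are hypothetical and for illustration only.
toy_mat <- matrix(c("10", "12",
                    "50.0 (10.0)", "49.0 (9.0)"),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("n", "Age in years (mean (SD))"),
                                  c("D-penicillamine", "Placebo")))

# Write to a temporary file, then read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy_mat, file = csv_path)

# row.names = 1 restores the row labels, check.names = FALSE keeps column
# names such as "D-penicillamine" intact, and colClasses keeps cells as text.
back <- read.csv(csv_path, row.names = 1, check.names = FALSE,
                 colClasses = "character")
back
```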
--------------------
- Top Page: http://rpubs.com/kaz_yos/
- Github: https://github.com/kaz-yos
- Twitter: https://twitter.com/kaz_yos