Skip to content

Instantly share code, notes, and snippets.

@Dpananos
Created April 13, 2024 16:09
Show Gist options
  • Save Dpananos/8418a8ad2bb6ad3b5db74981ac772aa5 to your computer and use it in GitHub Desktop.
Save Dpananos/8418a8ad2bb6ad3b5db74981ac772aa5 to your computer and use it in GitHub Desktop.
Confounding example
library(tidyverse)
# Genuine confounding example. Sex confounds relationship between drug and death
set.seed(0)
n <- 1000000
is_male <- rbinom(n, 1, 0.5)
drug <- rbinom(n, 1, 0.6 + 0.3*is_male)
y <- rbinom(n, 1, 0.4 - 0.1*drug + 0.4*is_male)
d <- tibble(drug, is_male, y)
# Pretty up for economy of thought
dpretty <- d %>%
mutate(
drug = factor(drug, labels = c('No Drug','Drug')),
is_male = factor(is_male, labels = c('Female','Male')),
)
# Marginal -- confounded -- estimate
# Drug appears to increase risk of death
dpretty %>%
group_by(drug) %>%
summarise(
risk_of_death = mean(y)
)
# How did this come about?
# We computed P(Y|D) = \Sum_S P(Y|D, S)P(D|S)P(S)
# This is outcome risk x propensity score x weight
propensity <- glm(drug ~ is_male, data=dpretty, family = binomial())
outcome <- glm(y ~ drug*is_male, data=dpretty, family = binomial())
# Not the same as the marginal. What am I doing wrong?
dpretty %>%
distinct(is_male, drug) %>%
modelr::add_predictions(propensity, var = 'propensity', type='response') %>%
modelr::add_predictions(outcome, var = 'outcome_risk', type='response') %>%
group_by(drug) %>%
summarise(
risk_of_death = mean(outcome_risk * propensity)
)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment