ols/logit marginal effects
``` r
library(marginaleffects)
library(modelsummary)
set.seed(20220722)
S <- diag(3)
S[1,2] <- S[2,1] <- 0.6
S[1,3] <- S[3,1] <- 0.6
data <- MASS::mvrnorm(1000, rep(0, 3), S) |>
    as.data.frame() |>
    # make outcome, group, baseline binary
    transform(outcome = V1 < 0,
              rx = V2 < 0,
              baseline = V3 < 0) |>
    subset(select = c(outcome, rx, baseline))
```
OLS slope coefficients and logit average marginal effects are identical (up to rounding):
``` r
mod <- list(
    "glm 1" = glm(outcome ~ rx, data = data, family = binomial()),
    "ols 1" = lm(outcome ~ rx, data = data),
    "glm 2" = glm(outcome ~ rx + baseline, data = data, family = binomial()),
    "ols 2" = lm(outcome ~ rx + baseline, data = data))
mfx <- lapply(mod, marginaleffects)
modelsummary(mfx, "markdown")
```
| | glm 1 | ols 1 | glm 2 | ols 2 |
| :------- | :-------: | :-------: | :-------: | :-------: |
| rx | 0.404 | 0.404 | 0.387 | 0.387 |
| | (0.029) | (0.029) | (0.027) | (0.026) |
| baseline | | | 0.390 | 0.390 |
| | | | (0.027) | (0.026) |
| Num.Obs. | 1000 | 1000 | 1000 | 1000 |
| R2 | | 0.163 | | 0.315 |
| R2 Adj. | | 0.162 | | 0.314 |
| AIC | 1220.8 | 1277.9 | 1032.4 | 1079.6 |
| BIC | 1230.6 | 1292.6 | 1047.1 | 1099.3 |
| Log.Lik. | -608.404 | -635.940 | -513.181 | -535.820 |
| F | 153.571 | 194.743 | 111.275 | 229.355 |
| RMSE | 0.46 | 0.46 | 0.41 | 0.41 |
Therefore, the slope is equal to the effect of a one-unit change (the contrast), as is always the case in linear models:
``` r
marginaleffects(mod[[1]]) |> summary()
#> Average marginal effects
#> Term Contrast Effect Std. Error z value Pr(>|z|) 2.5 % 97.5 %
#> 1 rx TRUE - FALSE 0.4038 0.02891 13.97 < 2.22e-16 0.3471 0.4605
#>
#> Model type: glm
#> Prediction type: response
comparisons(mod[[1]]) |> summary()
#> Average contrasts
#> Term Contrast Effect Std. Error z value Pr(>|z|) 2.5 % 97.5 %
#> 1 rx TRUE - FALSE 0.4038 0.02891 13.97 < 2.22e-16 0.3471 0.4605
#>
#> Model type: glm
#> Prediction type: response
```
@vincentarelbundock (Author):

Right, of course! Should have been a `*`

*(screenshot, 2022-07-23, omitted)*

@vincentarelbundock (Author):
@mattansb see the discussion above. The model with one dummy is saturated because it has an intercept and one coefficient. The model with two dummies and an interaction is saturated because it has an intercept, two coefficients, and an interaction parameter. The model with only two dummies (no interaction) is not saturated, but as Noah noted it is consistent, so the estimates should be similar in large samples.
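To check the saturated case directly, here is a sketch (reusing the `data` and packages defined above; the model names `glm 3`/`ols 3` are illustrative, and the output is not shown here) that fits the interaction model with both estimators:

``` r
# Saturated model: intercept + two dummy coefficients + interaction
# exhaust the four cells of the 2x2 design, so OLS and logit fitted
# probabilities coincide cell by cell.
mod_sat <- list(
    "glm 3" = glm(outcome ~ rx * baseline, data = data, family = binomial()),
    "ols 3" = lm(outcome ~ rx * baseline, data = data))
lapply(mod_sat, marginaleffects) |> modelsummary(output = "markdown")
```

As in the earlier table, the average marginal effects from the two estimators should agree in this saturated specification, not just asymptotically.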
