# ols/logit marginal effects

Gist by vincentarelbundock, last active July 23, 2022.
``` r
library(marginaleffects)
library(modelsummary)

set.seed(20220722)
S <- diag(3)
S[1, 2] <- S[2, 1] <- 0.6
S[1, 3] <- S[3, 1] <- 0.6
data <- MASS::mvrnorm(1000, rep(0, 3), S) |>
    as.data.frame() |>
    # make outcome, group, baseline binary
    transform(outcome = V1 < 0,
              rx = V2 < 0,
              baseline = V3 < 0) |>
    subset(select = c(outcome, rx, baseline))
```
OLS and logit produce identical average marginal effects:
``` r
mod <- list(
    "glm 1" = glm(outcome ~ rx, data = data, family = binomial()),
    "ols 1" = lm(outcome ~ rx, data = data),
    "glm 2" = glm(outcome ~ rx + baseline, data = data, family = binomial()),
    "ols 2" = lm(outcome ~ rx + baseline, data = data))
mfx <- lapply(mod, marginaleffects)
modelsummary(mfx, "markdown")
```
|          |  glm 1   |  ols 1   |  glm 2   |  ols 2   |
| :------- | :------: | :------: | :------: | :------: |
| rx       |  0.404   |  0.404   |  0.387   |  0.387   |
|          | (0.029)  | (0.029)  | (0.027)  | (0.026)  |
| baseline |          |          |  0.390   |  0.390   |
|          |          |          | (0.027)  | (0.026)  |
| Num.Obs. |   1000   |   1000   |   1000   |   1000   |
| R2       |          |  0.163   |          |  0.315   |
| R2 Adj.  |          |  0.162   |          |  0.314   |
| AIC      |  1220.8  |  1277.9  |  1032.4  |  1079.6  |
| BIC      |  1230.6  |  1292.6  |  1047.1  |  1099.3  |
| Log.Lik. | -608.404 | -635.940 | -513.181 | -535.820 |
| F        | 153.571  | 194.743  | 111.275  | 229.355  |
| RMSE     |   0.46   |   0.46   |   0.41   |   0.41   |
Therefore, the slope equals the effect of a 1-unit change, as is always the case in linear models; the average marginal effect and the average contrast are identical:
``` r
marginaleffects(mod[[1]]) |> summary()
#> Average marginal effects
#>   Term     Contrast Effect Std. Error z value   Pr(>|z|)  2.5 % 97.5 %
#> 1   rx TRUE - FALSE 0.4038    0.02891   13.97 < 2.22e-16 0.3471 0.4605
#>
#> Model type:  glm
#> Prediction type:  response

comparisons(mod[[1]]) |> summary()
#> Average contrasts
#>   Term     Contrast Effect Std. Error z value   Pr(>|z|)  2.5 % 97.5 %
#> 1   rx TRUE - FALSE 0.4038    0.02891   13.97 < 2.22e-16 0.3471 0.4605
#>
#> Model type:  glm
#> Prediction type:  response
```
@mattansb see the discussion above. The model with one dummy is saturated because it has an intercept and a coefficient. The model with two dummies and an interaction is saturated because it has an intercept, two coefficients, and the interaction parameter. The model with only two dummies is not saturated but, as Noah notes, it is consistent, so the estimates should be similar in large samples.
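The saturation argument can be checked directly: in a saturated logit, the fitted probabilities equal the observed cell proportions, so the response-scale AME collapses to the same difference in proportions that the OLS slope estimates. A hedged Python/numpy sketch (simulated data and a closed-form saturated fit instead of an actual `glm()` call):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.integers(0, 2, 500)
y = (rng.random(500) < np.where(x == 1, 0.7, 0.3)).astype(float)

p1, p0 = y[x == 1].mean(), y[x == 0].mean()

# saturated logit MLE reproduces the cell proportions exactly
b0 = np.log(p0 / (1 - p0))                 # logit of the baseline cell
b1 = np.log(p1 / (1 - p1)) - b0            # log odds ratio
fitted = 1 / (1 + np.exp(-(b0 + b1 * x)))  # back on the response scale

# response-scale AME = difference in fitted cell probabilities,
# which is exactly the OLS slope on the binary regressor
ame_logit = fitted[x == 1].mean() - fitted[x == 0].mean()
ols_slope = p1 - p0
print(ame_logit, ols_slope)
```

Because both estimators reduce to the same cell proportions, the equality here is exact, not asymptotic.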
The `ols1` and `glm1` AME estimates should be exactly equal because those models are saturated; `ols2` and `glm2` only happen to be equal here because the slopes in linear probability models are consistent for the AME of the corresponding logistic model, but they should not be exactly equal. They would be exactly equal if the models were fully saturated (i.e., `outcome ~ rx * baseline`).
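This last point can be illustrated numerically: fitting the two-dummy (non-saturated) logit and the corresponding linear probability model to data generated like the gist's, the two AMEs agree to several decimals but are not bit-for-bit identical. A Python/numpy sketch (hand-rolled Newton-Raphson fit; all numbers are a made-up illustration, not output from the gist):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
# correlated latent normals dichotomized, mimicking the gist's setup
z = rng.multivariate_normal([0, 0, 0],
                            [[1, .6, .6], [.6, 1, 0], [.6, 0, 1]], n)
y = (z[:, 0] < 0).astype(float)
X = np.column_stack([np.ones(n),
                     (z[:, 1] < 0).astype(float),   # rx
                     (z[:, 2] < 0).astype(float)])  # baseline

# logistic fit via Newton-Raphson
beta = np.zeros(3)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += np.linalg.solve(X.T @ (X * (p * (1 - p))[:, None]),
                            X.T @ (y - p))

# logit AME of rx: average counterfactual change in predicted probability
X1, X0 = X.copy(), X.copy()
X1[:, 1], X0[:, 1] = 1.0, 0.0
ame = np.mean(1 / (1 + np.exp(-X1 @ beta)) - 1 / (1 + np.exp(-X0 @ beta)))

# linear probability model slope for rx
ols = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(ame, ols)  # close, but not exactly equal
```

Adding the `rx * baseline` interaction would saturate both models, and the small discrepancy would vanish.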