-
-
Save dsparks/4332698 to your computer and use it in GitHub Desktop.
doInstall <- TRUE | |
toInstall <- c("ggplot2") | |
if(doInstall){install.packages(toInstall, repos = "http://cran.us.r-project.org")} | |
lapply(toInstall, library, character.only = TRUE) | |
ANES <- read.csv("http://www.oberlin.edu/faculty/cdesante/assets/downloads/ANES.csv") | |
ANES <- ANES[ANES$year == 2008, -c(1, 11, 17)] # Limit to just 2008 respondents, | |
head(ANES) # remove some non-helpful variables | |
# Fit several models with the same DV: | |
model1 <- lm(pid7 ~ ideo7 + female + age + south, data = ANES) | |
model2 <- lm(pid7 ~ ideo7 + female + age + female:age, data = ANES) | |
model3 <- lm(pid7 ~ ideo7, data = ANES) # These are just arbitrary examples | |
# Put model estimates into temporary data.frames: | |
model1Frame <- data.frame(Variable = rownames(summary(model1)$coef), | |
Coefficient = summary(model1)$coef[, 1], | |
SE = summary(model1)$coef[, 2], | |
modelName = "South Indicator") | |
model2Frame <- data.frame(Variable = rownames(summary(model2)$coef), | |
Coefficient = summary(model2)$coef[, 1], | |
SE = summary(model2)$coef[, 2], | |
modelName = "Age Interaction") | |
model3Frame <- data.frame(Variable = rownames(summary(model3)$coef), | |
Coefficient = summary(model3)$coef[, 1], | |
SE = summary(model3)$coef[, 2], | |
modelName = "Univariate") | |
# Combine these data.frames | |
allModelFrame <- data.frame(rbind(model1Frame, model2Frame, model3Frame)) # etc. | |
# Specify the width of your confidence intervals | |
interval1 <- -qnorm((1-0.9)/2) # 90% multiplier | |
interval2 <- -qnorm((1-0.95)/2) # 95% multiplier | |
# Plot | |
zp1 <- ggplot(allModelFrame, aes(colour = modelName)) | |
zp1 <- zp1 + geom_hline(yintercept = 0, colour = gray(1/2), lty = 2) | |
zp1 <- zp1 + geom_linerange(aes(x = Variable, ymin = Coefficient - SE*interval1, | |
ymax = Coefficient + SE*interval1), | |
lwd = 1, position = position_dodge(width = 1/2)) | |
zp1 <- zp1 + geom_pointrange(aes(x = Variable, y = Coefficient, ymin = Coefficient - SE*interval2, | |
ymax = Coefficient + SE*interval2), | |
lwd = 1/2, position = position_dodge(width = 1/2), | |
shape = 21, fill = "WHITE") | |
zp1 <- zp1 + coord_flip() + theme_bw() | |
zp1 <- zp1 + ggtitle("Comparing several models") | |
print(zp1) # The trick to these is position_dodge(). |
This is fantastic, thank you. The problem I was previously having was that coefplot {arm} cannot handle lme or nlme objects and I seem to have some issues installing the coefplot2 package in my most recent RStudio version, so I went back to trusty ggplot2. I thought I would share some code that I modified based on your example to make it work for lme....
Fit several [lme] models with the same DV
now including random intercept of "Subject"
model1 <- lme(pid7 ~ ideo7 + female + age + south, random=~1|Subject, data = ANES)
model2 <- lm(pid7 ~ ideo7 + female + age + female:age, random=~1|Subject, data = ANES)
model3 <- lm(pid7 ~ ideo7, random=~1|Subject, data = ANES) # These are just arbitrary examples
Put model estimates into temporary data.frames
The only difference being that you have to specify the table component of the lme summary (tTable instead of coef)
model1Frame <- data.frame(Variable = rownames(summary(model1)$tTable),
Coefficient = summary(model1)$tTable[, 1],
SE = summary(model1)$tTable[, 2],
modelName = "South Indicator")
model2Frame <- data.frame(Variable = rownames(summary(model2)$tTable),
Coefficient = summary(model2)$tTable[, 1],
SE = summary(model2)$tTable[, 2],
modelName = "Age Interaction")
model3Frame <- data.frame(Variable = rownames(summary(model3)$tTable),
Coefficient = summary(model3)$tTable[, 1],
SE = summary(model3)$tTable[, 2],
modelName = "Univariate")
everything else is the same...
A wonderful example. I added position_dodge()
to my existing code to make the comparisons I wanted.
I have a question though regarding the order of the colour
factor levels...
My data frame coefs
:
Estimate se lowerCI upperCI dyads vars
3 0.106478735 0.06169716 -0.01444769 0.22740516 All Kinship
2 -2.591911491 0.20060472 -2.98509674 -2.19872624 All Age difference
4 -0.169305922 0.07252870 -0.31146217 -0.02714967 All Rank difference
31 0.007409558 0.10206631 -0.19264041 0.20745952 Female Kinship
21 -1.944700523 0.25320891 -2.44099000 -1.44841105 Female Age difference
41 -0.220988389 0.10169709 -0.42031468 -0.02166210 Female Rank difference
32 0.137868642 0.06934290 0.00195656 0.27378072 Male Kinship
22 -2.880896180 0.24594286 -3.36294418 -2.39884818 Male Age difference
42 -0.136386916 0.08218800 -0.29747540 0.02470157 Male Rank difference
My code:
g<-ggplot(coefs, aes(vars, Estimate, colour=dyads)) +
theme(axis.title.x =element_text(margin=margin(0,20,0,0)))+
theme(axis.title.x = element_blank())+
geom_hline(yintercept=0, lty=2, lwd=1, colour="grey50") +
geom_errorbar(aes(ymin=lowerCI, ymax=upperCI),
lwd=1, width=0, position = position_dodge(width = 0.5)) +
labs(x ="")+
geom_point(size=4, pch=21,position = position_dodge(width = 0.5)) +
theme(panel.background = element_rect(fill = "white"))
g
When I add coord_flip()
at the end of this code, the lines are no longer ordered by the levels of the factor coefs$dyads
. As you can see in your example also, @dsparks, South Indicator
is the first level of the factor modelName
, but its lines are below the lines of Age Interaction
and Univariate
. Do you know how to custom rearrange these lines such that they appear in the same order as the levels of the colour
factor?
I realize I can simply omit coord_flip()
, save the figure, and rotate it by hand, but rotating the coordinates beforehand is, of course, much more desirable.
Thanks for any insight.
Is it possible to do this with models that were fit using "brms"?
This is awesome! Thanks.
This is awesome, thanks!
Quick question, is there anyway to do the same but have each model in a different column and not stacked? I have 6 binary regression models (they don't have the same dependent variable). It looks messy when I lump them all together. I would prefer to have a column for each model with its respective coefficients in the same plot. Any ideas on how to do this???
Thanks!
@stapial
You can use ggplot2's facet functions to split the models into columns:
zp1<- zp1 + facet_grid(.~modelName)
perfect! Thank you so much @ksorby!
This is exactly what I needed. Thank you!
Thank you so much for this walk-through! Now, just one question: How can I select only certain variables to depict in the figure? Or at least drop the intercept?
Thank you so much for this walk-through! Now, just one question: How can I select only certain variables to depict in the figure? Or at least drop the intercept?
I found a solution:
allModelFrame <- subset(allModelFrame, allModelFrame$Variable == "variable1" |
allModelFrame$Variable == "variable2")
Also, if you want heteroscedastic standard errors, define them for example as so: coff_model1 <- coeftest(model1, vcov = vcovHC)
and plug it into the code:
model1.Frame <- data.frame(Variable = rownames(summary(model1)$coef),
Coefficient = summary(model1)$coef[, 1],
SE = coff_model1[, 2],
modelName = "model1")
Hope this helps.
This is fantastic! I needed something to make coefficient plots for lme4 models where I utilized imputation methods and analyzed over imputed data sets in parallel. This does the trick. Thank you.