jebyrnes · December 22, 2017 19:34
diff --git a/brms_sem.Rmd b/brms_sem.Rmd
 ---
 title: "Bayesian SEM with BRMS"
 author: "Jarrett Byrnes"
 date: "12/20/2017"
 output: html_document
 ---

 ```{r setup, include=FALSE}
 #http://sites.tufts.edu/emotiononthebrain/2017/08/12/blog-posting-from-r-markdown-to-wordpress/
 knitr::opts_chunk$set(echo = TRUE)
 library(brms)
 ```


 ## Background

 So, it looks like `brms` version 2.0 implements multivariate responses - and hence piecewise Structural Equation Modeling in a Bayesian framework. After [https://github.com/paul-buerkner/brms/issues/303](some discussion), it became clear that while one cannot do tests of D-separation, WAIC, LOO, etc., are all available.  Let's take a peak with a familiar example.

 As a note, `blavaan` also implements Bayesian SEM, but `brms` is so gorram flexible with model forms, that I think it's an excellent way to approach the problem that gives a lot more breathing room. And it's in STAN, not JAGS!

 ## The Keeley Data
 OK, the venerable Keeley et al. 2005 and extensions in Grace and Keeley 2006. The teaching example that a number of us use is to look at how fire severity mediates plan species richness via cover.

 ```{r load_data, message=FALSE}
 ### Getting the data from piecewiseSEM
 #install.packages("jslefche/[email protected]") #for the keeley data set
 library(piecewiseSEM) 
 data(keeley)

 ##Otherwise, on the web
 #keeley <- read.csv("http://byrneslab.net/classes/lavaan_materials/Keeley_rawdata_select4.csv ")
 ```

 ```{r graph, echo=FALSE, message=FALSE}
 library(igraph)

 nodes <- data.frame(nodes = c("firesev", "cover", "rich"))


 paths <- data.frame(source = c("firesev", "cover", "firesev"),
                    target = c("rich", "rich", "cover"))
 net_unique <- graph_from_data_frame(d=paths, vertices=nodes, directed=T) 

 layout <- matrix(c(1,1,
                          2,0,
                          3,1), byrow=TRUE, ncol=2)


 plot(net_unique,  vertex.size = 58,    
     vertex.shape = "square",
     layout = layout)

 ```

 Using covariance based modeling, we would fit the following:

 ```{r lavaan, message=FALSE, warning=FALSE}
 library(lavaan)

 k_mod <- "
  rich ~ firesev + cover
  cover ~ firesev"

 k_fit_lavaan <- sem(k_mod, data = keeley)
 ```

 ```{r plot_lavaan, echo=FALSE}
 paths$lavaan_coefs <- round(coef(k_fit_lavaan)[1:3],2)

 plot(graph_from_data_frame(d=paths, vertices=nodes, directed=T),
      vertex.size = 58,    
     vertex.shape = "square",
     layout = layout,
     edge.label = paths$lavaan_coefs,
     edge.label.x = c(0,1,-1),
     edge.label.y = c(1.25,0, 0))
 ```

 We can fit this in piecewiseSEM as well
 ```{r piecewiseFit, message=FALSE}
 k_fit_psem <- psem(
  lm(rich ~ firesev + cover, data=keeley),
  lm(cover ~ firesev, data=keeley),
  data = keeley
 )
 ```

 ```{r plot_coefs, echo=FALSE}
 paths$psem_coefs <- round(coefs(k_fit_psem)$Estimate,2)


 plot(graph_from_data_frame(d=paths, vertices=nodes, directed=T),
      vertex.size = 58,    
     vertex.shape = "square",
     layout = layout,
     edge.label = paths$psem_coefs,
     edge.label.x = c(0,1,-1),
     edge.label.y = c(1.25,0, 0))
 ```

 Yep, the same.

 ## brms and SEM

 In the new `brms` you can build these models with `mvbrmsformula` or just adding multiple `brmsformula` objects together. We'll use `set_rescor(FALSE)` to not model the correlation between response variables (but could to represent residual correlations, I think!) So, here we have

 ```{r brms, cache=TRUE, results="hide"}
 library(brms)

 rich_mod <- bf(rich ~ firesev + cover)
 cover_mod <- bf(cover ~ firesev)

 k_fit_brms <- brm(rich_mod +
                  cover_mod + 
                  set_rescor(FALSE), 
                data=keeley,
                cores=4, chains = 2)

 ````

 ```{r plot_coefs_brms, echo=FALSE}
 paths$brms_coefs <- round(fixef(k_fit_brms)[3:5],2)


 plot(graph_from_data_frame(d=paths, vertices=nodes, directed=T),
      vertex.size = 58,    
     vertex.shape = "square",
     layout = layout,
     edge.label = paths$brms_coefs,
     edge.label.x = c(0,1,-1),
     edge.label.y = c(1.25,0, 0))
 ```

 Looks the same! How did the model do?

 ```{r plot_brms}
 plot(k_fit_brms)
 ```

 One thing we can do is look at the WAIC or LOO which we can see is the sum of its component pieces. Let's start with fitting those pieces.

 ```{r fit_pieces, results="hide", cache=TRUE}
 rich_fit <- brm(rich_mod,
                data=keeley,
                cores=2, chains = 2)
 cover_fit <- brm(cover_mod,
                data=keeley,
                cores=2, chains = 2)
 ````

 Now, let's look at the whole and the parts
 ```{r waic_pieces}
 WAIC(k_fit_brms)

 WAIC(rich_fit)
 WAIC(cover_fit)
 ```

 Nice!


 ## Comparison across all three forms

 Broadly, the figures above show simular results. But let's dig deeper. In `lavaan` we get
 ```{r summary_lavaan}
 summary(k_fit_lavaan)
 ```

 We can see the `rich ~ cover` path has 0=0.051. Which, you know, some people are OK with, some might wonder. `piecewiseSEM` produces broadly similar results.

 ```{r piecewise_summary, warning=FALSE}
 summary(k_fit_psem, intercept=TRUE)
 ```

 Note that p-value goes up! Huh. We also now have intercepts, which we can compare to...

 ```{r brms_summary, warning=FALSE}
 summary(k_fit_brms)
 ```

 Nice effective sample sizes and convergence. Note the `rich_cover` credible interval is pretty big, now, and we can see how it overlaps 0. So, broadly, similar results again!

 OK, what can we do with this?From an SEM framework, we want to compare this to an alternate model. I've seen suggested that we can calculate posterior p-values for the basis set and go from there for a traditional D-sep test, but that really does mix frequentist and Bayesian paradigms, so, I'm trying to ponder that - although evaluating each independence claim isn't a bad idea. Might need to write some code to do that.

 Model comparison is a snap, however.

 ### Mediation in BRMS
 Let's say we have the following alternate model where fire severity's effect is fully mediated via cover.

 ```{r alt_graph, echo=FALSE}
 library(igraph)

 nodes <- data.frame(nodes = c("firesev", "cover", "rich"))


 paths_fullmed <- data.frame(source = c("firesev",  "cover"),
                    target = c("cover",  "rich"))


 plot(graph_from_data_frame(d=paths_fullmed, vertices=nodes, directed=T),
      vertex.size = 58,    
     vertex.shape = "square",
     layout = layout)
 ```

 OK, let's fit this model


 ```{r brms_fullmed, cache=TRUE, results="hide"}
 rich_mod_fullmed <- bf(rich ~  cover)

 fit_brms_fullmed <- brm(rich_mod_fullmed +
                  cover_mod + 
                  set_rescor(FALSE), 
                data=keeley,
                cores=4, chains = 2)

 ````

 First, if we wanted to evaluate the one missing claim, we **could** have looked at the `rich ~ firesev` relationship in the full model.

 ```{r coefs_full}
 fixef(rich_fit)
 ```

 We can see the 95% Credible Interfal doesn't overlap 0. First clue that we need it.

 Second, let's look at WAIC and LOO

 ```{r waic_loo_compare}
 WAIC(k_fit_brms, fit_brms_fullmed)

 LOO(k_fit_brms, fit_brms_fullmed)
 ```

 So, either way, both say that the partial mediation model is better, but the difference between the two overlaps. This is subtly different from AIC results from frequentist approaches - not sure what's going on here, but super interesting. I'd still say that credible interval from the basis set suggests that the partial mediation model is the way to go, however, if this was a confirmatory model analysis.

 ## Prediction

 This is one of the things I'm more excited about here. Let's say we start with a fire severity value of 5. How does this predict plant species richness? How do we bring uncertainty to the picture? There's no automation here, so we have to calculate paths out ourselved - something I'm hoping to handle in `piecewiseSEM` and cam maybe port over here.

 We start simply by getting cover. Note, non-terminal endogenous variables need to have an `NA` as their values.

 ```{r predict_simple}
 newdata = data.frame(firesev = 5, cover=NA)

 cover_pred <- fitted(k_fit_brms, newdata=newdata,
             resp = "cover", nsamples = 1000, 
             summary = FALSE)

 median(cover_pred)
 posterior_interval(cover_pred)
 ```

 Boom! A prediction and the CL given coefficient error. OK, let's let this go to richness. Note, we're going to `expand.grid` to incorporate uncertainty, and so when we get a matrix back, we need to group by initial exogenous variables. In this case, we only have 1, so we can just `as.vector` the whole thing, but, for more complex newdata, you'll need to do some more processing. 

 OK, so, let's get the terminal endogenous variable. Note, the `as.matrix(diag())` bit is to keep the number of simulation consistent. Otherwise we're just kicking up our number of sims quite a bit and we muck about with how we handle uncertainty. 

 ```{r predict_cover}
 newdata2 <- expand.grid(firesev = newdata$firesev, cover = as.vector(cover_pred))

 rich_pred <- fitted(k_fit_brms, newdata=newdata2,
             resp = "rich", nsamples = 1000, 
             summary = FALSE)

 #to minimize excess uncertainty
 rich_pred <- as.matrix(diag(rich_pred))

 #visualize
 plot(density(as.vector(rich_pred)))
 median(as.vector(rich_pred))
 posterior_interval(rich_pred)
 ```

 And there we go!

 What about propogating true prediction uncertainty (e.g., other forces influencing cover, and the residual error in richness)? To do so, instead of using `fitted` we would use `predict` and otherwise do exactly the same

 ## Further directions

 This is an amazing starting point. There's a lot more to explore. Is an omnibus Dsep possible/advisable in a Bayesian framework, or are we really more down to just model comparison? Can `brms` fold in latent variables? Or composites? I've been working on multigroup analysis in a piecewise framework, and, the ideas track over here pretty well (lots of 2-way interactions and then some model comparison), so I'm not too worried there. Ditto categorical variables - they fold in here nicely.

 What's perhaps the greatest thing here is that `brms` is SUPER DUPER flexible in terms of model forms and other fun details - correlation structures, random effects, lots of error distributions, and more! It's worth digging through their [https://cran.r-project.org/web/packages/brms/vignettes/](vignettes) to see all you can do or surf on over to their Google Group at https://groups.google.com/forum/#!forum/brms-users for more.
	---
	title: "Bayesian SEM with BRMS"
	author: "Jarrett Byrnes"
	date: "12/20/2017"
	output: html_document
	---

	```{r setup, include=FALSE}
	#http://sites.tufts.edu/emotiononthebrain/2017/08/12/blog-posting-from-r-markdown-to-wordpress/
	knitr::opts_chunk$set(echo = TRUE)
	library(brms)
	```


	## Background

	So, it looks like `brms` version 2.0 implements multivariate responses - and hence piecewise Structural Equation Modeling in a Bayesian framework. After [https://github.com/paul-buerkner/brms/issues/303](some discussion), it became clear that while one cannot do tests of D-separation, WAIC, LOO, etc., are all available. Let's take a peak with a familiar example.

	As a note, `blavaan` also implements Bayesian SEM, but `brms` is so gorram flexible with model forms, that I think it's an excellent way to approach the problem that gives a lot more breathing room. And it's in STAN, not JAGS!

	## The Keeley Data
	OK, the venerable Keeley et al. 2005 and extensions in Grace and Keeley 2006. The teaching example that a number of us use is to look at how fire severity mediates plan species richness via cover.

	```{r load_data, message=FALSE}
	### Getting the data from piecewiseSEM
	#install.packages("jslefche/[email protected]") #for the keeley data set
	library(piecewiseSEM)
	data(keeley)

	##Otherwise, on the web
	#keeley <- read.csv("http://byrneslab.net/classes/lavaan_materials/Keeley_rawdata_select4.csv ")
	```

	```{r graph, echo=FALSE, message=FALSE}
	library(igraph)

	nodes <- data.frame(nodes = c("firesev", "cover", "rich"))


	paths <- data.frame(source = c("firesev", "cover", "firesev"),
	target = c("rich", "rich", "cover"))
	net_unique <- graph_from_data_frame(d=paths, vertices=nodes, directed=T)

	layout <- matrix(c(1,1,
	2,0,
	3,1), byrow=TRUE, ncol=2)


	plot(net_unique, vertex.size = 58,
	vertex.shape = "square",
	layout = layout)

	```

	Using covariance based modeling, we would fit the following:

	```{r lavaan, message=FALSE, warning=FALSE}
	library(lavaan)

	k_mod <- "
	rich ~ firesev + cover
	cover ~ firesev"

	k_fit_lavaan <- sem(k_mod, data = keeley)
	```

	```{r plot_lavaan, echo=FALSE}
	paths$lavaan_coefs <- round(coef(k_fit_lavaan)[1:3],2)

	plot(graph_from_data_frame(d=paths, vertices=nodes, directed=T),
	vertex.size = 58,
	vertex.shape = "square",
	layout = layout,
	edge.label = paths$lavaan_coefs,
	edge.label.x = c(0,1,-1),
	edge.label.y = c(1.25,0, 0))
	```

	We can fit this in piecewiseSEM as well
	```{r piecewiseFit, message=FALSE}
	k_fit_psem <- psem(
	lm(rich ~ firesev + cover, data=keeley),
	lm(cover ~ firesev, data=keeley),
	data = keeley
	)
	```

	```{r plot_coefs, echo=FALSE}
	paths$psem_coefs <- round(coefs(k_fit_psem)$Estimate,2)


	plot(graph_from_data_frame(d=paths, vertices=nodes, directed=T),
	vertex.size = 58,
	vertex.shape = "square",
	layout = layout,
	edge.label = paths$psem_coefs,
	edge.label.x = c(0,1,-1),
	edge.label.y = c(1.25,0, 0))
	```

	Yep, the same.

	## brms and SEM

	In the new `brms` you can build these models with `mvbrmsformula` or just adding multiple `brmsformula` objects together. We'll use `set_rescor(FALSE)` to not model the correlation between response variables (but could to represent residual correlations, I think!) So, here we have

	```{r brms, cache=TRUE, results="hide"}
	library(brms)

	rich_mod <- bf(rich ~ firesev + cover)
	cover_mod <- bf(cover ~ firesev)

	k_fit_brms <- brm(rich_mod +
	cover_mod +
	set_rescor(FALSE),
	data=keeley,
	cores=4, chains = 2)

	````

	```{r plot_coefs_brms, echo=FALSE}
	paths$brms_coefs <- round(fixef(k_fit_brms)[3:5],2)


	plot(graph_from_data_frame(d=paths, vertices=nodes, directed=T),
	vertex.size = 58,
	vertex.shape = "square",
	layout = layout,
	edge.label = paths$brms_coefs,
	edge.label.x = c(0,1,-1),
	edge.label.y = c(1.25,0, 0))
	```

	Looks the same! How did the model do?

	```{r plot_brms}
	plot(k_fit_brms)
	```

	One thing we can do is look at the WAIC or LOO which we can see is the sum of its component pieces. Let's start with fitting those pieces.

	```{r fit_pieces, results="hide", cache=TRUE}
	rich_fit <- brm(rich_mod,
	data=keeley,
	cores=2, chains = 2)
	cover_fit <- brm(cover_mod,
	data=keeley,
	cores=2, chains = 2)
	````

	Now, let's look at the whole and the parts
	```{r waic_pieces}
	WAIC(k_fit_brms)

	WAIC(rich_fit)
	WAIC(cover_fit)
	```

	Nice!


	## Comparison across all three forms

	Broadly, the figures above show simular results. But let's dig deeper. In `lavaan` we get
	```{r summary_lavaan}
	summary(k_fit_lavaan)
	```

	We can see the `rich ~ cover` path has 0=0.051. Which, you know, some people are OK with, some might wonder. `piecewiseSEM` produces broadly similar results.

	```{r piecewise_summary, warning=FALSE}
	summary(k_fit_psem, intercept=TRUE)
	```

	Note that p-value goes up! Huh. We also now have intercepts, which we can compare to...

	```{r brms_summary, warning=FALSE}
	summary(k_fit_brms)
	```

	Nice effective sample sizes and convergence. Note the `rich_cover` credible interval is pretty big, now, and we can see how it overlaps 0. So, broadly, similar results again!

	OK, what can we do with this?From an SEM framework, we want to compare this to an alternate model. I've seen suggested that we can calculate posterior p-values for the basis set and go from there for a traditional D-sep test, but that really does mix frequentist and Bayesian paradigms, so, I'm trying to ponder that - although evaluating each independence claim isn't a bad idea. Might need to write some code to do that.

	Model comparison is a snap, however.

	### Mediation in BRMS
	Let's say we have the following alternate model where fire severity's effect is fully mediated via cover.

	```{r alt_graph, echo=FALSE}
	library(igraph)

	nodes <- data.frame(nodes = c("firesev", "cover", "rich"))


	paths_fullmed <- data.frame(source = c("firesev", "cover"),
	target = c("cover", "rich"))


	plot(graph_from_data_frame(d=paths_fullmed, vertices=nodes, directed=T),
	vertex.size = 58,
	vertex.shape = "square",
	layout = layout)
	```

	OK, let's fit this model


	```{r brms_fullmed, cache=TRUE, results="hide"}
	rich_mod_fullmed <- bf(rich ~ cover)

	fit_brms_fullmed <- brm(rich_mod_fullmed +
	cover_mod +
	set_rescor(FALSE),
	data=keeley,
	cores=4, chains = 2)

	````

	First, if we wanted to evaluate the one missing claim, we could have looked at the `rich ~ firesev` relationship in the full model.

	```{r coefs_full}
	fixef(rich_fit)
	```

	We can see the 95% Credible Interfal doesn't overlap 0. First clue that we need it.

	Second, let's look at WAIC and LOO

	```{r waic_loo_compare}
	WAIC(k_fit_brms, fit_brms_fullmed)

	LOO(k_fit_brms, fit_brms_fullmed)
	```

	So, either way, both say that the partial mediation model is better, but the difference between the two overlaps. This is subtly different from AIC results from frequentist approaches - not sure what's going on here, but super interesting. I'd still say that credible interval from the basis set suggests that the partial mediation model is the way to go, however, if this was a confirmatory model analysis.

	## Prediction

	This is one of the things I'm more excited about here. Let's say we start with a fire severity value of 5. How does this predict plant species richness? How do we bring uncertainty to the picture? There's no automation here, so we have to calculate paths out ourselved - something I'm hoping to handle in `piecewiseSEM` and cam maybe port over here.

	We start simply by getting cover. Note, non-terminal endogenous variables need to have an `NA` as their values.

	```{r predict_simple}
	newdata = data.frame(firesev = 5, cover=NA)

	cover_pred <- fitted(k_fit_brms, newdata=newdata,
	resp = "cover", nsamples = 1000,
	summary = FALSE)

	median(cover_pred)
	posterior_interval(cover_pred)
	```

	Boom! A prediction and the CL given coefficient error. OK, let's let this go to richness. Note, we're going to `expand.grid` to incorporate uncertainty, and so when we get a matrix back, we need to group by initial exogenous variables. In this case, we only have 1, so we can just `as.vector` the whole thing, but, for more complex newdata, you'll need to do some more processing.

	OK, so, let's get the terminal endogenous variable. Note, the `as.matrix(diag())` bit is to keep the number of simulation consistent. Otherwise we're just kicking up our number of sims quite a bit and we muck about with how we handle uncertainty.

	```{r predict_cover}
	newdata2 <- expand.grid(firesev = newdata$firesev, cover = as.vector(cover_pred))

	rich_pred <- fitted(k_fit_brms, newdata=newdata2,
	resp = "rich", nsamples = 1000,
	summary = FALSE)

	#to minimize excess uncertainty
	rich_pred <- as.matrix(diag(rich_pred))

	#visualize
	plot(density(as.vector(rich_pred)))
	median(as.vector(rich_pred))
	posterior_interval(rich_pred)
	```

	And there we go!

	What about propogating true prediction uncertainty (e.g., other forces influencing cover, and the residual error in richness)? To do so, instead of using `fitted` we would use `predict` and otherwise do exactly the same

	## Further directions

	This is an amazing starting point. There's a lot more to explore. Is an omnibus Dsep possible/advisable in a Bayesian framework, or are we really more down to just model comparison? Can `brms` fold in latent variables? Or composites? I've been working on multigroup analysis in a piecewise framework, and, the ideas track over here pretty well (lots of 2-way interactions and then some model comparison), so I'm not too worried there. Ditto categorical variables - they fold in here nicely.

	What's perhaps the greatest thing here is that `brms` is SUPER DUPER flexible in terms of model forms and other fun details - correlation structures, random effects, lots of error distributions, and more! It's worth digging through their [https://cran.r-project.org/web/packages/brms/vignettes/](vignettes) to see all you can do or surf on over to their Google Group at https://groups.google.com/forum/#!forum/brms-users for more.