TLDR: Create OpenMx models using pipes.
Pipes (|) were introduced in bash in order to facilitate I/O redirection. Together with the UNIX tools, manipulation of data becomes very simple and short to write. In R the concept of the pipe came much more recently, with the magrittr package within the context of dplyr and tidy tools manifesto. One fundamental decision was to name the dplyr tools so that they can be read as verbs, which helps with memorizing what each tool does, in contrast to the less obvious naming of the UNIX tools.
OpenMx is the specialist structural equation modelling tool. It does not follow UNIX principles, it is a big calculator. However, it was programmed in a radically modular way. Over the last months I kept thinking that perhaps I could use this aspect of the software and adapt it to work with R-base pipes. But why?
I have been writing my own scripts in expressive programming, either with Rmarkdown or with Org mode, for some years now. Code chunks fits in best with this scripting style. The code becomes easily auditable, the same source is script, report and presentation, etc.
My modifications should not introduce new bugs to OpenMx, but of course needs testing. It also does not make decisions for the user, so you are still in control of your model. Your model will look like the code below, this is an adaptation from the manual, reducing a model from around 31 to 11 LOC:
mb <- dt |>
group_by(zygosity) |>
mxModel("bivariate Heterogeneity Path Specification", manifestVars = c('X','Y')) |>
mxPath( from=c('X','Y'), arrows=2,
free=T, values=1, lbound=.01 ) |>
mxPath( from="X", to="Y", arrows=2,
free=T, values=.2, lbound=.01) |>
mxPath( from="one", to=selVars, arrows=1,
free=T, values=c(0.1,-0.1), ubound=c(NA,0), lbound=c(0,NA)) |>
wrap()
mb <- mxRun(mb)
summary(mb)
# Data generation for the code above
#Simulate Data
require(MASS)
#group 1
set.seed(200)
xy1 <- mvrnorm (1000, c(0,0), matrix(c(1,.5,.5,1),2,2))
colnames(xy1) <- c("X", 'Y')
#group 2
set.seed(200)
xy2 <- mvrnorm (1000, c(0,0), matrix(c(1,.4,.4,1),2,2))
colnames(xy2) <- c("X", 'Y')
xy1 <- mutate(as.data.frame(xy1), zygosity = "um")
xy2 <- mutate(as.data.frame(xy2), zygosity = "dois")
dt <- rbind(xy1, xy2)
dt <- as_tibble(dt) #this is just to show
The above code is for multiple groups, and the pipe understand the group_by()
function and generates the groups
automatically. Now, you can also send data without grouping, and you will end up with a simple SEM:
model <- twinData |>
dplyr::select(ht1, bmi1, zygosity) |>
mxModel("test", manifestVars = c("ht1", "bmi1"), type = "RAM") |>
mxPath( from="one", to=c("ht1","bmi1"), arrows=1, free=TRUE,
values=c(1,1) ) |>
mxPath( from="ht1", to="bmi1", arrows=1, free=TRUE, values=1) |>
mxPath( from=c("ht1","bmi1"), arrows=2, free=TRUE, values = c(1,1) ) |>
wrap()
In the multiple groups case, you will not be able to pick unique labels for each model, you can either use omxSetParameters()
or use umx alternative utilities like umxPath to have some type of autolabelling.
In the other cases, there is no modification to the model. The only extra thing is the extra wrap()
line.
You can test this using the code below. I don't think this will have time to make a package from this, but the code is small enough to be sourced scripts.
wrap <- function(x,...) {
.x <- substitute(x)
lhs <- .x
rhs <- NULL
dots <- list(...)
while (!identical(lhs[[1]], quote(mxModel))) {
rhs <- c(rhs, lhs[-2])
lhs <- lhs[[2]]
}
res <- deparse(lhs)
full <- NULL
for (i in length(rhs):1) {
full <- paste(full, rhs[i], sep = ",")
}
full <- paste(full, collapse = " ")
res <- paste(res, collapse = " ")
full <- paste(sub(")$", "", res), full, ", wrap = T",")", sep = "")
eval(parse(text = full), parent.frame())
}
Mx <- function (.data = NA, model = NA, ..., manifestVars = NA, latentVars = NA,
remove = FALSE, independent = NA, type = NA, name = NA, wrap = F){
if (wrap) {
dots <- list(...)
new_objects <- list()
if (dplyr::is_grouped_df(.data)) {
levels = dplyr::group_keys(.data)[[1]]
# Criar um modelo para cada grupo
for (i in 1:length(levels)) {
int <- OpenMx::mxModel(name=as.character(levels)[i] , type = "RAM",
manifestVars = manifestVars, latentVars = latentVars,
mxData(as.data.frame(group_split(.data,.keep = F)[[i]]), type = "raw"),
dots)
new_objects <- append(new_objects, int)
}
fit <- mxFitFunctionMultigroup(as.character(levels))
do.call(eval(parse(
text="OpenMx::mxModel")),
c(model=model,name=name, new_objects, fit)
)
} else if ( class(.data) %in% c("data.frame", "data.table", "tbl") ) {
.data = mxData(observed = .data, type = "raw")
do.call(eval(parse(
text="OpenMx::mxModel")),
c(list(model = model, manifestVars = manifestVars, latentVars = latentVars,
remove = remove, independent = independent, type = type, name = name, .data), dots)
)
}
} else {
do.call(eval(parse(
text="OpenMx::mxModel")),
c(list(model = model, manifestVars = manifestVars, latentVars = latentVars,
remove = remove, independent = independent, type = type, name = name), dots)
)
}
}