To test out the regsem package, we will first simulate data
library(lavaan);library(regsem) sim.mod <- " f1 =~ 1*y1 + 1*y2 + 1*y3+ 1*y4 + 1*y5 f1 ~ 0*x1 + 0*x2 + 0*x3 + 0*x4 + 0*x5 + 0.2*x6 + 0.5*x7 + 0.8*x8 f1~~1*f1" dat.sim = simulateData(sim.mod,sample.nobs=100,seed=12)
And then run the model with lavaan so we can better see the structure.
run.mod <- " f1 =~ NA*y1 + y2 + y3+ y4 + y5 f1 ~ c1*x1 + c2*x2 + c3*x3 + c4*x4 + c5*x5 + c6*x6 + c7*x7 + c8*x8 f1~~1*f1 " lav.out <- sem(run.mod,dat.sim,fixed.x=FALSE) #summary(lav.out) parameterestimates(lav.out)[6:13,] # just look at regressions
One of the difficult pieces in using the cv_regsem function is that the penalty has to be calibrated for each particular problem. In running this code, I first tested the default, but this was too small given that there were some parameters > 0.4. After plotting this initial run, I saw that some parameters weren't penalized enough, therefore, I increased the penalty jump to 0.05 and with 30 different values this tested a model (at a high penalty) that had all estimates as zero. In some cases it isn't necessary to test a sequence of penalties that would set "large" parameters to zero, as either the model could fail to converge then, or there is not uncertainty about those parameters inclusion.
reg.out <- cv_regsem(lav.out,n.lambda=30,type="lasso",jump=0.04, pars_pen=c("c1","c2","c3","c4","c5","c6","c7","c8"))
In specifying this model, we use the parameter labels to tell cv_regsem which of the parameters to penalize. Equivalently, we could have used the extractMatrices function to identify which parameters to penalize.
# not run extractMatrices(lav.out)["A"]
Additionally, we can specify which parameters are penalized by type: "regressions", "loadings", or both c("regressions","loadings"). Note though that this penalizes all parameters of this type. If you desire to penalize a subset of parameters, use either the parameter name or number format for pars_pen.
Next, we can get a summary of the models tested.
Here we can see that we used a large enough penalty to set all parameter estimates to zero. However, the best fitting model came early on, when only a couple parameters were zero.
regsem defaults to using the BIC to choose a final model. This shows up in the final_pars object as well as the lines in the plot. This can be changed with the metric argument.
Understand better what went on with the fit
One thing to check is the "conv" column. This refers to convergence, with 0 meaning the model converged.
And see what the best fitting parameter estimates are.
reg.out$final_pars[1:13] # don't show variances/covariances
In this final model, we set the regression paths for x2,x3, x4, and x5 to zero. We make a mistake for x1, but we also correctly identify x6, x7, and x8 as true paths .Maximum likelihood estimation with lavaan had p-values > 0.05 for the parameters simulated as zero, but also did not identify the true path of 0.2 as significant (< 0.05).
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.