kFoldCV: kFoldCV
In jhorzek/laremm: LASSO Regularization in mxModels

kFoldCV

Description

Note: laremm is based on the R package regsem. Because laremm is still at an early stage of development, using regsem instead is recommended. kFoldCV performs k-fold cross-validation with fitRegModels.
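To make the idea of k-fold cross-validation concrete, here is a minimal base-R sketch of splitting a data set into k folds. It is an illustration only, not laremm's internal implementation: the data frame `dat`, the fold vector `fold`, and the sample size `n` are hypothetical names introduced for this example.

```r
# Illustrative only: a generic k-fold split in base R, not laremm's internal code.
set.seed(1)                                        # reproducible fold assignment
n <- 20                                            # hypothetical number of rows
k <- 5                                             # number of folds
dat <- data.frame(id = seq_len(n), y = rnorm(n))   # hypothetical data set

# Assign every row to one of k folds of (nearly) equal size, in random order.
fold <- sample(rep(seq_len(k), length.out = n))

for (i in seq_len(k)) {
  train <- dat[fold != i, , drop = FALSE]  # rows a model would be fitted on
  test  <- dat[fold == i, , drop = FALSE]  # held-out rows used to evaluate the fit
  cat("fold", i, ": train rows =", nrow(train), ", test rows =", nrow(test), "\n")
}
```

Each row serves as test data exactly once, so the fit on held-out data can be averaged over the k folds for every penalty value.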

Usage

kFoldCV(k = 5, model, model_type = "mxModel", fitfun = "FIML",
  data_type = "raw", pen_type = "lasso", pen_on = "none",
  selectedDrifts = "none", driftexpo = TRUE, selectedA = "none",
  selectedS = "none", pen_start = 0, pen_end = 1,
  pen_stepsize = 0.01, zeroThresh = 0.001, setZero = FALSE)

Arguments

`k`	specifies the number of splits (e.g. k = 5 for 5-fold-CV)
`model`	mxModel with the full data set in model$data. kFoldCV splits this data set into the k folds
`model_type`	type of model provided. Only "mxModel" is supported
`fitfun`	fit function to be used in the fitting procedure. Currently only "FIML" is implemented
`data_type`	type of data in the model. Only "raw" is supported
`pen_on`	string vector naming the matrices that should be regularized. Any combination of "A", "S", and "DRIFT" is possible
`selectedDrifts`	drift values to regularize. Possible values are "all", "cross", "auto", or a matrix of the same size as the drift matrix with 1 for every parameter to regularize and 0 for every non-regularized parameter
`driftexpo`	specifies whether the regularization is performed on the raw drift matrix or on the exponential of the drift matrix (discrete time parameters)
`selectedA`	A values to regularize. Possible values are "all" or a matrix of the same size as the A matrix with 1 for every parameter to regularize and 0 for every non-regularized parameter
`selectedS`	S values to regularize. Possible values are "all" or a matrix of the same size as the S matrix with 1 for every parameter to regularize and 0 for every non-regularized parameter
`pen_start`	lowest penalty value to evaluate. Recommended: 0
`pen_end`	highest penalty value to evaluate
`pen_stepsize`	increase of the penalty with each iteration, e.g. if pen_start = 0, pen_end = 1, pen_stepsize = .1, fitRegModels will iterate over pen = 0, pen = .1, pen = .2, ...
`zeroThresh`	threshold below which regularized parameters are evaluated as zero. Default is .001, similar to regsem
`setZero`	should parameters below zeroThresh be set to zero in all fit calculations? Default is FALSE, similar to regsem
`penalty_type`	type of penalty. So far only "lasso" is implemented (the usage above names this argument `pen_type`)
`DRIFT_dt`	provide the discrete time points for which the drift will be regularized. A vector with multiple values is possible
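The interplay of `pen_start`, `pen_end`, `pen_stepsize`, and `zeroThresh` can be sketched in base R. This is only an illustration of the argument descriptions above, not laremm's code: the estimates in `est` are made up, and comparing the absolute value with a strict `<` is an assumption about how "below zeroThresh" is evaluated.

```r
# Illustrative only: penalty grid and zero-thresholding as described above.
pen_start <- 0
pen_end <- 1
pen_stepsize <- 0.1

# Grid of penalty values that would be iterated over: 0, 0.1, 0.2, ..., 1
pen_values <- seq(from = pen_start, to = pen_end, by = pen_stepsize)
print(pen_values)

# Hypothetical regularized estimates (made-up numbers for this sketch).
zeroThresh <- 0.001
est <- c(a = 0.52, b = 0.0004, c = -0.0009, d = -0.13)

# Assumption: a parameter counts as zero when its absolute value is below zeroThresh.
is_zero <- abs(est) < zeroThresh
print(is_zero)

# With setZero = TRUE, such parameters would be replaced by exactly 0.
est_set_zero <- ifelse(is_zero, 0, est)
print(est_set_zero)
```

A smaller `pen_stepsize` gives a finer grid but requires one cross-validated fit per grid point, so run time grows with the number of penalty values.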

Author(s)

Jannik Orzek

Examples

# The following example is taken from the regsem help to demonstrate the equivalence of both methods:

library(lavaan)   # provides the HolzingerSwineford1939 data set
library(OpenMx)
library(laremm)   # provides kFoldCV
# put variables on same scale for regsem
HS <- data.frame(scale(HolzingerSwineford1939[,7:15]))

# define variables:
latent <- "f1"
manifest <- c("x1", "x2", "x3", "x4", "x5", "x6", "x7", "x8", "x9")

# define paths:
# first loading is fixed to 1 to set the scale of the latent variable
loadings <- mxPath(from = latent, to = manifest,
                   free = c(FALSE, rep(TRUE, 8)), values = 1)
lcov <- mxPath(from = latent, arrows = 2, free = TRUE, values = 1)
lmanif <- mxPath(from = manifest, arrows = 2, free = TRUE, values = 1)

# define model:
myModel <- mxModel(name = "myModel", latentVars = latent, manifestVars = manifest, type = "RAM",
                   mxData(observed = HS, type = "raw"), loadings, lcov, lmanif,
                   mxPath(from = "one", to = manifest, free = TRUE)
)

fit_myModel <- mxRun(myModel)
summary(fit_myModel)

# create regularized model:

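# selection matrix with the same dimensions as the A matrix; 1 = regularize, 0 = do not regularize
selectedA <- matrix(0, ncol = ncol(fit_myModel$A$values), nrow = nrow(fit_myModel$A$values))
# regularize the loadings of x2, x3, x7, x8, and x9 on f1 (column 10 of A)
selectedA[c(2, 3, 7, 8, 9), 10] <- 1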


# run 5-fold cross-validation over the penalty values 0, .01, ..., .05
kFoldReg_model <- kFoldCV(model = fit_myModel, model_type = "mxModel", fitfun = "FIML",
                          pen_on = "A", selectedA = selectedA,
                          pen_start = 0, pen_end = .05, pen_stepsize = .01,
                          k = 5
                          )
summary(kFoldReg_model)
# inspect results:
kFoldReg_model$`CV results`

jhorzek/laremm documentation built on Sept. 16, 2022, 12:06 p.m.