fitRegModels: fitRegModels

View source: R/fitRegModels.R

fitRegModelsR Documentation

fitRegModels

Description

Note: laremm is based on the R package regsem. Because of the early status of laremm, it is recommended to use regsem instead! fitregModels creates a regularized model from a mxModel or ctsem. It then runs this model with multiple penalty values.

Usage

fitRegModels(model, model_type = "ctsem", fitfun = "FIML",
  data_type = "raw", pen_type = "lasso", pen_on = "none",
  selectedDrifts = "none", selectedA = "none", selectedS = "none",
  pen_start = 0, pen_end = 1, pen_stepsize = 0.01,
  fit_index = "BIC", CV = FALSE, Test_Sample = NULL,
  zeroThresh = 0.001, setZero = FALSE, driftexpo = TRUE,
  DRIFT_dt = 1)

Arguments

model

mxModel or ctsem object

model_type

specify the type of model provided: ctsem or mxModel

fitfun

fitfunction to be used in the fitting procedure. Either FML or FIML

data_type

type of data in the model. Either "cov" or "raw"

pen_on

string vector with matrices that should be regularized. Possible are combinations of "A", "S", "DRIFT"

selectedDrifts

drift values to regularize. Possible are "all", "cross", "auto" or providing a matrix of the same size as the drift matrix with ones for every parameter to regularize and 0 for every non-regularized parameter

selectedA

A values to regularize. Possible are "all", or providing a matrix of the same size as the A matrix with ones for every parameter to regularize and 0 for every non-regularized parameter

selectedS

S values to regularize. Possible are "all", or providing a matrix of the same size as the S matrix with ones for every parameter to regularize and 0 for every non-regularized parameter

pen_start

lowest penalty value to evaluate. Recommended: 0

pen_end

highest penalty value to evaluate

pen_stepsize

increse of penalty with each iteration. e.g. if pen_start = 0, pen_end = 1, pen_stepsize = .1, fitRegModels will iterate over pen = 0, pen = .1, pen = .2, ...

fit_index

which fit index should be used to find the best model? Possible are AIC and BIC, CV_m2LL, CV_AIC, CV_BIC

CV

should a cross validation be computed? If TRUE, provide a Test_Sample

Test_Sample

mxData object with test sample data. Has to be of same data_type as the training data set

zeroThresh

threshold for evaluating regularized parameters as zero. Default is .001 similar to regsem

setZero

should parameters below zeroThresh be set to zero in all fit calculations. Default is FALSE, similar to regsem

driftexpo

specifiy if the regularization will be performed on the raw drift matrix or on the exponential of the drift matrix (discrete time parameters)

DRIFT_dt

provide the discrete time points for which the drift will be regularized. A vector with multiple values is possible

penalty_type

so far only "lasso" implemented

Author(s)

Jannik Orzek

Examples

# The following example is taken from the regsem help to demonstrate the equivalence of both methods:

library(lavaan)
library(OpenMx)
# put variables on same scale for regsem
HS <- data.frame(scale(HolzingerSwineford1939[,7:15]))

# define variables:
latent = c("f1")
manifest = c("x1","x2","x3","x4","x5", "x6", "x7", "x8", "x9")

# define paths:
loadings <- mxPath(from = latent, to = manifest, free = c(F,T,T,T,T,T,T,T,T), values = 1)
lcov <- mxPath(from = latent, arrows = 2, free = T, values = 1)
lmanif <- mxPath(from = manifest, arrows =2 , free =T, values = 1)

# define model:
myModel <- mxModel(name = "myModel", latentVars = latent, manifestVars = manifest, type = "RAM",
                   mxData(observed = HS, type = "raw"), loadings, lcov, lmanif,
                   mxPath(from = "one", to = manifest, free = T)
)

fit_myModel <- mxRun(myModel)
summary(fit_myModel)

# create regularized model:

selectedA <- matrix(0, ncol = ncol(fit_myModel$A$values), nrow = nrow(fit_myModel$A$values))
selectedA[c(2,3,7,8,9),10] <-1


reg_model <- fitRegModels(model = fit_myModel, model_type = "mxModel", fitfun = "FIML",
                          pen_on = "A", selectedA = selectedA,
                          pen_start = 0, pen_end = .05, pen_stepsize = .01
                          )
summary(reg_model)
reg_model$`fit measures`

### use laremm in ctsem ####
library(ctsem)

set.seed(12)

## define the population model:

# set the drift matrix. Note that drift eta_1_eta2 is set to equal 0 in the population.
ct_drift <- matrix(c(-.3,.2,0,-.5), ncol = 2)

generatingModel<-ctModel(Tpoints=10,n.latent=2,n.TDpred=0,n.TIpred=0,n.manifest=2,
                         MANIFESTVAR=diag(0,2),
                         LAMBDA=diag(1,2),
                         DRIFT=ct_drift,
                         DIFFUSION=matrix(c(.5,0,0,.5),2),
                         CINT=matrix(c(0,0),nrow=2),
                         T0MEANS=matrix(0,ncol=1,nrow=2),
                         T0VAR=diag(1,2))

# simulate a training data set
traindata <- ctGenerate(generatingModel,n.subjects = 100)

## Build the analysis model. Note that drift eta1_eta2 is freely estimated
# although it is 0 in the population.

myModel <- ctModel(Tpoints=10,n.latent=2,n.TDpred=0,n.TIpred=0,n.manifest=2,
                   LAMBDA=diag(1,2),
                   MANIFESTVAR=diag(0,2),
                   CINT=matrix(c(0,0),nrow=2),
                   DIFFUSION=matrix(c('eta1_eta1',0,0,'eta2_eta2'),2),
                   T0MEANS=matrix(0,ncol=1,nrow=2),
                   T0VAR=diag(1,2))

# fit the model using ctsem:
fit_myModel <- ctFit(traindata, myModel)
fit_myModel$mxobj$DRIFT$values

# regularize the model:
library(laremm)

# start regularization:
reg_myModel <- fitRegModels(model = fit_myModel, model_type = "ctsem",
                            fitfun = "FIML",data_type = "raw",
                            pen_on = "DRIFT", selectedDrifts = "cross",
                            pen_start = 0, pen_end = 1, pen_stepsize = .1)

# show the best value for penalty term (tuning parameter):
reg_myModel$`best penalty`

# show summary of parameters:
summary(reg_myModel)

#### additional Cross - validation: #####
set.seed(15)
#simulate new dataset from same population:
testdata <- ctGenerate(generatingModel,n.subjects = 100)
# note: ctsem renames the rows and columns when fitting a model. To get
# the right names, we fit the ctsem model with the new dataset and then extract
# the dataset, where rows and columns have now been renamed

fit_myModel <- ctFit(testdata, myModel, useOptimizer = F)
testdata <- fit_myModel$mxobj$data

# fit models with cross-validation:
cv_reg_myModel <- fitRegModels(model = fit_myModel, model_type = "ctsem",
                               fitfun = "FIML",data_type = "raw",pen_on = "DRIFT",
                               selectedDrifts = "cross", pen_start = 0,
                               pen_end = 1, pen_stepsize = .1, CV = TRUE,
                               Test_Sample = testdata, fit_index = "CV_BIC")
# show the summary:
summary(cv_reg_myModel)


jhorzek/laremm documentation built on Sept. 16, 2022, 12:06 p.m.