plsmm_lasso: Fit a high-dimensional PLSMM

plsmm_lasso    R Documentation

Fit a high-dimensional PLSMM

Description

Fits a partial linear semiparametric mixed effects model (PLSMM) via penalized maximum likelihood.

Usage

plsmm_lasso(
  x,
  y,
  series,
  t,
  name_group_var = NULL,
  bases,
  gamma,
  lambda,
  timexgroup,
  criterion,
  nonpara = FALSE,
  cvg_tol = 0.001,
  max_iter = 100,
  verbose = FALSE
)

Arguments

x

A matrix of predictor variables.

y

A numeric vector containing the continuous response variable.

series

A variable identifying the series (groups of repeated observations) in the data. Each series is modeled with a random intercept.

t

A numeric vector indicating the timepoints.

name_group_var

A character string specifying the name of the grouping variable in the x matrix.

bases

A matrix of basis functions.

gamma

The regularization parameter for the nonlinear effect of time.

lambda

The regularization parameter for the fixed effects.

timexgroup

Logical indicating whether to use a time-by-group interaction. If TRUE, each group in name_group_var will have its own estimate of the time effect.

criterion

The information criterion to be computed. Options are "BIC", "BICC", or "EBIC".

nonpara

Logical. If TRUE, the criterion is computed using both the coefficients of the fixed-effects and the coefficients of the nonlinear function. If FALSE, only the coefficients of the fixed-effects are used.

cvg_tol

Convergence tolerance for the algorithm. Default is 0.001.

max_iter

Maximum number of iterations allowed for convergence. Default is 100.

verbose

Logical indicating whether to print convergence details at each iteration. Default is FALSE.

Details

This function fits a PLSMM with a lasso penalty on the fixed effects and on the coefficients associated with the basis functions. It uses the Expectation-Maximization (EM) algorithm for estimation. The basis functions represent a nonlinear effect of time.
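To illustrate what a lasso penalty does to coefficients, the sketch below applies the soft-thresholding operator, the standard closed-form solution of a one-dimensional lasso problem. It is a generic illustration and not the package's internal code; the helper name soft_threshold and the toy coefficients are hypothetical.

```r
# Soft-thresholding: shrinks each value toward zero by `lambda` and sets
# values smaller than `lambda` in magnitude exactly to zero.
# Hypothetical helper, not part of plsmmLasso.
soft_threshold <- function(z, lambda) {
  sign(z) * pmax(abs(z) - lambda, 0)
}

# Toy coefficients: small ones are zeroed, large ones are shrunk.
coefs <- c(-0.50, -0.02, 0.00, 0.01, 0.30)
shrunk <- soft_threshold(coefs, lambda = 0.05)
print(shrunk)
```

Larger penalty values zero out more coefficients, which is why lambda and gamma control how many fixed effects and basis functions remain in the fitted model.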

The model includes a random intercept for each level of the variable specified by series. Additionally, if timexgroup is set to TRUE, the model includes a time-by-group interaction, allowing each group of name_group_var to have its own estimate of the nonlinear function, which can capture group-specific nonlinearities over time. If name_group_var is set to NULL, a single nonlinear function is used for the whole data set.
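One way to picture a time-by-group interaction is as a block design in which each group gets its own copy of the basis columns, set to zero for observations from other groups. The sketch below assumes this construction purely for illustration; how plsmm_lasso builds its design internally is not stated here, and the helper name expand_bases_by_group and the toy sine/cosine basis are hypothetical.

```r
# Hypothetical helper: one block of basis columns per group,
# with rows belonging to other groups set to zero.
expand_bases_by_group <- function(bases, group) {
  group <- as.factor(group)
  blocks <- lapply(levels(group), function(g) {
    block <- bases * as.numeric(group == g)  # zero out rows of other groups
    colnames(block) <- paste0(colnames(bases), "_g", g)
    block
  })
  do.call(cbind, blocks)
}

# Toy data: two groups observed at three timepoints each.
t_toy <- c(0, 13, 26, 0, 13, 26)
group_toy <- c(0, 0, 0, 1, 1, 1)

# Toy basis with two arbitrary functions of time (illustrative only).
bases_toy <- cbind(
  sin = sin(2 * pi * t_toy / 52),
  cos = cos(2 * pi * t_toy / 52)
)

design <- expand_bases_by_group(bases_toy, group_toy)
print(dim(design))
print(colnames(design))
```

With a single shared nonlinear function, the basis matrix would be used as-is, so every observation contributes to the same set of coefficients.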

The algorithm iteratively updates the estimates until convergence or until the maximum number of iterations is reached.
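The stopping rule can be sketched with a generic fixed-point iteration. The update function below is a toy stand-in for one EM step, and the convergence measure (largest absolute change in the estimates) is an assumption, since the package may define convergence differently. Only the names cvg_tol and max_iter and their defaults come from the usage above.

```r
# Toy stand-in for one EM update (not the PLSMM update):
# a Babylonian step whose fixed point is sqrt(2).
update_step <- function(est) 0.5 * (est + 2 / est)

iterate_until_converged <- function(init, cvg_tol = 0.001, max_iter = 100) {
  est <- init
  converged <- FALSE
  n_iter <- 0
  for (iter in seq_len(max_iter)) {
    new_est <- update_step(est)
    n_iter <- iter
    # Assumed convergence measure: largest absolute change in the estimates.
    converged <- max(abs(new_est - est)) < cvg_tol
    est <- new_est
    if (converged) break
  }
  list(estimate = est, converged = converged, n_iter = n_iter)
}

# Converges well before the iteration limit.
res <- iterate_until_converged(init = 1)
print(res)

# With a very small iteration limit the loop stops without converging.
res_short <- iterate_until_converged(init = 1, max_iter = 2)
print(res_short$converged)
```

A returned flag of this kind plays the same role as the converged component described under Value: it lets the caller distinguish a fit that met the tolerance from one that simply ran out of iterations.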

Value

A list containing the following components:

lasso_output

A list containing the fitted values for the fixed effects and the nonlinear effect, the estimated coefficients for the fixed effects and the nonlinear effect, and the indices of the basis functions used.

se

Estimated standard deviation of the residuals.

su

Estimated standard deviation of the random intercept.

out_phi

Data frame containing the estimated random intercept of each series.

ni

Number of timepoints per series.

hyperparameters

Data frame with lambda and gamma values.

converged

Logical indicating if the algorithm converged.

crit

Value of the selected information criterion.

Examples


library(plsmmLasso)
set.seed(123)
data_sim <- simulate_group_inter(
  N = 50, n_mvnorm = 3, grouped = TRUE,
  timepoints = 3:5, nonpara_inter = TRUE,
  sample_from = seq(0, 52, 13), 
  cos = FALSE, A_vec = c(1, 1.5)
)
sim <- data_sim$sim
x <- as.matrix(sim[, -1:-3])
y <- sim$y
series <- sim$series
t <- sim$t
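# basis functions for the nonlinear effect of time
bases <- create_bases(t)
# regularization parameters for the fixed effects and the nonlinear effect
lambda <- 0.0046
gamma <- 1e-8
plsmm_output <- plsmm_lasso(x, y, series, t,
  name_group_var = "group", bases = bases$bases,
  gamma = gamma, lambda = lambda, timexgroup = TRUE,
  criterion = "BIC"
)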
# fixed effect coefficients
plsmm_output$lasso_output$theta

# fixed effect fitted values
plsmm_output$lasso_output$x_fit

# nonlinear functions coefficients
plsmm_output$lasso_output$alpha

# nonlinear functions fitted values
plsmm_output$lasso_output$out_f

# standard deviation of residuals
plsmm_output$se

# standard deviation of random intercept
plsmm_output$su

# series specific random intercept
plsmm_output$out_phi
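
Because the fit depends on lambda and gamma, it can help to compare several values using the crit and converged components of the returned list. The sketch below is one possible way to do this, assuming that smaller criterion values are preferred, as is usual for BIC-type criteria. The helper select_by_criterion and the stand-in toy_fit are hypothetical and not part of the package; toy_fit only mimics the two components used here so that the sketch runs on its own.

```r
# Hypothetical helper: evaluate a fitting function over a grid of
# (lambda, gamma) pairs and keep the converged fit with the smallest criterion.
select_by_criterion <- function(fit_fun, lambdas, gammas) {
  grid <- expand.grid(lambda = lambdas, gamma = gammas)
  fits <- lapply(seq_len(nrow(grid)), function(i) {
    fit_fun(lambda = grid$lambda[i], gamma = grid$gamma[i])
  })
  grid$crit <- vapply(fits, function(f) f$crit, numeric(1))
  grid$converged <- vapply(fits, function(f) isTRUE(f$converged), logical(1))
  ok <- which(grid$converged)
  if (length(ok) == 0) stop("no fit converged")
  best <- ok[which.min(grid$crit[ok])]
  list(best_fit = fits[[best]], best_row = grid[best, ], grid = grid)
}

# Stand-in fitting function (illustrative only): returns a list with
# `crit` and `converged`, like the documented return value.
toy_fit <- function(lambda, gamma) {
  list(crit = (lambda - 0.01)^2 + gamma, converged = TRUE)
}

sel <- select_by_criterion(
  toy_fit,
  lambdas = c(0.001, 0.01, 0.1),
  gammas = c(1e-8, 1e-4)
)
print(sel$best_row)

# With the objects from the example above, the stand-in could be replaced by:
# fit_fun <- function(lambda, gamma) {
#   plsmm_lasso(x, y, series, t,
#     name_group_var = "group", bases = bases$bases,
#     gamma = gamma, lambda = lambda, timexgroup = TRUE,
#     criterion = "BIC"
#   )
# }
```

Restricting the comparison to converged fits avoids selecting hyperparameters on the basis of a criterion computed from estimates that had not yet stabilized.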

plsmmLasso documentation built on June 22, 2024, 9:35 a.m.