inferCSN.fit: Fit a sparse regression model

View source: R/sparse.regression.R

inferCSN.fit    R Documentation

Fit a sparse regression model

Description

Computes the regularization path for the specified loss function and penalty function.

Usage

inferCSN.fit(
  x,
  y,
  penalty = "L0",
  algorithm = "CD",
  maxSuppSize = 100,
  crossValidation = FALSE,
  nFolds = 10,
  seed = 1,
  loss = "SquaredError",
  nLambda = 100,
  nGamma = 5,
  gammaMax = 10,
  gammaMin = 1e-04,
  partialSort = TRUE,
  maxIters = 200,
  rtol = 1e-06,
  atol = 1e-09,
  activeSet = TRUE,
  activeSetNum = 3,
  maxSwaps = 100,
  scaleDownFactor = 0.8,
  screenSize = 1000,
  autoLambda = NULL,
  lambdaGrid = list(),
  excludeFirstK = 0,
  intercept = TRUE,
  lows = -Inf,
  highs = Inf
)

Arguments

x

The data matrix

y

The response vector

penalty

The type of regularization, either "L0" or "L0L2". For high-dimensional, sparse data such as single-cell sequencing data, "L0L2" is more effective.

algorithm

The type of algorithm used to minimize the objective function. Currently "CD" and "CDPSI" are supported. The CDPSI algorithm may yield better results, but it also increases running time.

maxSuppSize

The maximum number of non-zero coefficients (the support size) at which the regularization path is terminated. This value affects the final performance. A small fraction of min(n,p) (e.g. 0.05 * min(n,p)) is recommended, because L0 regularization typically selects only a small portion of non-zero coefficients.
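
As a rough sketch of the heuristic above, the support size can be derived from the dimensions of x in base R. The simulated matrix, the 5% fraction and the name supp_size are illustrative assumptions, not package defaults.

```r
# Simulated data matrix: 200 samples (rows) by 1500 features (columns).
set.seed(1)
x <- matrix(rnorm(200 * 1500), nrow = 200, ncol = 1500)

# 5% of min(n, p), rounded down and kept at least 1.
supp_size <- max(1L, as.integer(floor(0.05 * min(nrow(x), ncol(x)))))
print(supp_size)

# The result would then be passed on, e.g.:
# inferCSN.fit(x, y, maxSuppSize = supp_size)
```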

crossValidation

If TRUE, cross-validation is used.

nFolds

The number of folds for cross-validation.

seed

The seed used in randomly shuffling the data for cross-validation.
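
To illustrate why a fixed seed makes the folds reproducible, here is a minimal base R sketch of a shuffled fold assignment. The helper assign_folds and the sample count are hypothetical, and the shuffling done inside the package may differ in detail.

```r
n <- 25        # hypothetical number of samples
nFolds <- 10
seed <- 1

# Hypothetical helper: balanced fold labels 1..nFolds in shuffled order.
assign_folds <- function(n, nFolds, seed) {
  set.seed(seed)
  sample(rep(seq_len(nFolds), length.out = n))
}

folds <- assign_folds(n, nFolds, seed)
print(table(folds))

# The same seed reproduces the same assignment.
print(identical(folds, assign_folds(n, nFolds, seed)))
```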

loss

The loss function. The default is "SquaredError".

nLambda

The number of Lambda values to select

nGamma

The number of Gamma values to select

gammaMax

The maximum value of Gamma when using the L0L2 penalty

gammaMin

The minimum value of Gamma when using the L0L2 penalty
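
The three Gamma parameters together describe a grid of nGamma values between gammaMin and gammaMax. The sketch below builds such a grid under the assumption that the values are evenly spaced on a log10 scale. This page does not state the spacing, so treat it as an illustration only.

```r
nGamma <- 5
gammaMax <- 10
gammaMin <- 1e-04

# Assumption: geometric (log10) spacing between gammaMin and gammaMax.
gamma_grid <- 10^seq(log10(gammaMin), log10(gammaMax), length.out = nGamma)
print(signif(gamma_grid, 3))
```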

partialSort

If TRUE, partial sorting will be used for sorting the coordinates to do greedy cycling. Otherwise, full sorting is used

maxIters

The maximum number of iterations (full cycles) for CD per grid point

rtol

The relative tolerance which decides when to terminate optimization (based on the relative change in the objective between iterations)

atol

The absolute tolerance which decides when to terminate optimization (based on the absolute L2 norm of the residuals)
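
The two tolerances can be read as stopping tests of the following kind. The helper has_converged is hypothetical, and combining the two tests with "or" is an assumption, because this page does not say how the solver combines them.

```r
# Hypothetical helper showing one reading of rtol and atol.
has_converged <- function(obj_prev, obj_cur, residuals,
                          rtol = 1e-06, atol = 1e-09) {
  # Relative change in the objective between two iterations.
  rel_change <- abs(obj_prev - obj_cur) / max(abs(obj_prev), .Machine$double.eps)
  # Absolute L2 norm of the residuals.
  resid_norm <- sqrt(sum(residuals^2))
  rel_change <= rtol || resid_norm <= atol
}

print(has_converged(1, 1 - 1e-08, residuals = c(1, 1)))  # tiny relative change
print(has_converged(1, 0.9, residuals = c(1, 1)))        # still improving
print(has_converged(1, 0.9, residuals = c(0, 0)))        # zero residuals
```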

activeSet

If TRUE, performs active set updates

activeSetNum

The number of consecutive times a support should appear before declaring support stabilization

maxSwaps

The maximum number of swaps used by CDPSI for each grid point

scaleDownFactor

This parameter decides how close the selected Lambda values are

screenSize

The number of coordinates to cycle over when performing initial correlation screening

autoLambda

Ignored parameter. Kept for backwards compatibility

lambdaGrid

A grid of Lambda values to use in computing the regularization path

excludeFirstK

This parameter takes non-negative integers

intercept

If FALSE, no intercept term is included in the model

lows

Lower bounds for coefficients

highs

Upper bounds for coefficients
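
To make the meaning of lows and highs concrete, the sketch below restricts a coefficient vector to the interval [lows, highs]. The helper clamp_coefs is hypothetical, and how the solver itself enforces the bounds is not described on this page.

```r
# Hypothetical helper: restrict each coefficient to [lows, highs].
clamp_coefs <- function(beta, lows = -Inf, highs = Inf) {
  pmin(pmax(beta, lows), highs)
}

beta <- c(-2.5, 0.3, 4.0)
print(clamp_coefs(beta, lows = 0, highs = 1))  # bounded to [0, 1]
print(clamp_coefs(beta))                       # default bounds leave beta unchanged
```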

Value

An S3 object describing the regularization path
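
Examples

A minimal usage sketch on simulated data, using only arguments listed under Usage. The data, the true coefficients and the chosen argument values are illustrative assumptions. The call runs only if an installed inferCSN namespace contains a function named inferCSN.fit, and the sketch makes no claim about the structure of the returned object beyond what is stated above.

```r
# Simulated data: 100 samples, 20 features, 3 of which drive the response.
set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
beta_true <- c(2, -1.5, 1, rep(0, p - 3))
y <- as.numeric(x %*% beta_true + rnorm(n, sd = 0.1))

# Run the fit only when the function is available in the installed package.
if (requireNamespace("inferCSN", quietly = TRUE) &&
    exists("inferCSN.fit", envir = asNamespace("inferCSN"))) {
  fit_fun <- get("inferCSN.fit", envir = asNamespace("inferCSN"))
  fit <- fit_fun(x, y, penalty = "L0L2", algorithm = "CD", maxSuppSize = 5)
  print(class(fit))
}
```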


inferCSN documentation built on Nov. 2, 2023, 6:27 p.m.