cv_iEN: Optimizes an iEN model via K-fold cross validation gridsearch...

View source: R/cv_iEN.R

Description

Optimizes an iEN model by grid search under K-fold cross-validation and returns the out-of-sample predictions together with the associated model metadata.

Usage

cv_iEN(X, Y, foldid, alphaGrid, phiGrid, nlambda = 100, lambdas = NULL,
  priors, ncores, eval = c("RMSE", "RSS", "wilcox", "ROCAUC", "spearman",
  "pearson"), family = c("binomial", "gaussian"), intercept = TRUE,
  standardize = TRUE, center = TRUE)

Arguments

X

Input matrix of dimensions nobs x nfeat where each row is an observation vector.

Y

Response variable: a continuous vector for family = "gaussian", or a categorical vector with two levels for family = "binomial".

foldid

Vector that assigns each observation to a fold for K-fold cross-validation. foldid must contain at least three folds for optimization and model estimation to occur.
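
A fold assignment of this kind can be built in base R. The snippet below is a minimal sketch, assuming 30 observations and 5 folds (both values are illustrative and not taken from the package); it produces balanced folds in random order and checks the three-fold minimum.

```r
set.seed(42)        # reproducible fold assignment

nobs <- 30          # hypothetical number of observations (rows of X)
K <- 5              # hypothetical number of folds; must be at least 3

# Repeat the labels 1..K until every observation has one, then shuffle.
foldid <- sample(rep(seq_len(K), length.out = nobs))

# Each fold should hold roughly nobs / K observations.
table(foldid)

# Guard against too few folds before calling cv_iEN.
stopifnot(length(foldid) == nobs, length(unique(foldid)) >= 3)
```

For a binomial response it may be preferable to build the folds within each class so that every fold contains both levels; that stratification is not shown here.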

alphaGrid

Vector of alpha values for model optimization.

phiGrid

Vector of phi values for model optimization.

nlambda

Number of lambda values to generate (default = 100). Lambda values are generated dynamically during cross-validation to avoid data leakage.
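
To illustrate why generating lambda values inside the cross-validation loop avoids leakage, the sketch below derives a separate lambda sequence from the training rows of each fold. It uses a common glmnet-style recipe for the lasso case (largest absolute feature-response covariance, then a log-spaced descent). This is an assumption for illustration only; the helper lambda_path, its ratio argument, and the simulated data are hypothetical and are not the package's implementation.

```r
set.seed(1)
n <- 40; p <- 5
X <- matrix(rnorm(n * p), n, p)   # simulated predictors
Y <- rnorm(n)                     # simulated continuous response
foldid <- rep(1:4, length.out = n)
nlambda <- 3

# Hypothetical helper: log-spaced lambda sequence from training data only.
lambda_path <- function(X_train, Y_train, nlambda, ratio = 0.01) {
  Xs <- scale(X_train)                      # standardize with training statistics
  yc <- Y_train - mean(Y_train)             # center the response
  lambda_max <- max(abs(crossprod(Xs, yc))) / nrow(Xs)
  exp(seq(log(lambda_max), log(lambda_max * ratio), length.out = nlambda))
}

# One sequence per fold; the held-out rows never influence the values.
paths <- lapply(sort(unique(foldid)), function(k) {
  train <- foldid != k
  lambda_path(X[train, , drop = FALSE], Y[train], nlambda)
})
```

Because each sequence depends only on the training rows, the sequences generally differ from fold to fold, which is the expected price of keeping the held-out data unseen.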

lambdas

Optional vector of static lambda values (default = NULL).

priors

Continuous values indicating how consistent each immune feature (column of X) is with known biology. Values range from 0 (low consistency) to 1 (high consistency), with one value for each immune feature, i.e. for each column of X.
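
As a minimal sketch of the expected shape, the snippet below builds a priors vector aligned with the columns of a small simulated X. The feature names and prior values are invented for illustration, and whether cv_iEN accepts a named vector is not stated here, so the names are dropped at the end.

```r
set.seed(1)

# Simulated input matrix with four hypothetical immune features.
X <- matrix(rnorm(20 * 4), nrow = 20, ncol = 4,
            dimnames = list(NULL, c("feat_a", "feat_b", "feat_c", "feat_d")))

# Hypothetical consistency scores, keyed by feature name.
prior_scores <- c(feat_d = 0.0, feat_a = 1.0, feat_c = 0.2, feat_b = 0.8)

# Reorder the scores to match the column order of X, then drop the names.
priors <- unname(prior_scores[colnames(X)])

# One value per column, each within [0, 1].
stopifnot(length(priors) == ncol(X), all(priors >= 0 & priors <= 1))
```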

ncores

Number of cores to use for parallel computation of the iEN cross-validation results. For optimal use, set ncores = length(alphaGrid) * length(phiGrid).
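
The recommendation above can be computed directly from the grids. The sketch below also caps the result at the number of cores the machine reports; that cap is an added precaution and an assumption of this example, not something the package documentation prescribes.

```r
alphaGrid <- seq(0, 1, length.out = 2)
phiGrid <- exp(seq(log(1), log(10), length.out = 2))

# One worker per (alpha, phi) combination, as recommended.
n_combos <- length(alphaGrid) * length(phiGrid)

# detectCores() can return NA on some platforms; fall back to one core.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L

ncores <- min(n_combos, available)
```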

eval

Evaluation metric for the out-of-sample predictions. For binomial models, the Wilcoxon P-value ("wilcox") and "ROCAUC" are available; for Gaussian models, "RMSE", "RSS", the Pearson P-value ("pearson"), and the Spearman P-value ("spearman") are available.
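
For orientation, the Gaussian metrics named above can be computed on a pair of observed and predicted vectors with base R. This is a sketch using the standard textbook definitions and invented numbers; it is not the package's internal code, and the package may differ in details.

```r
# Hypothetical observed responses and out-of-sample predictions.
y    <- c(1.0, 2.0, 3.0, 4.0, 5.0)
yhat <- c(1.1, 1.9, 3.2, 3.8, 5.3)

# Residual sum of squares and root mean squared error.
rss  <- sum((y - yhat)^2)
rmse <- sqrt(mean((y - yhat)^2))

# P-values of the Pearson and Spearman correlation tests.
pearson_p  <- cor.test(y, yhat, method = "pearson")$p.value
spearman_p <- cor.test(y, yhat, method = "spearman")$p.value
```

RSS and RMSE are minimized during optimization, whereas for the P-value metrics a smaller value indicates a stronger association between predictions and response.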

family

Type of regression model. Currently only "binomial" and "gaussian" are supported.

intercept

Indicator for inclusion of a regression intercept (default = TRUE).

standardize

Indicator for standardization of the X variables prior to model fitting (default = TRUE).

center

Indicator for centering of the X variables during scaling (default = TRUE).
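
What centering and standardization do to the columns of X can be seen with base R's scale(). The sketch below uses simulated data; whether cv_iEN calls scale() internally is not stated here, so treat it as an illustration of the concept only.

```r
set.seed(1)

# Simulated features with a non-zero mean and non-unit spread.
X <- matrix(rnorm(50 * 3, mean = 10, sd = 4), nrow = 50, ncol = 3)

# center = TRUE subtracts each column mean;
# scale = TRUE then divides each column by its standard deviation.
X_std <- scale(X, center = TRUE, scale = TRUE)

colMeans(X_std)        # numerically zero for every column
apply(X_std, 2, sd)    # one for every column
```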

Value

An object of class "iEN" is returned, which contains the results of the K-fold cross-validation and metadata about the analysis. The returned information includes:

Out-of-sample predictions from the K-fold cross-validation.
Evaluation of the out-of-sample predictions, as defined by the eval parameter.
Coefficients (betas) for each out-of-sample regression model.
The optimal parameters (alpha, lambda, phi) calculated for each fold of the analysis.

Examples

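# Load the example data; the call below uses X, Y, foldid and priors,
# which are expected to be provided by this data set.
data(test_data)

# Small grids keep the example fast: 2 alpha values x 2 phi values.
alphaGrid <- seq(0, 1, length.out = 2)
phiGrid <- exp(seq(log(1), log(10), length.out = 2))
nlambda <- 3
ncores <- 2
eval <- "RSS"
family <- "gaussian"
intercept <- TRUE
standardize <- TRUE
center <- TRUE

# Named arguments make the mapping onto the signature explicit.
model <- cv_iEN(X, Y, foldid, alphaGrid, phiGrid,
  nlambda = nlambda, lambdas = NULL, priors = priors, ncores = ncores,
  eval = eval, family = family, intercept = intercept,
  standardize = standardize, center = center)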

Teculos/immunological-EN documentation built on Sept. 23, 2020, 11:53 p.m.