RCVLassoPar: Function to perform the random split analysis


View source: R/model_functions.R

Description

This function performs the random split analysis to estimate the optimal lambda shrinkage parameter for the LASSO model.
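
The splitting step can be pictured with a minimal base R sketch. It only illustrates the idea of drawing repeated random train/test partitions; it is not the package's code. The names `n`, `n.train` and `splits` are hypothetical, and `n.splits` and `train.prop` mirror the arguments documented in Usage.

```r
set.seed(1)
n          <- 50    # number of samples (illustrative value)
n.splits   <- 25    # same default as in RCVLassoPar
train.prop <- 0.9   # same default as in RCVLassoPar

# Size of each training set
n.train <- round(train.prop * n)

# Draw n.splits independent random partitions of the sample indices
splits <- lapply(seq_len(n.splits), function(i) {
  train <- sample(n, n.train)                       # training indices
  list(train = train,
       test  = setdiff(seq_len(n), train))          # remaining samples
})

length(splits)            # number of partitions
length(splits[[1]]$test)  # test samples in the first partition
```

Each partition would then be used to fit the LASSO on the training rows for every lambda and to score it on the held-out rows.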

Usage

RCVLassoPar(y, x, lambda, n.splits = 25, train.prop = 0.9,
  measure = "MSE", intercept = TRUE, type.solution = "lci", nCores = 20,
  th = 1.96)

Arguments

y

is the vector of response values, with length equal to the number of samples

x

is the matrix of the input dataset with samples on the rows and features on the columns

lambda

is a numeric vector of lambda values to be used in the LASSO parameter selection step

n.splits

is the number of random splits to be performed. Default value is 25

train.prop

is the proportion of samples in the dataset to be used as training set. Default value is 0.9

measure

is the measure used to perform the choice of the optimal lambda value. Possible values are MSE and R2. Default value is MSE

intercept

is a boolean value indicating whether or not to fit the intercept. Default value is TRUE

type.solution

is a string indicating the type of solution to compute. Possible values are min, uci and lci. If min, the standard solution is taken, that is the min or max of the average CV function. If lci, the most parsimonious solution within the confidence bands around the standard solution is selected. If uci, a less parsimonious solution within the confidence bands around the standard solution is selected. Default value is lci

nCores

is the number of cores to be used. Default value is 20

th

is the size of the confidence interval. Default value is 1.96
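
How type.solution and th might interact can be sketched in base R. This is an assumption about the selection rule, not the package's implementation: `select_lambda` is a hypothetical helper, the band is assumed to be `th` times the standard error of the average MSE at the minimum, and "most parsimonious" is assumed to mean the largest lambda inside the band.

```r
# Hypothetical helper, not part of hyQSAR: picks a lambda from a matrix of
# test-set MSE values (rows = random splits, columns = lambda values).
select_lambda <- function(mse, lambda, type.solution = "lci", th = 1.96) {
  avg  <- colMeans(mse)                          # average CV curve
  se   <- apply(mse, 2, sd) / sqrt(nrow(mse))    # standard error per lambda
  best <- which.min(avg)                         # standard ("min") solution
  # Lambdas whose average error lies inside the band around the minimum
  inside <- which(avg <= avg[best] + th * se[best])
  idx <- switch(type.solution,
    min = best,
    lci = inside[which.max(lambda[inside])],     # strongest shrinkage in band
    uci = inside[which.min(lambda[inside])],     # weakest shrinkage in band
    stop("unknown type.solution: ", type.solution))
  list(opt.lambda = lambda[idx], idx = idx)
}

# Toy input: 3 splits, 3 lambda values
lambda <- c(0.01, 0.1, 1)
mse <- rbind(c(1.0, 1.1, 3.0),
             c(1.2, 1.1, 3.2),
             c(1.1, 1.2, 2.9))

select_lambda(mse, lambda, "min")$opt.lambda
select_lambda(mse, lambda, "lci")$opt.lambda
select_lambda(mse, lambda, "uci")$opt.lambda
```

With measure = "R2" the same idea would apply with the maximum of the average curve in place of the minimum.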

Value

an object of class RCVLasso containing the following objects:

cv

list of matrices (of size n.splits x n.lambda) containing statistics for each lambda and each split. mse is the matrix of predictive mse, R2 is the matrix of predictive R2 and active.size is the matrix with the active beta coefficients in the trained model for each lambda and each split.

lambda

numeric vector of the lambda values given as input

fitted

predicted values

residuals

differences between predicted and real values

intercept

if the input parameter intercept is TRUE, it is the numeric value of the fitted intercept, otherwise it is zero

beta

beta coefficients of the fitted model

opt.lambda

optimal lambda value

idx.support

index of the optimal lambda value

pred.R2

R2 of the optimal model on the test sets

pred.mse

mse of the optimal model on the test sets

mse

mse of the optimal model on the training sets

R2

R2 of the optimal model on the training sets

measure

measure used to compute the optimal solution

fitted_model

model fitted on the whole data
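
A hedged usage sketch on simulated data follows. The data, the lambda grid and the argument values are illustrative assumptions; the call uses only the arguments listed in Usage and reads only fields listed in Value. The call is guarded so the snippet also runs when hyQSAR is not installed.

```r
set.seed(42)
n <- 60   # samples (illustrative)
p <- 10   # features (illustrative)

# Simulated inputs: samples on the rows, features on the columns
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(2 * x[, 1] - x[, 2] + rnorm(n, sd = 0.5))

# Candidate lambda values, from strong to weak shrinkage
lambda <- 10^seq(0, -3, length.out = 20)

if (requireNamespace("hyQSAR", quietly = TRUE)) {
  fit <- hyQSAR::RCVLassoPar(y, x, lambda, n.splits = 10, train.prop = 0.8,
                             measure = "MSE", type.solution = "lci",
                             nCores = 2)
  print(fit$opt.lambda)   # selected lambda
  print(fit$pred.mse)     # mse of the optimal model on the test sets
  print(fit$beta)         # coefficients of the fitted model
}
```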


angy89/hyQSAR documentation built on Sept. 24, 2019, 7:31 a.m.