tqsar: Random split for glmnet


View source: R/tqsar.R

Description

Computes random split validation for glmnet, produces a plot, and returns a value for lambda

Usage

tqsar(X_train, X_test, y_train, y_test, type.transformation = "none",
  alpha_values = c(0.1, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2), cost = 3,
  bb = 2, type.solution = "lci", measure = "MSE", intercept = TRUE,
  th = 1.96, cor_th = 0.95, zero_th = 0.8, n.splits = 10,
  inner.train.prop = 0.9, nLambda = 100, n.splits.val = 25,
  n.splits.scr = 25, nCores = 10, md_list = c(), gene_list = c())

Arguments

type.transformation

is a string indicating the type of transformation to be applied to the data before fitting the model. It can be one of the following: "none", "abs", "abs.alpha.power", "mul", "log.abs", "log.abs.alpha.power". Default value is "none".

alpha_values

is a numeric vector of alpha values to be used in the abs.alpha.power transformation. Default value is c(0.1, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2)

cost

is a numeric constant used when type.transformation = mul.

bb

is the base of the logarithm used when type.transformation = log.abs or log.abs.alpha.power

type.solution

is a string indicating the type of solution to compute. Possible values are min, uci and lci. If min, the standard solution is taken, i.e. the minimum or maximum of the average CV function. If uci, the most parsimonious solution within the confidence band around the standard solution is selected. If lci, a less parsimonious solution within the confidence band around the standard solution is selected. Default value is lci
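
As a rough illustration of the three options, the sketch below picks a lambda from a hypothetical table of mean CV errors. The names lambdas, cv_mean and cv_se, and the use of th as a multiplier of the standard error to build the band, are assumptions made for this example only; they are not taken from the tqsar source.

```r
# Hypothetical CV summary; lambdas sorted from largest (sparsest model) to smallest
lambdas <- c(1.0, 0.5, 0.25, 0.1, 0.05)
cv_mean <- c(2.0, 1.3, 1.0, 1.1, 1.4)   # mean MSE across random splits
cv_se   <- c(0.2, 0.2, 0.2, 0.2, 0.2)   # standard error across random splits
th <- 1.96

i_min  <- which.min(cv_mean)                  # "min": the standard solution
band   <- cv_mean[i_min] + th * cv_se[i_min]  # upper edge of the band (assumed form)
inside <- which(cv_mean <= band)              # indices of lambdas within the band

lambda_min <- lambdas[i_min]
lambda_uci <- lambdas[min(inside)]  # largest lambda in the band: more parsimonious
lambda_lci <- lambdas[max(inside)]  # smallest lambda in the band: less parsimonious
```

With MSE the standard solution is a minimum; with R2 the same idea would apply to a maximum.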

measure

is the measure used to perform the choice of the optimal lambda value. Possible values are MSE and R2. Default value is MSE
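
For reference, the two measures can be computed as in the following minimal sketch. The vectors y_obs and y_pred are made-up values, and the R2 formula shown (one minus the ratio of residual to total sum of squares) is the common definition; whether tqsar uses exactly this variant is an assumption.

```r
# Made-up observed and predicted responses
y_obs  <- c(1, 2, 3, 4)
y_pred <- c(1.1, 1.9, 3.2, 3.8)

# Mean squared error: lower is better
mse <- mean((y_obs - y_pred)^2)

# Coefficient of determination: higher is better
r2 <- 1 - sum((y_obs - y_pred)^2) / sum((y_obs - mean(y_obs))^2)
```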

intercept

is a boolean value indicating whether the intercept should be fitted. Default value is TRUE

th

is the size of the confidence interval. Default value is 1.96

cor_th

is the maximum accepted correlation between pairs of features. Default value is 0.95
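
A correlation filter of this kind might look like the sketch below, written in base R on simulated data. The rule for choosing which feature of a correlated pair to drop (here, the later column) is an assumption for illustration and may differ from what tqsar does internally.

```r
set.seed(1)
X <- matrix(rnorm(200), nrow = 50, ncol = 4,
            dimnames = list(NULL, c("f1", "f2", "f3", "f4")))
X[, "f2"] <- X[, "f1"] + rnorm(50, sd = 0.01)  # f2 is almost a copy of f1
cor_th <- 0.95

C <- abs(cor(X))
C[lower.tri(C, diag = TRUE)] <- 0            # look at each pair only once
to_drop <- which(apply(C, 2, max) > cor_th)  # later column of each correlated pair
X_filtered <- X[, setdiff(seq_len(ncol(X)), to_drop), drop = FALSE]

colnames(X_filtered)
```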

zero_th

is the maximum percentage of zeros accepted in a feature. Default value is 0.8
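
The zero filter can be pictured with the following base R sketch on a small made-up matrix. Treating the threshold as inclusive (features with a zero fraction equal to zero_th are kept) is an assumption.

```r
# Made-up feature matrix: samples on rows, features on columns
X <- cbind(a = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1),
           b = c(0, 1, 2, 0, 3, 0, 4, 5, 0, 6),
           c = c(0, 0, 0, 1, 0, 0, 2, 0, 0, 3))
zero_th <- 0.8

zero_frac <- colMeans(X == 0)  # fraction of zeros in each feature
X_filtered <- X[, zero_frac <= zero_th, drop = FALSE]

colnames(X_filtered)
```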

n.splits

is the number of random splits to be performed. Default value is 10

inner.train.prop

is the proportion of samples from the training set to be used for training in the random-split method. Default value is 0.9 (90%)
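
The random-split idea can be sketched as follows: at each split, a random subset of the training samples is used to fit the model and the remainder is held out. The sample size n and the use of round() to obtain the subset size are assumptions for this example.

```r
set.seed(42)
n <- 20                  # hypothetical number of training samples
inner.train.prop <- 0.9
n.splits <- 10

splits <- lapply(seq_len(n.splits), function(i) {
  # draw the inner training indices without replacement
  train_idx <- sample(n, size = round(inner.train.prop * n))
  list(train = train_idx, valid = setdiff(seq_len(n), train_idx))
})

# each split uses 18 samples for fitting and holds out the other 2
train_sizes <- lengths(lapply(splits, `[[`, "train"))
valid_sizes <- lengths(lapply(splits, `[[`, "valid"))
```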

nLambda

is the number of lambda values to be tested in the LASSO model. Default value is 100

n.splits.val

is the number of random splits used to compute validation metrics. Default value is 25

n.splits.scr

is the number of random splits used to perform the y-scrambling test. Default value is 25
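
In a y-scrambling test the response is randomly permuted and the model is refitted; a model that has learned real structure should score far better on the true response than on the permuted ones. The sketch below shows the idea on simulated data, using an ordinary lm fit as a stand-in for the LASSO model. The helper r2 and all data are hypothetical.

```r
set.seed(7)
n <- 100
x <- rnorm(n)
y <- 2 * x + rnorm(n, sd = 0.5)  # simulated response with a real signal

# Helper: R2 of a simple linear fit (stand-in for the LASSO model)
r2 <- function(x, y) summary(lm(y ~ x))$r.squared

r2_true <- r2(x, y)

n.splits.scr <- 25
# Permute y only, keeping x fixed, and refit each time
r2_scrambled <- replicate(n.splits.scr, r2(x, sample(y)))
```

Comparing r2_true with the distribution of r2_scrambled gives a quick check that the fit is not due to chance correlation.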

nCores

is the number of cores to be used

X

is the dataset matrix with samples on rows and features on columns.

y

is the numeric vector of response variables.

Value

an object of class tqsar containing a list of final models, a list of Williams plot objects, a dataframe with all the metrics computed for the different models, and a list of lambda values used to train the LASSO model.

finalModels

a list with one or more models coming from the RCVLasso function

williams_plots

a list of one or more Williams plot objects

Metrics

a dataframe with all the internal and external metrics estimated for every model

lambda

a list of one or more vectors of lambda values used to tune the LASSO models


angy89/hyQSAR documentation built on Sept. 24, 2019, 7:31 a.m.