stackBagg-internal: Algorithm 2: Procedure to obtain the optimal coefficients...

ipcw_ensbagg R Documentation

Algorithm 2: Procedure to obtain the optimal coefficients to be used in Algorithm 1

Description

Obtain predictions

Compute weighted Brier Loss function for a single marker or a linear weighted combination of markers

Compute weighted Cross Entropy Loss function for a single marker or a linear weighted combination of markers

Obtain the lambda hyperparameter for the LASSO using cross-validation

Internal stackBagg helper functions

Compute the risk of misclassifying an individual (1-AUC) when the marker is a single prediction or a weighted linear combination of several predictions

Predictions based on a library of Machine Learning procedures

Library of Machine Learning procedures

Predictions based on those Machine Learning procedures in the library that allow for weights to be specified as an argument of the R function. No bagging occurs. This group of algorithms is denoted as Native Weights

Library of Machine Learning procedures that allows for weights

A grid of values for the hyperparameters used in the Real Data Application: InfCareHIV Register. This grid of values is an argument in the tuning parameter function tune_parameter_ml.R
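The weighted Brier and cross-entropy losses described above can be illustrated with a minimal base R sketch. It is not the package source: the function names weighted_brier and weighted_crossentropy, the normalisation by the sum of the weights, and the clipping constant eps are assumptions made for illustration. Only the argument layout (par, Z, y, wts) follows this page.

```r
# Hypothetical re-implementation for illustration; not the package source.
# par: weights (length = ncol(Z)); Z: n x p matrix of predictions;
# y: binary response (0/1); wts: IPC weights.
weighted_brier <- function(par, Z, y, wts) {
  pred <- as.vector(Z %*% par)          # linear combination of the markers
  sum(wts * (y - pred)^2) / sum(wts)    # weighted mean squared error (assumed normalisation)
}

weighted_crossentropy <- function(par, Z, y, wts, eps = 1e-12) {
  pred <- as.vector(Z %*% par)
  pred <- pmin(pmax(pred, eps), 1 - eps)  # clip so that log() stays finite
  -sum(wts * (y * log(pred) + (1 - y) * log(1 - pred))) / sum(wts)
}

# Toy data: two markers, four individuals (made-up values)
Z   <- cbind(m1 = c(0.9, 0.2, 0.7, 0.1), m2 = c(0.8, 0.4, 0.6, 0.3))
y   <- c(1, 0, 1, 0)
wts <- c(1, 1.5, 1, 2)

brier_m1  <- weighted_brier(c(1, 0), Z, y, wts)        # first marker alone
brier_avg <- weighted_brier(c(0.5, 0.5), Z, y, wts)    # equal-weight combination
ce_avg    <- weighted_crossentropy(c(0.5, 0.5), Z, y, wts)

# Because par is the first argument, the loss can be handed directly to optim()
fit <- optim(c(0.5, 0.5), weighted_brier, Z = Z, y = y, wts = wts)
```

Putting par first is what lets a general-purpose optimiser search over the combination weights while Z, y and wts are held fixed.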

Usage

ipcw_ensbagg(folds, MLprocedures, fmla, tuneparams, tao, B = NULL, A,
  data, xnam, xnam.factor, xnam.cont, xnam.cont.gam, ens.library)

ipcw_genbagg(fmla, tuneparams, MLprocedures, traindata, testdata, B, A,
  xnam, xnam.factor, xnam.cont, xnam.cont.gam, ens.library)

ipcw_brier(par, Z, y, wts)

ipcw_crossentropy(par, Z, y, wts)

tune_lasso(folds, fmla, tao, data, xnam)

optimun_auc_coef(coef_init, lambda, data, Z, tao)

risk_auc(par, lambda, Z, data, tao)

MLprocedures(traindata, testdata, fmla, xnam, xnam.factor, xnam.cont,
  xnam.cont.gam, tuneparams, ens.library, i)

ML_list

MLprocedures_natively(traindata, testdata, fmla, xnam, xnam.factor,
  xnam.cont, xnam.cont.gam, tuneparams)

ML_list_natively

grid_parametersDataHIV(xnam, data, tao)
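As a rough illustration of what a bagging routine with the arguments fmla, traindata, testdata and B might do, the sketch below draws B bootstrap samples, fits a learner on each, and averages the test-set predictions. Everything in it is an assumption: the helper bagged_predict is hypothetical, logistic regression stands in for the library of learners, the simulated weights stand in for IPC weights, and resampling with probability proportional to the weights is one possible way to account for them.

```r
# Hypothetical sketch of weighted bagging; the learner (logistic regression
# via glm) and all object names are illustrative assumptions.
set.seed(1)
n <- 200
traindata <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
traindata$E   <- rbinom(n, 1, plogis(traindata$x1 - 0.5 * traindata$x2))
traindata$wts <- runif(n, 1, 2)             # stand-in for IPC weights
testdata <- data.frame(x1 = rnorm(50), x2 = rnorm(50))

fmla <- as.formula("E ~ x1 + x2")
B <- 25                                      # number of bootstrap samples

bagged_predict <- function(fmla, traindata, testdata, B) {
  preds <- vapply(seq_len(B), function(b) {
    # resample rows with probability proportional to the weights
    idx <- sample(nrow(traindata), replace = TRUE, prob = traindata$wts)
    fit <- glm(fmla, family = binomial(), data = traindata[idx, ])
    predict(fit, newdata = testdata, type = "response")
  }, numeric(nrow(testdata)))
  rowMeans(preds)                            # average over the bootstrap fits
}

p_bag <- bagged_predict(fmla, traindata, testdata, B)
```

The result has one averaged prediction per row of testdata, which corresponds to a single column of a prediction matrix such as Z.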

Arguments

folds

Number of folds

MLprocedures

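function that computes the predictions of the algorithms in the library, such as MLprocedures or MLprocedures_natively (see Usage)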

fmla

formula object, e.g. "E ~ x1 + x2"

tuneparams

a list of tuning parameters for each machine learning procedure

tao

time point of interest

B

number of bootstrap samples

data

a training data set

xnam

a vector with the names of all covariates in the model

xnam.factor

categorical variables included in the model

xnam.cont

continuous variables included in the model

xnam.cont.gam

continuous variables to be included in the smoothing operator gam::s(,df)

ens.library

algorithms in the library

traindata

a training data set

testdata

a test data set

par

a vector of weights. Its length must be equal to the number of predictions included in Z

Z

a matrix that contains the predictions. Each column represents a single marker.

y

vector of the binary response variable

wts

inverse probability of censoring (IPC) weights

coef_init

starting values for the coefficients

lambda

penalization term. It is a positive scalar.

i

sample selected by bootstrap

data

For optimun_auc_coef and risk_auc: a data frame that contains at least ttilde (time to event), delta (event type) and wts (IPC weights)

Format

An object of class list of length 8.

Details

These functions are internal helpers and are not intended to be called directly by users.
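To make the 1-AUC risk more concrete, here is a minimal sketch of a weighted AUC at a time point tao, computed from a data frame with the columns ttilde, delta and wts named in the Arguments section. It is an assumption-laden illustration rather than the package's estimator: the function weighted_risk_auc is hypothetical, delta == 1 is taken to mark the event of interest, competing events are ignored, and the lambda penalization of risk_auc is left out.

```r
# Hypothetical sketch; column names follow the Arguments section,
# the estimator itself is an assumption.
weighted_risk_auc <- function(par, Z, data, tao) {
  marker  <- as.vector(Z %*% par)                    # single or combined marker
  case    <- data$ttilde <= tao & data$delta == 1    # event of interest by tao
  control <- data$ttilde > tao                       # still event-free at tao
  w_case  <- data$wts[case]
  w_ctrl  <- data$wts[control]
  # 1 if the case marker exceeds the control marker, 0.5 for ties
  conc <- outer(marker[case], marker[control],
                function(a, b) (a > b) + 0.5 * (a == b))
  auc <- sum(outer(w_case, w_ctrl) * conc) / (sum(w_case) * sum(w_ctrl))
  1 - auc
}

# Toy data (made-up values); the fifth individual is censored before tao
data <- data.frame(ttilde = c(1, 2, 5, 6, 3),
                   delta  = c(1, 1, 0, 0, 0),
                   wts    = c(1.0, 1.2, 1.5, 1.5, 0))
Z <- cbind(m1 = c(0.9, 0.6, 0.2, 0.7, 0.5))

risk <- weighted_risk_auc(1, Z, data, tao = 4)
```

Individuals censored before tao are neither cases nor controls here, so they affect the result only through the weights given to the others.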

Value

a list with the predictions of each machine learning algorithm (id, predictions), the average AUC across folds for each of them, the optimal coefficients, an indicator of whether the optimization procedure has converged, and the value of the penalization term chosen

a matrix with the predictions on the test data set of each machine learning algorithm considered in MLprocedures

lambda to be used in the glmnet function

a vector with the optimal AUC value and the optimal coefficients

1-AUC

a matrix of predictions where each column is the prediction of each algorithm based on the testdata

a list of Machine Learning functions

a list with a grid of values for each hyperparameter:

gam_param: a vector containing the degrees of freedom 3 and 4

lasso_param: a grid of values for the shrinkage term lambda

randomforest_param: a two-column matrix; the first column denotes the num_trees parameter and the second column the mtry parameter

knn_param: a grid of positive integer values

svm_param: a three-column matrix; the first column denotes the cost parameter, the second column the gamma and the third column the kernel, where kernel=1 denotes "radial" and kernel=2 denotes "linear"

nn_param: a grid of positive integer values for the neurons

bart_param: a three-column matrix; the first column denotes the num_tree parameter, the second column the k parameter and the third column the q parameter
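A list with the shape described above could be built by hand as in the following sketch. Only the element names and the column layout are taken from this page; the object name tuneparams_grid and every numeric value are hypothetical placeholders, not the grid used in the InfCareHIV application.

```r
# Hypothetical values throughout; only the element names and the column
# layout follow the description above.
tuneparams_grid <- list(
  gam_param   = c(3, 4),                                  # degrees of freedom
  lasso_param = 10^seq(-4, 0, length.out = 5),            # shrinkage term lambda
  randomforest_param = as.matrix(expand.grid(num_trees = c(100, 500),
                                             mtry = c(1, 2))),
  knn_param   = c(5, 15, 25),
  svm_param   = as.matrix(expand.grid(cost = c(0.1, 1),
                                      gamma = c(0.01, 0.1),
                                      kernel = c(1, 2))),  # 1 = radial, 2 = linear
  nn_param    = c(1, 2, 5),                               # number of neurons
  bart_param  = as.matrix(expand.grid(num_tree = c(50, 200),
                                      k = c(2, 3),
                                      q = c(0.9, 0.99)))
)

str(tuneparams_grid)
```

expand.grid enumerates every combination of the supplied values, so each row of a matrix is one candidate hyperparameter setting.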


pablogonzalezginestet/EnsBagg documentation built on Aug. 25, 2023, 3:22 a.m.