mlikCV: Outer-loop cross-validation for estimating performance of...

View source: R/MultiLambdaCVfun.R

mlikCV R Documentation

Outer-loop cross-validation for estimating performance of marginal likelihood based multiridge

Description

Outer-loop cross-validation for estimating performance of marginal likelihood based multiridge. The outer folds are used for testing; penalty parameters are tuned by marginal likelihood estimation.
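The outer splitting described above can be pictured with a small base R sketch. It is not the package's internal code: `make_folds`, its `seed` argument, and the toy response are hypothetical and only illustrate what balanced, reproducible outer folds could look like.

```r
# Hypothetical helper (not part of multiridge): split samples into outer test folds.
make_folds <- function(y, outfold = 5, balance = TRUE, fixedfolds = TRUE, seed = 3534) {
  if (fixedfolds) set.seed(seed)  # assumed mechanism: a fixed seed makes splits reproducible
  n <- length(y)
  fold_id <- integer(n)
  if (balance) {
    # Shuffle within each response class, then deal the samples round-robin
    # so that every fold receives a similar class mix.
    for (idx in split(seq_len(n), y)) {
      shuffled <- idx[sample.int(length(idx))]
      fold_id[shuffled] <- rep_len(seq_len(outfold), length(shuffled))
    }
  } else {
    fold_id[sample.int(n)] <- rep_len(seq_len(outfold), n)
  }
  split(seq_len(n), fold_id)  # list of test-sample indices, one element per fold
}

# Toy binary response: 30 zeros and 20 ones (made up for illustration).
y <- rep(c(0, 1), times = c(30, 20))
folds <- make_folds(y, outfold = 5)

# Each sample appears in exactly one test fold; each fold holds 6 zeros and 4 ones.
sapply(folds, function(i) table(y[i]))
```

In each outer iteration the penalties would be tuned on the samples outside the fold, and predictions made for the samples inside it.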

Usage

mlikCV(penaltiesinit, XXblocks, Y, pairing = NULL, outfold = 5, nrepeatout = 1,
  balance = TRUE, fixedfolds = TRUE, model = NULL,
  intercept = ifelse(is(Y, "Surv"), FALSE, TRUE), reltol = 1e-04, trace = FALSE,
  optmethod1 = "SANN",
  optmethod2 = ifelse(length(penaltiesinit) == 1, "Brent", "Nelder-Mead"),
  maxItropt1 = 10, maxItropt2 = 25, parallel = FALSE, pref = NULL,
  fixedpen = NULL, sigmasq = 1, opt.sigma = ifelse(model == "linear", TRUE, FALSE))

Arguments

penaltiesinit

Numeric vector. Initial values for the penalty parameters. May be obtained from fastCV2.

XXblocks

List of n x n matrices. Usually the output of createXXblocks.

Y

Response vector: numeric, binary, factor or survival.

pairing

Numerical vector of length 3 or NULL when pairs are absent. Represents the indices (in XXblocks) of the two data blocks involved in pairing, plus the index of the paired block.

outfold

Integer. Number of outer folds; each fold serves once as the set of test samples.

nrepeatout

Integer. Number of repeated splits for outer fold.

balance

Boolean. Should the splits be balanced in terms of response labels?

fixedfolds

Boolean. Should fixed splits be used for reproducibility?

intercept

Boolean. Should an intercept be included?

model

Character. Any of c("linear", "logistic", "cox"). Is inferred from Y when NULL.

trace

Boolean. Should the output of the IWLS algorithm be traced?

reltol

Scalar. Relative tolerance for optimization methods.

optmethod1

Character. First, global search method. Any of the methods c("Brent", "Nelder-Mead", "SANN") may be used, but simulated annealing by "SANN" is recommended to search a wide landscape. Other unconstrained methods offered by optim may also be used, but have not been tested.

optmethod2

Character. Second, local search method. Any of the methods c("Brent", "Nelder-Mead", "SANN") may be used, but "Nelder-Mead" is generally recommended. Other unconstrained methods offered by optim may also be used, but have not been tested.

maxItropt1

Integer. Maximum number of iterations for optmethod1.

maxItropt2

Integer. Maximum number of iterations for optmethod2.

parallel

Boolean. Should computation be done in parallel? If TRUE, setupParallel must be run first.

pref

Integer vector or NULL. Contains indices of data types in XXblocks that are preferential.

fixedpen

Integer vector or NULL. Contains indices of data types whose penalty is fixed to the corresponding value in penaltiesinit.

sigmasq

Default error variance.

opt.sigma

Boolean. Should the error variance be optimized as well? Only relevant for model="linear".
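The two search arguments, optmethod1 and optmethod2, suggest a coarse global search followed by a local refinement. The sketch below illustrates that pattern with stats::optim on a toy objective. It is an illustration only: `toy_objective` is a hypothetical stand-in for the marginal likelihood criterion, and the mapping of maxItropt1, maxItropt2 and reltol onto optim's `maxit` and `reltol` control settings is an assumption.

```r
# Hypothetical objective with its minimum at c(1, -2); stands in for the
# (negative) marginal likelihood as a function of two penalty parameters.
toy_objective <- function(p) sum((p - c(1, -2))^2)

init <- c(0, 0)
set.seed(1)  # "SANN" is stochastic; fixing the seed makes the sketch repeatable

# Stage 1: global search by simulated annealing (cf. optmethod1 = "SANN", maxItropt1 = 10).
stage1 <- optim(init, toy_objective, method = "SANN",
                control = list(maxit = 10))

# Stage 2: local search started from the stage 1 result
# (cf. optmethod2 = "Nelder-Mead", maxItropt2 = 25, reltol = 1e-04).
stage2 <- optim(stage1$par, toy_objective, method = "Nelder-Mead",
                control = list(maxit = 25, reltol = 1e-04))

# Objective value at the start, after stage 1, and after stage 2.
c(start = toy_objective(init), stage1 = stage1$value, stage2 = stage2$value)
```

With a single penalty parameter the default local method is "Brent", presumably because optim warns that Nelder-Mead is unreliable for one-dimensional problems.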

Details

WARNING: this function may be very time-consuming. The number of evaluations may be as large as nrepeatout*outfold*(maxItropt1+maxItropt2). Computing time may be estimated by multiplying the computing time of a single optLambdas_mgcvWrap call by nrepeatout*outfold.
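As a rough worked example of this budget, the arithmetic can be carried out with the default settings from the Usage section. The timing `single_fit_seconds` is a made-up placeholder; in practice it would be measured, for instance with system.time around one optLambdas_mgcvWrap call.

```r
# Default settings taken from the Usage section.
nrepeatout <- 1
outfold    <- 5
maxItropt1 <- 10
maxItropt2 <- 25

# Upper bound on the number of evaluations.
max_evals <- nrepeatout * outfold * (maxItropt1 + maxItropt2)
max_evals  # 175

# Assumed (hypothetical) timing of one optLambdas_mgcvWrap call, in seconds.
single_fit_seconds <- 60
est_seconds <- single_fit_seconds * nrepeatout * outfold
est_seconds  # 300
```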

Value

List with the following components:

sampleindex

Numerical vector: sample indices

true

True responses

linpred

Cross-validated linear predictors
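To show how these components fit together, the sketch below builds a toy list with the same component names and computes a rank-based AUC by hand. The values are invented, and it is assumed that the i-th entries of sampleindex, true and linpred refer to the same sample and that a binary response is coded 0/1. For real output, Scoring is the intended tool.

```r
# Toy stand-in for the returned list (values are made up for illustration).
toy <- list(sampleindex = c(3, 1, 4, 2, 6, 5),
            true        = c(1, 0, 1, 0, 1, 0),
            linpred     = c(2.1, -0.4, 0.3, 0.8, 1.5, -1.2))

# Put the test samples back in their original order.
ord <- order(toy$sampleindex)
aligned <- data.frame(sample  = toy$sampleindex[ord],
                      true    = toy$true[ord],
                      linpred = toy$linpred[ord])

# AUC via the Mann-Whitney rank statistic: the proportion of (case, control)
# pairs in which the case has the larger linear predictor.
r  <- rank(aligned$linpred)
n1 <- sum(aligned$true == 1)
n0 <- sum(aligned$true == 0)
auc <- (sum(r[aligned$true == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
auc  # 8/9, about 0.889
```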

See Also

optLambdas_mgcv and optLambdas_mgcvWrap, which optimize the penalties. Scoring, which may be applied to the output of this function to obtain an overall cross-validated performance score. doubleCV for the double cross-validation counterpart. A full demo and data are available from:
https://drive.google.com/open?id=1NUfeOtN8-KZ8A2HZzveG506nBwgW64e4

Examples

data(dataXXmirmeth)
resp <- dataXXmirmeth[[1]]
XXmirmeth <- dataXXmirmeth[[2]]

# Find initial lambdas: fast CV per data block separately.
cvperblock2 <- fastCV2(XXblocks=XXmirmeth,Y=resp,kfold=10,fixedfolds = TRUE)
lambdas <- cvperblock2$lambdas

# Outer cross-validation, inner marginal likelihood optimization
## Not run: 
perfmlik <- mlikCV(penaltiesinit = lambdas, XXblocks = XXmirmeth, Y = resp,
                   outfold = 10, nrepeatout = 1)


# Performance metrics
Scoring(perfmlik$linpred,perfmlik$true,score="auc",print=TRUE)
Scoring(perfmlik$linpred,perfmlik$true,score="brier",print=TRUE)
Scoring(perfmlik$linpred,perfmlik$true,score="loglik",print=TRUE)

## End(Not run)

multiridge documentation built on June 13, 2022, 5:07 p.m.