cv.ecpc: Cross-validation for 'ecpc'

Description Usage Arguments Value See Also Examples

View source: R/ecpc.R

Description

Cross-validates 'ecpc' and returns model fit, summary statistics and cross-validated performance measures.

Usage

1
2
cv.ecpc(Y,X,type.measure=c("MSE","AUC"),outerfolds=10,
        lambdas=NULL,ncores=1,balance=TRUE,silent=FALSE,...)

Arguments

Y

Response data; n-dimensional vector (n: number of samples) for linear and logistic outcomes, or Surv object for Cox survival.

X

Observed data; (nxp)-dimensional matrix (p: number of covariates) with each row the observed high-dimensional feature vector of a sample.

type.measure

Type of cross-validated performance measure returned.

outerfolds

Number of cross-validation folds.

lambdas

A vector of global ridge penalties for each fold; may be given, else estimated.

ncores

Number of cores; if larger than 1, the outer cross-validation folds are processed in parallel over 'ncores' clusters.

balance

(logistic, Cox) Should folds be balanced in response?

silent

Should output messages be suppressed (default FALSE)?

...

Additional arguments used in ecpc.

Value

A list with the following elements:

ecpc.fit

List with the ecpc model fit in each fold.

dfPred

Data frame with information about out-of-bag predictions.

dfGrps

Data frame with information about estimated group and group set weights across folds.

dfCVM

Data frame with cross-validated performance metric.

See Also

Visualise cross-validated group set weights with visualiseGroupsetweights or group weights with visualiseGroupweights.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#####################
# Simulate toy data #
#####################
p<-300 #number of covariates
n<-100 #sample size training data set
n2<-100 #sample size test data set

#simulate all betas i.i.d. from beta_k~N(mean=0,sd=sqrt(0.1)):
muBeta<-0 #prior mean
varBeta<-0.1 #prior variance
indT1<-rep(1,p) #vector with group numbers all 1 (all simulated from same normal distribution)

#simulate test and training data sets:
Dat<-simDat(n,p,n2,muBeta,varBeta,indT1,sigma=1,model='linear') 
str(Dat) #Dat contains centered observed data, response data and regression coefficients

##########################
# Make co-data group sets #
##########################
#Group set: G random groups
G <- 5 #number of groups
#sample random categorical co-data:
categoricalRandom <- as.factor(sample(1:G,p,TRUE))
#make group set, i.e. list with G groups:
groupsetRandom <- createGroupset(categoricalRandom)

#######################
# Cross-validate ecpc #
#######################
tic<-proc.time()[[3]]
cv.fit <- cv.ecpc(type.measure="MSE",outerfolds=2,
                  Y=Dat$Y,X=Dat$Xctd,
                  groupsets=list(groupsetRandom),
                  groupsets.grouplvl=list(NULL),
                  hypershrinkage=c("none"),
                  model="linear",maxsel=c(5,10,15,20))
toc <- proc.time()[[3]]-tic

str(cv.fit$ecpc.fit) #list containing the model fits on the folds
str(cv.fit$dfPred) #data frame containing information on the predictions
cv.fit$dfCVM #data frame with the cross-validated performance for ecpc
#with/without posterior selection and ordinary ridge

ecpc documentation built on May 3, 2021, 9:08 a.m.