Cross-validation for glmgraph

Description

Performs k-fold cross-validation for glmgraph.

Usage

cv.glmgraph(X,Y,L,...,type.measure=c("mse","mae","deviance","auc"),nfolds=5,trace=TRUE)

Arguments

X

X matrix as in glmgraph.

Y

Response Y as in glmgraph.

L

User-specified Laplacian matrix L as in glmgraph.

...

Additional arguments as in glmgraph.

type.measure

The measure used to evaluate the cross-validated fit. If family is "gaussian", type.measure may be "mse" (mean squared error) or "mae" (mean absolute error); if family is "binomial", it may be "deviance" or "auc" (area under the curve). The default is "mse".

nfolds

The number of cross-validation folds. Default is 5.

trace

If TRUE, the cross-validation steps are printed as they run. Default is TRUE.
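
As a rough illustration of what the four type.measure options quantify, the following base R sketch computes each measure from observed and predicted values. The helper names (mse, mae, binom_deviance, auc) and the toy data are hypothetical and are not part of glmgraph; in particular, the scaling of the deviance (mean rather than sum) is an assumption and may differ from the package's internal definition.

```r
# Illustrative base R versions of the four measures; not glmgraph internals.
mse <- function(y, pred) mean((y - pred)^2)
mae <- function(y, pred) mean(abs(y - pred))

# Binomial deviance as -2 times the mean log-likelihood (assumed scaling).
# Probabilities are clipped away from 0 and 1 to avoid log(0).
binom_deviance <- function(y, prob, eps = 1e-8) {
  prob <- pmin(pmax(prob, eps), 1 - eps)
  -2 * mean(y * log(prob) + (1 - y) * log(1 - prob))
}

# AUC via the rank (Mann-Whitney) formula; tied scores get average ranks.
auc <- function(y, score) {
  r  <- rank(score)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Gaussian-type response: smaller is better for both measures.
y_gauss <- c(1, 2, 3)
pred    <- c(1, 2, 5)
mse(y_gauss, pred)   # 4/3
mae(y_gauss, pred)   # 2/3

# Binary response: deviance is minimized, auc is maximized.
y_bin <- c(0, 0, 1, 1)
score <- c(0.1, 0.4, 0.35, 0.8)
binom_deviance(y_bin, score)
auc(y_bin, score)    # 0.75
```

Note that "mse", "mae" and "deviance" are minimized, whereas "auc" is maximized, which matters when reading cvmin in the returned object.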

Details

The function runs glmgraph nfolds+1 times: the first run obtains the lambda1 and lambda2 sequences, and the remaining runs compute the fit with each of the folds omitted in turn. The error is accumulated, and the average error and its standard deviation over the folds are computed. Note that the results of cv.glmgraph are random, because the folds are selected at random. Users can reduce this randomness by running cv.glmgraph many times and averaging the error curves.
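
The procedure described above can be sketched in base R. This is only a minimal illustration of k-fold cross-validation with random fold assignment: it uses lm as a stand-in model instead of glmgraph, and the function cv_once, the simulated data and the number of repeats are all hypothetical choices, not the package's implementation.

```r
# One pass of k-fold cross-validation with a stand-in model (lm).
cv_once <- function(x, y, nfolds = 5) {
  n <- length(y)
  # Randomly assign each observation to one of nfolds folds of near-equal size.
  foldid <- sample(rep(seq_len(nfolds), length.out = n))
  # For each fold: fit with that fold omitted, then score on the omitted fold.
  fold_err <- sapply(seq_len(nfolds), function(k) {
    train <- foldid != k
    fit   <- lm(y ~ x, data = data.frame(x = x[train], y = y[train]))
    pred  <- predict(fit, newdata = data.frame(x = x[!train]))
    mean((y[!train] - pred)^2)            # held-out mean squared error
  })
  c(cvm  = mean(fold_err),                # average error over the folds
    cvsd = sd(fold_err) / sqrt(nfolds))   # standard error of that average
}

set.seed(1)
n <- 60
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# A single run depends on the random fold assignment.
one_run <- cv_once(x, y)

# Repeating the run and averaging reduces that randomness.
repeated <- replicate(20, cv_once(x, y)[["cvm"]])
mean(repeated)
```

With cv.glmgraph the same idea applies to whole error curves: the cvm values from repeated calls would be averaged element by element rather than as a single number.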

Value

An object "cv.glmgraph" containing:

obj

The fitted glmgraph object for the whole data.

cvmat

A data frame summarizing the cross-validation results, as displayed by the print function. Its columns are lambda2, lambda1.min, cvmin, semin and lambda1.1se. Each row corresponds to one lambda2: the lambda1 with the best type.measure value (cvmin) is chosen and reported as lambda1.min. If the one-standard-error rule is applied, lambda1.1se and its corresponding type.measure value semin are reported.

cvm

The mean cross-validated type.measure values, stored as a list of vectors. Each element of the list corresponds to one lambda2 and holds the type.measure values along the lambda1 sequence, averaged over the folds.

cvsd

The estimated standard error of cvm.

cvmin

Best cross-validated type.measure value across all combinations of lambda1 and lambda2. It is the minimum "mse" or "mae" if family is "gaussian"; it is the maximum "auc" or minimum "deviance" if family is "binomial".

cv.1se

Similar to cvmin, except that the one-standard-error rule is applied.

lambda1.min

The lambda1 value that, together with lambda2.min, forms the optimal regularization parameter pair.

lambda2.min

The lambda2 value that, together with lambda1.min, forms the optimal regularization parameter pair.

lambda1.1se

The lambda1 value that, together with lambda2.1se, forms the optimal regularization parameter pair when the one-standard-error rule is applied.

lambda2.1se

The lambda2 value that, together with lambda1.1se, forms the optimal regularization parameter pair when the one-standard-error rule is applied.

beta.min

Estimated beta at the regularization parameters lambda1.min and lambda2.min, which give the best type.measure value.

beta.1se

Estimated beta at the regularization parameters lambda1.1se and lambda2.1se, which are selected when the one-standard-error rule is applied.
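
To make the relationship between cvm, cvsd, cvmin and the selected lambda values more concrete, the sketch below picks the best combination from a small grid. All numbers are made up, and the one-standard-error rule shown is one common convention (take the largest lambda1 whose error is within one standard error of the minimum, for the selected lambda2). The documentation above does not spell out exactly how glmgraph applies the rule, so treat this as an assumption rather than the package's algorithm.

```r
# Hypothetical cross-validation results for a minimized measure such as "mse".
lambda1 <- c(1.0, 0.5, 0.1)                  # lambda1 sequence, strongest penalty first
lambda2 <- c(0, 1.28)
cvm  <- list(c(2.0, 1.2, 1.5),               # one vector per lambda2,
             c(1.2, 1.0, 1.1))               # one entry per lambda1
cvsd <- list(c(0.30, 0.20, 0.20),
             c(0.30, 0.25, 0.20))

# Best value across all combinations of lambda1 and lambda2.
j <- which.min(sapply(cvm, min))             # index of the best lambda2
i <- which.min(cvm[[j]])                     # index of the best lambda1 for it
cvmin       <- cvm[[j]][i]
lambda1.min <- lambda1[i]
lambda2.min <- lambda2[j]

# One-standard-error rule (assumed convention): the most heavily penalized
# lambda1 whose error is within one standard error of the minimum.
threshold   <- cvmin + cvsd[[j]][i]
within_1se  <- cvm[[j]] <= threshold
lambda1.1se <- max(lambda1[within_1se])
cv.1se      <- cvm[[j]][which(lambda1 == lambda1.1se)]

c(lambda1.min = lambda1.min, lambda2.min = lambda2.min, lambda1.1se = lambda1.1se)
```

For "auc", which is maximized, the same logic would use which.max and max, and the threshold would be the maximum minus its standard error.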

Author(s)

Li Chen <li.chen@emory.edu>, Jun Chen <chen.jun2@mayo.edu>

References

Li Chen, Han Liu, Hongzhe Li, Jun Chen (2015). glmgraph: Graph-constrained Regularization for Sparse Generalized Linear Models. (Working paper)

See Also

glmgraph, coef.cv.glmgraph, predict.cv.glmgraph

Examples

set.seed(1234)
library(glmgraph)
n <- 100
p1 <- 10   # number of predictors with non-zero effect
p2 <- 90   # number of predictors with zero effect
p <- p1 + p2
X <- matrix(rnorm(n * p), n, p)
magnitude <- 1
## construct Laplacian matrix from adjacency matrix:
## two fully connected blocks, one per predictor group
A <- matrix(0, p, p)
A[1:p1, 1:p1] <- 1
A[(p1 + 1):p, (p1 + 1):p] <- 1
diag(A) <- 0
diagL <- rowSums(A)   # node degrees
L <- -A
diag(L) <- diagL      # L = D - A
btrue <- c(rep(magnitude, p1), rep(0, p2))
intercept <- 0
eta <- intercept + X %*% btrue
### gaussian
Y <- eta + rnorm(n)
cv.obj <- cv.glmgraph(X, Y, L, penalty = "lasso", lambda2 = c(0, 1.28))
beta.min <- coef(cv.obj)
print(cv.obj)
### binomial
Y <- rbinom(n, 1, prob = 1 / (1 + exp(-eta)))
cv.obj <- cv.glmgraph(X, Y, L, family = "binomial", lambda2 = c(0, 1.28), penalty = "lasso", type.measure = "auc")
beta.min <- coef(cv.obj)
print(cv.obj)