krcv: Variance Estimation with kfold-RCV



Description

Estimates the error variance of an ultrahigh-dimensional dataset using k-fold refitted cross-validation (RCV).

Usage

krcv(x, y, a, k, d, method = c("spam", "lasso", "lsr"))

Arguments

x

a matrix of markers or explanatory variables; each column contains one marker and each row represents one individual.

y

a column vector of the response variable.

a

value of alpha, with 0 <= a <= 1, where a = 1 is the LASSO penalty and a = 0 is the ridge penalty. If the variable selection method is "lasso", a value for a must be provided. For the other methods, a should be NULL.

k

number of sub-datasets (groups) into which the dataset is divided.

d

number of variables to be selected from x.

method

variable selection method; one of "spam", "lasso" or "lsr".
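The description of a matches the usual elastic-net mixing convention. Assuming krcv follows that convention (this page does not say which implementation it uses), the penalty applied to the coefficient vector can be written as below. Here lambda is the overall regularization strength; it is not an argument of krcv and is introduced only for illustration.

```latex
P_a(\beta) = \lambda \sum_{j=1}^{p} \left[ \frac{1 - a}{2}\,\beta_j^{2} + a\,\lvert \beta_j \rvert \right],
\qquad 0 \le a \le 1
```

Setting a = 1 leaves only the absolute-value (LASSO) term, and a = 0 leaves only the squared (ridge) term.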

Details

The error variance is estimated from a high-dimensional dataset in which the number of parameters exceeds the number of individuals, i.e. p > n. k-fold RCV is an extension of the original RCV method (Fan et al., 2012): the dataset is divided into k groups of equal size instead of 2 groups. Variables are selected from one group using Sparse Additive Models (SpAM), LASSO or least squares regression (lsr), and the variance is then estimated by ordinary least squares on the selected variables, using the remaining k-1 groups. This is repeated until every group has served as the selection group, and the final error variance is the average of the variances obtained for each group.
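The procedure described above can be sketched in base R. This is a minimal illustration, not the package's implementation: krcv_sketch is a hypothetical name, marginal-correlation screening is swapped in for the SpAM/LASSO/lsr selection step, and the residual degrees of freedom (remaining sample size minus d minus 1 for the intercept) are an assumption.

```r
# Sketch of k-fold refitted cross-validation (RCV) in base R.
# Assumption: marginal-correlation screening stands in for the selection
# step; krcv() itself uses SpAM, LASSO or least squares regression.
krcv_sketch <- function(x, y, k, d) {
  n <- nrow(x)
  y <- as.numeric(y)
  # Randomly assign each individual to one of k (near-)equal groups.
  fold <- sample(rep(seq_len(k), length.out = n))
  fold_var <- numeric(k)
  for (j in seq_len(k)) {
    sel <- fold == j  # group j is used for variable selection only
    # Stage 1: keep the d variables most correlated with y in group j.
    score <- abs(cor(x[sel, , drop = FALSE], y[sel]))
    keep <- order(score, decreasing = TRUE)[seq_len(d)]
    # Stage 2: refit by ordinary least squares on the other k - 1 groups.
    x_rest <- x[!sel, keep, drop = FALSE]
    fit <- lm.fit(cbind(1, x_rest), y[!sel])
    # Residual variance; df = remaining individuals - d - 1 (assumed).
    fold_var[j] <- sum(fit$residuals^2) / (sum(!sel) - d - 1)
  }
  # Final estimate: average of the k per-group variance estimates.
  mean(fold_var)
}

# Toy data with p > n: two causal markers, true error variance of 1.
set.seed(1)
n <- 400
p <- 500
x <- matrix(sample(1:3, n * p, replace = TRUE, prob = c(1, 2, 1)), nrow = n)
y <- 3 * x[, 1] + 3 * x[, 2] + rnorm(n)
est <- krcv_sketch(x, y, k = 4, d = 5)
print(est)
```

Because the variables are chosen on one group and refitted on the others, spuriously selected variables are not correlated with the noise in the refitting data, which is why the estimate should land near the true value of 1 in this toy setting.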

Value

Error variance

Author(s)

Sayanti Guha Majumdar <sayanti23gm@gmail.com>, Anil Rai, Dwijesh Chandra Mishra

References

Fan, J., Guo, S. and Hao, N. (2012). Variance estimation using refitted cross-validation in ultrahigh dimensional regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74(1), 37-65.
Ravikumar, P., Lafferty, J., Liu, H. and Wasserman, L. (2009). Sparse additive models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(5), 1009-1030.
Tibshirani, R. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58(1), 267-288.

Examples

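## data simulation: 200 individuals, 500 markers coded 1, 2 or 3
marker <- as.data.frame(matrix(NA, ncol = 500, nrow = 200))
for (i in 1:500) {
  marker[i] <- sample(1:3, 200, replace = TRUE, prob = c(1, 2, 1))
}
## phenotype built from the first five markers, each with effect 1.41
pheno <- marker[, 1] * 1.41 + marker[, 2] * 1.41 + marker[, 3] * 1.41 +
  marker[, 4] * 1.41 + marker[, 5] * 1.41

pheno <- as.matrix(pheno)
marker <- as.matrix(marker)

## estimation of error variance (a = 1, k = 4, d = 5, method = "spam")
error_var <- krcv(marker, pheno, 1, 4, 5, "spam")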

varEst documentation built on Sept. 23, 2019, 5:04 p.m.