hsic.var.rcv: Error Variance Estimation in Genomic Prediction


View source: R/hsic.var.rcv.R

Description

Estimates the error variance using refitted cross-validation (RCV) with HSIC LASSO.

Usage

hsic.var.rcv(x,y,d)

Arguments

x

a matrix of markers or explanatory variables; each column contains one marker and each row represents an individual.

y

a column vector of the response variable.

d

number of variables to be selected from x.

Details

The refitted cross-validation (RCV) method, a two-step procedure, is used to estimate the error variance. In the first step, the dataset is divided into two sub-datasets, and HSIC LASSO is used to select the most significant markers (variables) from each sub-dataset. This yields two small sets of selected variables. In the second step, the set selected from the first sub-dataset is used to estimate the error variance on the second sub-dataset by ordinary least squares, and the set selected from the second sub-dataset is used to estimate the error variance on the first sub-dataset in the same way. The final RCV estimator is the average of these two error variance estimates.
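The two steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: `screen_top_d`, `ols_sigma2` and `rcv_sketch` are hypothetical helper names, the selector ranks variables by absolute marginal correlation as a simple stand-in for HSIC LASSO, the split into halves is random, and the residual degrees of freedom include an intercept. The data are simulated so that the true error variance is 1.

```r
# Sketch of refitted cross-validation (RCV) for error variance estimation.
# ASSUMPTION: screen_top_d() is a hypothetical stand-in selector that ranks
# columns by absolute marginal correlation; hsic.var.rcv() uses HSIC LASSO.
screen_top_d <- function(x, y, d) {
  score <- abs(cor(x, y))                       # one score per column of x
  order(score, decreasing = TRUE)[seq_len(d)]   # indices of the top d columns
}

# Residual variance from an OLS fit of y on the selected columns of x.
ols_sigma2 <- function(x, y, cols) {
  fit <- lm(y ~ x[, cols, drop = FALSE])
  sum(resid(fit)^2) / fit$df.residual
}

# Requires d to be smaller than the size of each half minus one.
rcv_sketch <- function(x, y, d) {
  n <- nrow(x)
  idx1 <- sample(n, floor(n / 2))               # ASSUMPTION: random half split
  idx2 <- setdiff(seq_len(n), idx1)
  # Step 1: select variables separately on each half.
  sel1 <- screen_top_d(x[idx1, , drop = FALSE], y[idx1], d)
  sel2 <- screen_top_d(x[idx2, , drop = FALSE], y[idx2], d)
  # Step 2: refit each selected set on the other half, then average.
  s2_a <- ols_sigma2(x[idx2, , drop = FALSE], y[idx2], sel1)
  s2_b <- ols_sigma2(x[idx1, , drop = FALSE], y[idx1], sel2)
  (s2_a + s2_b) / 2
}

# Simulated data: two true signals, error variance equal to 1.
set.seed(1)
n <- 200
p <- 50
x <- matrix(rnorm(n * p), n, p)
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)
sigma2_hat <- rcv_sketch(x, y, d = 5)
```

Refitting on the half that was not used for selection is the point of the design: variables that were picked only because of chance correlation in one half have no such advantage in the other half, so the residual variance is less prone to being underestimated.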

Value

The estimated error variance.

Author(s)

Sayanti Guha Majumdar <sayanti23gm@gmail.com>, Anil Rai, Dwijesh Chandra Mishra

References

Fan, J., Guo, S., Hao, N. (2012). Variance estimation using refitted cross-validation in ultrahigh dimensional regression. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 74(1), 37-65.
Yamada, M., Jitkrittum, W., Sigal, L., Xing, E. P., Sugiyama, M. (2014). High-dimensional feature selection by feature-wise kernelized Lasso. Neural Computation, 26(1), 185-207. doi:10.1162/NECO_a_00537

Examples

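library(GSelection)
data(GS)
# Training set: first 40 individuals; columns 1-110 as x, column 111 as y
x_trn <- GS[1:40, 1:110]
y_trn <- GS[1:40, 111]
# Held-out individuals 41-60 (not used by hsic.var.rcv)
x_tst <- GS[41:60, 1:110]
y_tst <- GS[41:60, 111]
# Estimate the error variance with RCV, selecting d = 10 variables
hsic_var <- hsic.var.rcv(x_trn, y_trn, 10)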

GSelection documentation built on Nov. 4, 2019, 5:06 p.m.