uni.selection: Univariate feature selection based on univariate significance...


View source: R/uni.selection.R

Description

This function performs univariate feature selection using univariate significance tests, with significance measured by either Wald statistics or score statistics. A feature is selected if its P-value is less than a user-specified threshold. The cross-validated likelihood (CVL) value is computed for the selected features (Matsui 2006; Emura et al. 2018).
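As a rough illustration of the selection rule described above, the sketch below fits one Cox model per feature and keeps the features whose Wald P-value falls below a threshold. It assumes the survival package is available; the simulated data and the names `wald`, `threshold`, and `selected` are hypothetical, and this is not the internal code of uni.selection.

```r
library(survival)  # coxph() and Surv()

set.seed(1)
n <- 100; p <- 20
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("gene", 1:p)))

# Hypothetical simulated data: only gene1 affects the hazard
t.event <- rexp(n, rate = exp(X[, 1]))
t.cens  <- rexp(n, rate = 0.3)
t.vec <- pmin(t.event, t.cens)           # observed time
d.vec <- as.numeric(t.event <= t.cens)   # 1 = death, 0 = censoring

# Fit a univariate Cox model for each feature and collect the Wald statistics
wald <- t(apply(X, 2, function(x) {
  fit <- coxph(Surv(t.vec, d.vec) ~ x)
  b   <- unname(coef(fit))
  z   <- b / sqrt(fit$var[1, 1])
  c(beta = b, Z = z, P = 2 * pnorm(-abs(z)))
}))

# Keep the features whose P-value is below the threshold
threshold <- 0.001
selected <- rownames(wald)[wald[, "P"] < threshold]
```

Each row of `wald` holds the coefficient, Z-value, and P-value for one feature, which mirrors the beta, Z, and P components listed under Value.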

Usage

uni.selection(t.vec, d.vec, X.mat, P.value = 0.001, K = 10,
score=FALSE,d0=0,randomize=FALSE,c.plot=TRUE,permutation=FALSE)

Arguments

t.vec

Vector of survival times (time to either death or censoring)

d.vec

Vector of censoring indicators, 1=death, 0=censoring

X.mat

n by p matrix of covariates, where n is the sample size and p is the number of covariates

P.value

A threshold for selecting features

K

The number of cross-validation folds

score

If TRUE, the score test is performed instead of the Wald test

d0

A positive constant to stabilize the variance (Witten & Tibshirani 2010)

randomize

If TRUE, randomize patient IDs before cross-validation

c.plot

If TRUE, the plot of c-index is displayed

permutation

If TRUE, the FDR is computed by randomly permuting the gene expressions
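For readers unfamiliar with the score test selected by the score argument, the following base-R sketch computes the score statistic for H_0: beta=0 in a univariate Cox model. It assumes no tied survival times and omits the d0 stabilization; the function name `cox_score_z` and the simulated data are hypothetical, and the package's own implementation may differ.

```r
# Score Z-statistic for H_0: beta = 0 in a univariate Cox model
# (assumes no tied times; illustrative only)
cox_score_z <- function(t.vec, d.vec, x) {
  U <- 0  # score: sum over deaths of (x_i - risk-set mean)
  V <- 0  # information: sum over deaths of the risk-set variance
  for (i in which(d.vec == 1)) {
    risk <- x[t.vec >= t.vec[i]]          # covariate values of subjects at risk at t_i
    U <- U + (x[i] - mean(risk))
    V <- V + mean((risk - mean(risk))^2)  # variance with divisor = risk-set size
  }
  U / sqrt(V)
}

set.seed(3)
n <- 60
x <- rnorm(n)
t.event <- rexp(n, rate = exp(0.8 * x))
t.cens  <- rexp(n, rate = 0.3)
t.vec <- pmin(t.event, t.cens)
d.vec <- as.numeric(t.event <= t.cens)

z <- cox_score_z(t.vec, d.vec, x)
p.val <- 2 * pnorm(-abs(z))  # two-sided P-value
```

Unlike the Wald test, the score statistic is evaluated at beta=0 and needs no iterative model fitting, which keeps it cheap when the number of features is large.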

Details

The predictive ability of the selected genes is evaluated through the cross-validated log-likelihood (CVL) and the c-index.
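The c-index can be illustrated with a minimal base-R sketch of Harrell's concordance for a risk score. The function name `c_index` and the toy data are hypothetical, tied scores are counted as half-concordant by assumption, and the value returned by uni.selection may be computed differently.

```r
# Harrell's c-index: proportion of usable pairs in which the subject
# who dies earlier has the higher risk score (illustrative only)
c_index <- function(t.vec, d.vec, score) {
  concordant <- 0
  usable <- 0
  for (i in which(d.vec == 1)) {
    later <- t.vec > t.vec[i]   # subjects known to survive longer than subject i
    usable <- usable + sum(later)
    concordant <- concordant +
      sum(score[i] > score[later]) +
      0.5 * sum(score[i] == score[later])  # tied scores count as half
  }
  concordant / usable
}

# Toy data: five deaths at times 1..5
t.toy <- 1:5
d.toy <- rep(1, 5)
c.perfect  <- c_index(t.toy, d.toy, score = 5:1)       # higher score dies earlier
c.reversed <- c_index(t.toy, d.toy, score = 1:5)       # higher score dies later
c.constant <- c_index(t.toy, d.toy, score = rep(0, 5)) # uninformative score
```

A value near 1 indicates that higher scores go with earlier deaths, while a value near 0.5 indicates no discrimination.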

Value

gene

Gene symbols

beta

Estimated regression coefficients

Z

Z-value for testing H_0: beta=0 (Wald test)

P

P-value for testing H_0: beta=0 (Wald test)

c_index

c-index

CVL

Cross-validated partial likelihood

Genes

The number of genes, the number of selected genes, and the number of falsely selected genes

FDR

False Discovery Rate
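One common way to obtain a permutation-based FDR, sketched below, is to permute the rows of the feature matrix so that any association with survival is broken, count the features that still pass the threshold, and divide by the number selected on the original data. This is an assumption about the general technique, not a description of the package's exact procedure; the helper `wald_p` and the simulated data are hypothetical, and the survival package is assumed to be available.

```r
library(survival)  # coxph() and Surv()

set.seed(2)
n <- 80; p <- 30
X <- matrix(rnorm(n * p), n, p)

# Hypothetical simulated data: features 1 and 2 affect the hazard
t.event <- rexp(n, rate = exp(X[, 1] - X[, 2]))
t.cens  <- rexp(n, rate = 0.3)
t.vec <- pmin(t.event, t.cens)
d.vec <- as.numeric(t.event <= t.cens)

# Wald P-value of each feature from a univariate Cox model
wald_p <- function(t.vec, d.vec, X) {
  apply(X, 2, function(x) {
    fit <- coxph(Surv(t.vec, d.vec) ~ x)
    z <- unname(coef(fit)) / sqrt(fit$var[1, 1])
    2 * pnorm(-abs(z))
  })
}

threshold <- 0.05
n.selected <- sum(wald_p(t.vec, d.vec, X) < threshold)

# Permute the rows of X to break the link between features and outcome
X.perm <- X[sample(n), , drop = FALSE]
n.false <- sum(wald_p(t.vec, d.vec, X.perm) < threshold)

# Estimated FDR: falsely selected / selected (guarding against division by zero)
FDR <- n.false / max(n.selected, 1)
```

A single permutation gives a noisy estimate; averaging `n.false` over several permutations would be a natural refinement.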

Author(s)

Takeshi Emura

References

Matsui S (2006). Predicting Survival Outcomes Using Subsets of Significant Genes in Prognostic Marker Studies with Microarrays. BMC Bioinformatics 7:156.

Emura T, Chen YH (2016). Gene Selection for Survival Data Under Dependent Censoring: a Copula-based Approach. Stat Methods Med Res 25(6):2840-57.

Witten DM, Tibshirani R (2010). Survival analysis with high-dimensional covariates. Stat Methods Med Res 19:29-51.

Examples

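library(compound.Cox)  # provides uni.selection() and the Lung data
data(Lung)
# Use the training samples only
t.vec=Lung$t.vec[Lung$train==TRUE]
d.vec=Lung$d.vec[Lung$train==TRUE]
# Drop columns 1-3 so that only the covariate columns remain
X.mat=Lung[Lung$train==TRUE,-c(1,2,3)]
# Select features with P-value < 0.05 and evaluate them by 5-fold cross-validation
uni.selection(t.vec, d.vec, X.mat, P.value=0.05,K=5)
## the outputs reproduce Table 3 of Emura and Chen (2016) ##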

compound.Cox documentation built on May 24, 2018, 5:03 p.m.