

This function performs univariate feature selection using significance tests (Wald tests or score tests) based on association between individual features and survival. Features are selected if their P-values are less than a given threshold (P.value).

```
uni.selection(t.vec, d.vec, X.mat, P.value=0.001, K=10, score=TRUE, d0=0,
              randomize=FALSE, CC.plot=FALSE, permutation=FALSE, M=200)
```

| Argument | Description |
|---|---|
| `t.vec` | Vector of survival times (time to either death or censoring) |
| `d.vec` | Vector of censoring indicators (1=death, 0=censoring) |
| `X.mat` | n by p matrix of covariates, where n is the sample size and p is the number of covariates |
| `P.value` | A threshold for selecting features |
| `K` | The number of cross-validation folds |
| `score` | If TRUE, the score tests are used; if not, the Wald tests are used |
| `d0` | A positive constant to stabilize the variance of score statistics (Witten & Tibshirani 2010) |
| `randomize` | If TRUE, randomize patient IDs before cross-validation |
| `CC.plot` | If TRUE, the compound covariate (CC) predictors are plotted |
| `permutation` | If TRUE, the FDR is computed by a permutation method (Witten & Tibshirani 2010; Emura et al. 2018) |
| `M` | The number of permutations used to calculate the FDR |
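The selection rule described above (keep each feature whose univariate test has a P-value below the threshold) can be sketched with the `survival` package alone. This is a minimal illustration on simulated data, not the package's internal code: it uses Wald tests only, and the simulated data, the 0.05 threshold, and the names `wald` and `selected` are assumptions made for the example.

```r
library(survival)

# Simulated data (an assumption for illustration): only gene1 affects survival
set.seed(1)
n <- 100
p <- 20
X.mat <- matrix(rnorm(n * p), n, p,
                dimnames = list(NULL, paste0("gene", seq_len(p))))
t.true <- rexp(n, rate = exp(1.0 * X.mat[, 1]))  # event times
c.time <- rexp(n, rate = 0.2)                    # censoring times
t.vec <- pmin(t.true, c.time)
d.vec <- as.numeric(t.true <= c.time)            # 1 = death, 0 = censoring

# Fit one Cox model per feature and collect the Wald statistic
wald <- t(sapply(seq_len(p), function(j) {
  fit <- coxph(Surv(t.vec, d.vec) ~ X.mat[, j])
  beta <- unname(coef(fit))
  z <- beta / sqrt(fit$var[1, 1])                # Wald Z-value
  c(beta = beta, Z = z, P = 2 * pnorm(-abs(z)))  # two-sided P-value
}))
rownames(wald) <- colnames(X.mat)

# Keep features below the threshold, ordered by P-value
selected <- wald[wald[, "P"] < 0.05, , drop = FALSE]
selected <- selected[order(selected[, "P"]), , drop = FALSE]
print(selected)
```

The columns `beta`, `Z`, and `P` play the same role as the components of the same names returned by `uni.selection`, although the package's own values may differ, in particular when score tests are used.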

The cross-validated likelihood (CVL) value is computed for the selected features (Matsui 2006; Emura et al. 2018-). A higher CVL value corresponds to better predictive ability of the selected features, so the CVL value can be used to find the optimal set of features. The CVL value is computed by K-fold cross-validation, where the number of folds K is chosen by the user. The false discovery rate (FDR) is also computed, by a formula and, if `permutation=TRUE`, by a permutation test. RCVL1 and RCVL2 are "re-substitution" CVL values that provide upper control limits for the CVL value. If the CVL value is less than both RCVL1 and RCVL2, the CVL value is considered in-control. If the CVL value exceeds either RCVL1 or RCVL2, the CVL may be computed again after changing the sample allocation.
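To make the idea of a K-fold cross-validated partial likelihood concrete, the following is a rough sketch. It assumes a generic definition (for each fold, the full-data partial log-likelihood minus the training-data partial log-likelihood, both evaluated at the coefficient fitted on the training data) applied to a compound covariate built from univariate Cox coefficients. This may not match the exact CVL, RCVL1, or RCVL2 definitions used by the package; the helper names `pl_at` and `uni_beta` and the simulated data are hypothetical.

```r
library(survival)

# Simulated data (an assumption): all p columns are treated as already selected
set.seed(2)
n <- 120
p <- 10
K <- 5
X <- matrix(rnorm(n * p), n, p)
t.true <- rexp(n, rate = exp(0.8 * X[, 1] - 0.8 * X[, 2]))
c.time <- rexp(n, rate = 0.3)
t.vec <- pmin(t.true, c.time)
d.vec <- as.numeric(t.true <= c.time)

# Partial log-likelihood of a one-covariate Cox model at a fixed coefficient b
# (iter.max = 0 stops coxph at the initial value)
pl_at <- function(time, status, cc, b) {
  coxph(Surv(time, status) ~ cc, init = b,
        control = coxph.control(iter.max = 0))$loglik[1]
}

# Univariate Cox coefficient for each column of X
uni_beta <- function(time, status, X) {
  apply(X, 2, function(x) unname(coef(coxph(Surv(time, status) ~ x))))
}

fold <- rep(seq_len(K), length.out = n)  # fixed fold allocation
cvl <- 0
for (k in seq_len(K)) {
  train <- fold != k
  # Compound covariate from coefficients estimated on the training folds only
  b_uni <- uni_beta(t.vec[train], d.vec[train], X[train, , drop = FALSE])
  cc <- as.vector(X %*% b_uni)
  cc_train <- cc[train]
  b_cc <- unname(coef(coxph(Surv(t.vec[train], d.vec[train]) ~ cc_train)))
  # Contribution of the held-out fold to the partial log-likelihood
  cvl <- cvl + pl_at(t.vec, d.vec, cc, b_cc) -
    pl_at(t.vec[train], d.vec[train], cc_train, b_cc)
}
print(cvl)
```

Shuffling `fold` before the loop would correspond to changing the sample allocation, which is what the `randomize` argument is described as doing.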

| Component | Description |
|---|---|
| `gene` | Gene symbols |
| `beta` | Estimated regression coefficients |
| `Z` | Z-values for significance tests |
| `P` | P-values for significance tests |
| `CVL` | The value of CVL, RCVL1, and RCVL2 (Emura et al. 2018-) |
| `Genes` | The number of genes, the number of selected genes, and the number of falsely selected genes |
| `FDR` | False discovery rate (by a formula or a permutation method) |

Takeshi Emura

Emura T, Matsui S, Chen HY (2018-). compound.Cox: Univariate Feature Selection and Compound Covariate for Predicting Survival, Computer Methods and Programs in Biomedicine, to appear.

Matsui S (2006). Predicting Survival Outcomes Using Subsets of Significant Genes in Prognostic Marker Studies with Microarrays. BMC Bioinformatics 7:156.

Witten DM, Tibshirani R (2010). Survival analysis with high-dimensional covariates. Stat Methods Med Res 19:29-51.


```
Loading required package: numDeriv
Loading required package: survival
$beta
ANXA5 DLG2 ZNF264 DUSP6 CPEB4 LCK STAT1
-1.0876762 1.3215044 0.5473276 0.7524497 0.5891676 -0.8447389 -0.5844262
RNF4 IRF4 STAT2 HGF ERBB3 NF1 FRAP1
0.6463635 0.5176704 0.5849869 0.5086750 0.5509026 0.4715235 -0.7696768
MMD HMMR
0.9151541 0.5156711
$Z
ANXA5 DLG2 ZNF264 DUSP6 CPEB4 LCK STAT1 RNF4
-2.885540 2.872880 2.654412 2.628478 2.404015 -2.384028 -2.329287 2.290596
IRF4 STAT2 HGF ERBB3 NF1 FRAP1 MMD HMMR
2.171948 2.155568 2.127643 2.126139 2.074913 -2.045298 2.034407 1.976606
$P
ANXA5 DLG2 ZNF264 DUSP6 CPEB4 LCK
0.003907424 0.004067486 0.007944666 0.008576790 0.016216117 0.017124302
STAT1 RNF4 IRF4 STAT2 HGF ERBB3
0.019843870 0.021986777 0.029859561 0.031117422 0.033366690 0.033491656
NF1 FRAP1 MMD HMMR
0.037994593 0.040825466 0.041910555 0.048086199
$CVL
CVL RCVL1 RCVL2
-96.00449 -83.71309 -85.32446
$Genes
No. of genes No. of selected genes
97 16
$FDR
P.value * (No. of genes)
0.303125
Warning messages:
1: In fitter(X, Y, strats, offset, init, control, weights = weights, :
Loglik converged before variable 1 ; beta may be infinite.
2: In fitter(X, Y, strats, offset, init, control, weights = weights, :
Loglik converged before variable 1 ; beta may be infinite.
3: In fitter(X, Y, strats, offset, init, control, weights = weights, :
Loglik converged before variable 1 ; beta may be infinite.
```
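The `$FDR` value in the output above can be checked by hand. The call that produced the output is not shown on this page, so the threshold of 0.05 used below is an assumption; it is consistent with every listed P-value being below 0.05, and it reproduces the printed value when the expected number of false selections, `P.value * (No. of genes)`, is divided by the number of selected genes.

```r
p_threshold <- 0.05  # assumed threshold; the original call is not shown
n_genes <- 97        # "No. of genes" in the output
n_selected <- 16     # "No. of selected genes" in the output

# Expected number of falsely selected genes under the null
expected_false <- p_threshold * n_genes

# Formula-based FDR: expected false selections over actual selections
fdr <- expected_false / n_selected
print(c(expected_false = expected_false, FDR = fdr))
```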

compound.Cox documentation built on July 21, 2018, 5:01 p.m.
