wilcox.split: Wilcoxon rank sum statistic in cross-validation (CV) and...


Description

The function wilcox.split computes the Wilcoxon rank sum statistic for all niter CV or MCCV iterations defined by the matrix split.

Usage

wilcox.split(x,y,split,algo="new")

Arguments

x

a numeric vector of length n giving the expression levels of a gene for the n arrays.

y

a vector of length n giving the class membership for the n arrays. y can be either a factor or a numeric vector and must be coded as 0 and 1.

split

an niter x ntest matrix giving the indices of the ntest observations included in each of the niter test sets, as generated by the functions generate.split or generate.cv. The i-th row of split gives the indices of the observations included in the test data set for the i-th iteration.

algo

either "new" or "naive". If algo="new", the fast method described in Boulesteix (2007) is used to compute the Wilcoxon rank sum statistic. If algo="naive", the Wilcoxon rank sum statistics are obtained by running the function wilcox.test niter times.
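This page does not spell out how the fast method works. The following base R sketch shows one possible way to avoid re-ranking the training data at every iteration: rank x once on the full data, then correct each training rank by the number of test observations with a smaller value. It is an illustration only and not necessarily the method of Boulesteix (2007); the test indices are hypothetical and the data are assumed to contain no ties.

```r
# Sketch only: one way to avoid re-ranking; not necessarily the package's method.
set.seed(1)
x <- rnorm(20)                         # continuous values, so no ties
y <- rep(c(0, 1), length.out = 20)     # class labels coded as 0/1
test <- c(3, 7, 12, 18)                # hypothetical test-set indices
train <- setdiff(seq_along(x), test)

# Reference: re-rank the training observations from scratch
ref <- sum(rank(x[train])[y[train] == 0])

# Rank once on the full data, then subtract, for each training observation,
# the number of test observations with a smaller x value
full.rank <- rank(x)
n.smaller <- sapply(x[train], function(v) sum(x[test] < v))
adj.rank <- full.rank[train] - n.smaller
fast <- sum(adj.rank[y[train] == 0])

# Both approaches give the same rank sum
stopifnot(isTRUE(all.equal(ref, fast)))
```

Without ties, removing a test observation lowers by one the rank of every training observation with a larger value, which is why the correction reproduces the re-ranked result.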

Details

The Wilcoxon rank sum statistic is defined as the sum of the X-ranks of the observations with y=0. The Wilcoxon rank sum test is equivalent to the Mann-Whitney test. It is implemented in the function wilcox.test.
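As a small illustration of this definition, the base R sketch below computes the rank sum directly and relates it to the statistic W reported by wilcox.test, which is the rank sum of the first sample minus n0*(n0+1)/2, where n0 is the number of observations with y=0. The data are simulated and the variable names are chosen for this example only.

```r
set.seed(42)
x <- rnorm(30)
y <- rep(c(0, 1), times = c(12, 18))   # 12 observations with y = 0, 18 with y = 1

# Rank sum as defined above: ranks of x over all observations, summed over y == 0
rank.sum <- sum(rank(x)[y == 0])

# wilcox.test reports W = (rank sum of the first sample) - n0 * (n0 + 1) / 2
n0 <- sum(y == 0)
W <- unname(wilcox.test(x[y == 0], x[y == 1])$statistic)

# The two quantities differ only by the constant n0 * (n0 + 1) / 2
stopifnot(isTRUE(all.equal(rank.sum - n0 * (n0 + 1) / 2, W)))
```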

In the context of cross-validation (CV) or Monte-Carlo cross-validation (MCCV), wilcox.split computes the Wilcoxon rank sum statistic for each iteration. At each iteration, a subset of the n observations is excluded from the data set and used as the test data set. The indices of the observations forming the test set for each of the niter iterations are given in the niter x ntest matrix split.
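The per-iteration computation can be sketched in base R without the package. Here the split matrix is built with sample as a stand-in for generate.split, assuming that each row holds ntest distinct test indices; the rank sum is computed with rank rather than wilcox.test, so this is a rough equivalent of the naive approach and not the package's own code.

```r
set.seed(123)
n <- 30; niter <- 5; ntest <- 10
x <- rnorm(n)
y <- rep(c(0, 1), length.out = n)

# Stand-in for generate.split (assumption): an niter x ntest matrix whose
# i-th row holds the test indices of the i-th iteration
split <- t(replicate(niter, sample(n, ntest)))

# For each iteration, drop the test observations and compute the rank sum
# of the remaining observations with y == 0
stat <- apply(split, 1, function(test) {
  train <- setdiff(seq_len(n), test)
  sum(rank(x[train])[y[train] == 0])
})

stat   # numeric vector of length niter
```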

Value

A list with the following components:

wilcox.split

a numeric vector of length niter whose i-th component gives the Wilcoxon rank sum statistic obtained in the i-th iteration.

Author(s)

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/index.html)

References

A. L. Boulesteix (2007). WilcoxCV: an R package for fast variable selection in cross-validation. Bioinformatics 23:1702-1704.

See Also

wilcox.test, generate.split, generate.cv, wilcox.selection.split

Examples

# load WilcoxCV library
library(WilcoxCV)

# Generate data
x<-rnorm(100)
y<-sample(c(0,1),100,replace=TRUE)

# Generate 50 MCCV splits with ratio 2:1 for a data set including 90 observations
my.split<-generate.split(niter=50,n=90,ntest=30)

# Compute the Wilcoxon rank sum statistic for the 50 iterations.
wilcox.split(x=x,y=y,split=my.split,algo="new")

Example output

 [1] 1278 1233 1427 1217 1115 1353 1357 1183 1178 1294 1289 1131 1046 1167 1208
[16] 1212 1248 1360 1238 1338 1219 1187 1241 1417 1217 1113 1126 1246 1291 1280
[31] 1425 1267 1176 1352 1335 1164 1184 1271 1318 1047 1186 1072 1152 1168 1198
[46] 1211 1238 1320 1347 1313

WilcoxCV documentation built on May 2, 2019, 4:16 a.m.