RepeatRanking: Repeat the ranking procedure for altered data sets

Description

Altered data sets are typically prepared by calls to GenerateFoldMatrix or GenerateBootMatrix. The ranking procedure is then repeated for each of these new 'artificial' data sets. One major goal of this procedure is to examine the stability of the results obtained with the original dataset.

Usage

RepeatRanking(R, P, scheme=c("subsampling", "labelexchange"), iter=10,
                              varlist = list(genewise=FALSE, factor=1/5), ...)

Arguments

R

The original ranking, represented by an object of class GeneRanking.

P

An object of class FoldMatrix or BootMatrix as generated by GenerateFoldMatrix or GenerateBootMatrix, respectively.
Can also be missing. In this case, the original data set is perturbed by adding Gaussian noise; see argument varlist.

scheme

Used only if P is a FoldMatrix. Can be "subsampling" or "labelexchange". "subsampling" means that observations are removed as determined by the slot foldmatrix. "labelexchange" means that the observations which would otherwise be removed are kept in the sample, but are assigned to the opposite class.

iter

Used only if P is missing. Specifies the number of different noise-perturbed data sets to be created. The default is 10.

varlist

Used only if P is missing. A list with two components, genewise (a logical) and factor (a positive real number), both controlling the variance of the added noise. If genewise=FALSE (the default), the noise has the same variance for all genes, estimated by pooled variance estimation from the original data set. Otherwise, the noise variance differs from gene to gene and is estimated genewise from the original data set. factor is the fraction of the estimated variance(s) to be used as the variance of the added noise. The default value is 1/5; the value is usually smaller than 1.

...

Further arguments to be passed to the ranking method from which rankings are generated.
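
To make the two values of scheme concrete, here is a minimal base R sketch of what a single leave-one-out iteration does to the data under each scheme. It does not call GeneSelector and is not the package's internal code; the toy data, the index flagged, and the helper tstat are hypothetical names introduced only for illustration.

```r
# Illustrative base R only; NOT GeneSelector's implementation.
set.seed(1)
y <- rep(c(0, 1), each = 4)              # hypothetical class labels, 8 observations
x <- matrix(rnorm(5 * 8), nrow = 5)      # hypothetical expression matrix: 5 genes x 8 observations

flagged <- 3                             # observation singled out in this iteration (assumption)

# "subsampling": the flagged observation is removed from the data
x_sub <- x[, -flagged, drop = FALSE]
y_sub <- y[-flagged]

# "labelexchange": the flagged observation is kept, but moved to the opposite class
x_ex <- x
y_ex <- y
y_ex[flagged] <- 1 - y_ex[flagged]

# Hypothetical helper: equal-variance two-sample t-statistic for every gene (row)
tstat <- function(x, y) {
  unname(apply(x, 1, function(g) {
    t.test(g[y == 1], g[y == 0], var.equal = TRUE)$statistic
  }))
}

# Rank genes by absolute t-statistic under each altered data set (rank 1 = largest)
rank_sub <- rank(-abs(tstat(x_sub, y_sub)))
rank_ex  <- rank(-abs(tstat(x_ex, y_ex)))
```

Comparing rank_sub and rank_ex with the ranking from the unaltered x and y gives a rough impression of how sensitive the ranking is to one observation.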

Value

An object of class RepeatedRanking.

Author(s)

Martin Slawski
Anne-Laure Boulesteix

See Also

GeneRanking, RepeatedRanking, RankingTstat, RankingFC, RankingWelchT, RankingWilcoxon, RankingBaldiLong, RankingFoxDimmic, RankingLimma, RankingEbam, RankingWilcEbam, RankingSam, RankingShrinkageT, RankingSoftthresholdT, RankingPermutation

Examples

## Load toy gene expression data
data(toydata)
### class labels
yy <- toydata[1,]
### gene expression
xx <- toydata[-1,]
### Get ranking for the original data set, with the ordinary t-statistic
ordT <- RankingTstat(xx, yy, type="unpaired")
### Generate the leave-one-out / exchange-one-label matrix
loo <- GenerateFoldMatrix(y = yy, k=1)
### Repeat the ranking with the t-statistic, using the leave-one-out scheme
loor_ordT <- RepeatRanking(ordT, loo)
### .. or the label exchange scheme
ex1r_ordT <- RepeatRanking(ordT, loo, scheme = "labelexchange")
### Generate the bootstrap matrix
boot <- GenerateBootMatrix(y = yy, maxties=3, minclassize=5, repl=30)
### Repeat ranking with the t-statistic for bootstrap replicates
boot_ordT <- RepeatRanking(ordT, boot)
### Repeat the ranking procedure for an altered data set with added noise
noise_ordT <- RepeatRanking(ordT, varlist=list(genewise=TRUE, factor=1/10))
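
The noise perturbation in the last call can be pictured with the following base R sketch. It is an illustration under stated assumptions, not the package's code: the per-gene variance is taken to be the plain sample variance of each row, the pooled variance is taken to be the mean of those, and noise_factor and add_noise are hypothetical names (noise_factor plays the role of the factor component of varlist). The package may estimate the variances differently.

```r
# Illustrative base R only; NOT GeneSelector's implementation.
set.seed(1)
x <- matrix(rnorm(4 * 10), nrow = 4)     # hypothetical expression matrix: 4 genes x 10 observations
noise_factor <- 1/5                      # stands in for varlist$factor (assumption)

# Assumed variance estimates: one per gene, and one pooled over all genes
genewise_var <- apply(x, 1, var)
pooled_var   <- mean(genewise_var)

# Hypothetical helper: add zero-mean Gaussian noise with the given variance(s).
# noise_var is either a single number or one value per gene (row).
add_noise <- function(x, noise_var) {
  # matrix() fills column by column, so row i receives the i-th standard deviation
  sd_mat <- matrix(sqrt(noise_var), nrow = nrow(x), ncol = ncol(x))
  x + matrix(rnorm(length(x), mean = 0, sd = as.vector(sd_mat)), nrow = nrow(x))
}

# genewise = FALSE: the same noise variance for every gene
x_pooled <- add_noise(x, noise_factor * pooled_var)

# genewise = TRUE: a separate noise variance for each gene
x_genewise <- add_noise(x, noise_factor * genewise_var)
```

Repeating the add_noise step iter times and re-ranking each perturbed matrix mirrors, in spirit, what happens when P is missing.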

GeneSelector documentation built on May 1, 2019, 11:35 p.m.