AggrFtest: Aggregated F-Test of Variance Using Fisher's Probability...



Description

Performs a two-sample nonparametric test of variance. The univariate F-test is applied to every gene in the gene set, and the resulting p-values are aggregated using Fisher's probability combining method to form the test statistic. The null distribution of the test statistic is estimated by permuting the sample labels and recalculating the test statistic a large number of times. The test evaluates the null hypothesis that no gene shows a significant difference in variance between the two conditions against the alternative hypothesis that at least one gene does, according to the F-test.

Usage

AggrFtest(object, group, nperm=1000, pvalue.only=TRUE)

Arguments

object

a numeric matrix whose columns correspond to samples and whose rows correspond to features (genes).

group

a numeric vector indicating group associations for samples. Possible values are 1 and 2.

nperm

a numeric value indicating the number of permutations used to estimate the null distribution of the test statistic. If not given, the default value of 1000 is used.

pvalue.only

logical. If TRUE (default), only the p-value is returned. If FALSE, a list of length three is returned, containing the observed statistic, the vector of permuted statistics, and the p-value.
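The expected orientation of object (genes in rows, samples in columns) is easy to get wrong when data are read in as a samples-by-genes table. The following is a minimal sketch of preparing and checking the inputs before calling AggrFtest. The helper check_inputs and the variables expr, object and group are hypothetical names used for illustration only; they are not part of GSAR.

```r
## Hypothetical helper (not part of GSAR): basic sanity checks on the inputs
check_inputs <- function(object, group) {
  ## object must be a numeric matrix
  stopifnot(is.matrix(object), is.numeric(object))
  ## one group label per sample (column)
  stopifnot(length(group) == ncol(object))
  ## only the labels 1 and 2 are allowed
  stopifnot(all(group %in% c(1, 2)))
  invisible(TRUE)
}

## toy samples-by-genes table: 6 samples (rows), 3 genes (columns)
expr <- data.frame(g1 = rnorm(6), g2 = rnorm(6), g3 = rnorm(6))
## transpose so that rows are genes and columns are samples
object <- t(as.matrix(expr))
## first three samples in group 1, last three in group 2
group <- c(1, 1, 1, 2, 2, 2)
ok <- check_inputs(object, group)
```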

Details

This function tests the null hypothesis that none of the genes in a gene set shows a significant difference in variance between two conditions, according to the F-test, against the alternative hypothesis that at least one gene does. It performs a two-sample nonparametric test of variance in three steps: it applies the univariate F-test to every gene in the set, adjusts the resulting p-values for multiple testing using the Benjamini and Hochberg method (also known as FDR; Benjamini and Hochberg, 1995), and then aggregates the adjusted p-values using Fisher's probability combining method to obtain a test statistic (T) for the gene set

T = -2 \sum_{i=1}^{p} \log_{e} (p_{i})

where p_{i} is the adjusted p-value of the univariate F-test for gene i and p is the number of genes in the set. The null distribution of the test statistic is estimated by permuting the sample labels nperm times and calculating the test statistic T for each permutation. The p-value is calculated as

p.value = \frac{\sum_{k=1}^{nperm} I \left[ T_{k} \geq T_{obs} \right] + 1}{nperm + 1}

where T_{k} is the test statistic for permutation k, T_{obs} is the observed test statistic, and I is the indicator function.
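The procedure described above can be sketched in base R as follows. This is an illustrative re-implementation under stated assumptions, not the GSAR source code: fisher_stat and aggr_ftest_sketch are hypothetical helper names, the per-gene test is assumed to be the two-sided var.test from the stats package, and the package itself may differ in details such as how the F-test is set up.

```r
## Hypothetical helper: Fisher's combined statistic T for one labelling
fisher_stat <- function(object, group) {
  x1 <- object[, group == 1, drop = FALSE]
  x2 <- object[, group == 2, drop = FALSE]
  ## F-test of equal variances for each gene (row); two-sided by assumption
  pvals <- vapply(seq_len(nrow(object)), function(i) {
    var.test(x1[i, ], x2[i, ])$p.value
  }, numeric(1))
  ## Benjamini-Hochberg adjustment of the per-gene p-values
  padj <- p.adjust(pvals, method = "BH")
  ## Fisher's probability combining method: T = -2 * sum(log(p_i))
  -2 * sum(log(padj))
}

## Hypothetical helper: permutation test built on fisher_stat()
aggr_ftest_sketch <- function(object, group, nperm = 1000) {
  t_obs <- fisher_stat(object, group)
  ## null distribution: recompute T after shuffling the sample labels
  t_perm <- replicate(nperm, fisher_stat(object, sample(group)))
  ## permutation p-value with the +1 correction in numerator and denominator
  p_value <- (sum(t_perm >= t_obs) + 1) / (nperm + 1)
  list(statistic = t_obs, perm.stat = t_perm, p.value = p_value)
}

## toy data: 20 genes, 20 samples per group, variance 3 in group 2
set.seed(1)
ngenes <- 20
x <- cbind(matrix(rnorm(ngenes * 20), ngenes),
           matrix(rnorm(ngenes * 20, sd = sqrt(3)), ngenes))
grp <- c(rep(1, 20), rep(2, 20))
res <- aggr_ftest_sketch(x, grp, nperm = 200)
```

Because the labels are permuted at the sample level, the correlation structure among genes is preserved in every permuted dataset, which is why the null distribution of T is estimated empirically rather than taken from a chi-squared reference distribution.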

Value

When pvalue.only=TRUE (default), function AggrFtest returns the p-value indicating the attained significance level. When pvalue.only=FALSE, function AggrFtest produces a list of length 3 with the following components:

statistic

the value of the observed test statistic.

perm.stat

numeric vector of the resulting test statistic for nperm random permutations of sample labels.

p.value

p-value indicating the attained significance level.
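To show how the three components relate, the sketch below recomputes the p-value from statistic and perm.stat using the formula given in Details. The list res is a hand-built stand-in with made-up numbers, not output from AggrFtest; in practice it would come from a call with pvalue.only=FALSE.

```r
## Stand-in for a result list with the documented components
## (values are invented for illustration)
res <- list(statistic = 95.2,
            perm.stat = c(40.1, 97.3, 52.8, 61.0, 33.5, 95.2, 48.9, 70.4, 58.6),
            p.value = NA)

## number of permuted statistics at least as large as the observed one
n_extreme <- sum(res$perm.stat >= res$statistic)

## permutation p-value: (count + 1) / (nperm + 1)
res$p.value <- (n_extreme + 1) / (length(res$perm.stat) + 1)
```

Keeping perm.stat is also useful for inspecting the estimated null distribution, for example with hist(res$perm.stat).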

Author(s)

Yasir Rahmatallah and Galina Glazko

References

Benjamini Y. and Hochberg Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society Series B 57, 289–300.

See Also

RKStest, RMDtest.

Examples

## generate a feature set of length 20 in two conditions
## each condition has 20 samples
## use multivariate normal distribution
library(MASS)
ngenes <- 20
nsamples <- 40
## let the mean vector have zeros of length 20 for both conditions
zero_vector <- array(0,c(1,ngenes))
## set the covariance matrix to be an identity matrix for condition 1
cov_mtrx <- diag(ngenes)
gp1 <- mvrnorm((nsamples/2), zero_vector, cov_mtrx)
## set some scale difference in the covariance matrix for condition 2
cov_mtrx <- cov_mtrx*3
gp2 <- mvrnorm((nsamples/2), zero_vector, cov_mtrx)
## combine the data of two conditions into one dataset
gp <- rbind(gp1,gp2)
dataset <- aperm(gp, c(2,1))
## first 20 samples belong to group 1
## second 20 samples belong to group 2
pvalue <- AggrFtest(object=dataset, group=c(rep(1,20),rep(2,20)))

GSAR documentation built on Nov. 8, 2020, 7:16 p.m.