Description Usage Arguments Details Value Note Author(s) References See Also Examples
Performs two-sample nonparametric multivariate test of variance based on the minimum spanning tree (MST). It calculates the mean deviation between the cumulative distribution functions (CDFs) of sample ranks in two conditions. It tests the null hypothesis that a set of features has the same variance (scale) in two conditions versus different variances.
1 |
object |
a numeric matrix with columns and rows respectively corresponding to samples and features. |
group |
a numeric vector indicating group associations for samples. Possible values are 1 and 2. |
mst.order |
numeric value to indicate the consideration of the union
of the first |
nperm |
number of permutations used to estimate the null distribution of the test statistic. If not given, a default value 1000 is used. |
pvalue.only |
logical. If |
This function tests the null hypothesis that a set of features has
the same scale in two conditions. It performs a two-sample nonparametric
multivariate test based on the minimum spanning tree (MST) as proposed by
Friedman and Rafsky (1979). The MST of the weighted undirectional graph
created from the samples is found. The nodes of the MST are ranked based on
their position in the MST. The MST is rooted at the node with smallest geodisic
distance (rank 1) and nodes with higher depths from the root are assigned
higher ranks. The mean deviation between the cumulative distribution functions
(CDFs) of sample ranks in two conditions is calculated. The null distribution
of the test statistic is estimated by permuting sample labels nperm
times and calculating the test statistic for each. P-value is calculated as
p.value = \frac{∑_{k=1}^{nperm} I ≤ft[ |D_{k}| ≥q |D_{obs}| \right] + 1}{nperm + 1}
where D_{k} is the test statistic for permutation k
, D_{obs} is the
observed test statistic, and I
is the indicator function.
When pvalue.only=TRUE
(default), function RMDtest
returns
the p-value indicating the attained significance level. When
pvalue.only=FALSE
, function RMDtest
produces a list of
length 3 with the following components:
statistic |
the value of the observed test statistic. |
perm.stat |
numeric vector of the resulting test statistic for
|
p.value |
p-value indicating the attained significance level. |
The variance of both the Poisson and negative Bionomial distributions, used
to model count data, is a function of their mean. Therefore, using the radial
mean deviation test (RMDtest
) to detect pathways with differential
variance for RNA-Seq counts is not recommended without proper data
normalization.
Yasir Rahmatallah and Galina Glazko
Rahmatallah Y., Emmert-Streib F. and Glazko G. (2012) Gene set analysis for self-contained tests: complex null and specific alternative hypotheses. Bioinformatics 28, 3073–3080.
Friedman J. and Rafsky L. (1979) Multivariate generalization of the Wald-Wolfowitz and Smirnov two-sample tests. Ann. Stat. 7, 697–717.
RKStest
, WWtest
, MDtest
,
KStest
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 | ## generate a feature set of length 20 in two conditions
## each condition has 20 samples
## use multivariate normal distribution
library(MASS)
ngenes <- 20
nsamples <- 40
## let the mean vector have zeros of length 20 for both conditions
zero_vector <- array(0,c(1,ngenes))
## set the covariance matrix to be an identity matrix for condition 1
cov_mtrx <- diag(ngenes)
gp1 <- mvrnorm((nsamples/2), zero_vector, cov_mtrx)
## set some scale difference in the covariance matrix for condition 2
cov_mtrx <- cov_mtrx*3
gp2 <- mvrnorm((nsamples/2), zero_vector, cov_mtrx)
## combine the data of two conditions into one dataset
gp <- rbind(gp1,gp2)
dataset <- aperm(gp, c(2,1))
## first 20 samples belong to group 1
## second 20 samples belong to group 2
pvalue <- RMDtest(object=dataset, group=c(rep(1,20),rep(2,20)))
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.