Higher order inference for one- and two-sample t-tests in high-dimensional data. Includes ordinary and moderated t-statistic and Welch t-test.
dat |
data matrix with rows corresponding to features. The number of
columns is the sample size and the number of rows is the number of tests. If the
number of tests is |
a |
treatment vector; the length has to correspond to the number of
columns in |
side |
the test can be one-sided or two-sided. For a one-sided test, the
values are |
type |
type of the test with possible values |
unbiased.mom |
|
alpha |
significance level. |
ncheck |
number of intervals for tail diagnostic. |
lim |
tail region for tail diagnostic. Provide the endpoints for the right tail (positive values). |
Unadjusted p-values are calculated for five orders of approximation for
ordinary and moderated (empirical Bayes) t-statistics; prior
information and moderated t-statistics are calculated with the limma
package. If the prior degrees of freedom is Inf, higher orders are
provided for the ordinary t-statistic only. In a two-sample test, when the
variances (and distributions) are not assumed to be equal and a Welch t-test is
performed, only results for the ordinary t-statistic are provided. Variance
adjustment is used for all orders (see the paper), so even
first-order results might differ slightly from the regular Student's
t-distribution approximation. When a first-order p-value (for the moderated
t-statistic, if relevant) is greater than the provided significance level
alpha, no higher-order inference is calculated.
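The first-order screen described above can be illustrated with base R alone. This is a minimal sketch, not the function's implementation: it assumes a one-sample, two-sided test and uses plain Student's t p-values as a stand-in for the first-order approximation, so it omits the variance adjustment mentioned above. The names row_mean, row_var, p_first, and needs_higher_order are hypothetical.

```r
# Assumption: plain Student's t p-values stand in for the first-order
# approximation; the documented function also applies a variance adjustment.
set.seed(1)
n <- 10        # sample size (columns)
m <- 1000      # number of tests (rows)
alpha <- 0.05  # significance level used for screening
dat <- matrix(rgamma(m * n, shape = 3) - 3, nrow = m)

# Row-wise ordinary one-sample t-statistics
row_mean <- rowMeans(dat)
row_var  <- rowSums((dat - row_mean)^2) / (n - 1)  # unbiased variance per row
t_stat   <- sqrt(n) * row_mean / sqrt(row_var)

# Two-sided first-order p-values and the rows that pass the screen
p_first <- 2 * pt(-abs(t_stat), df = n - 1)
needs_higher_order <- p_first <= alpha
table(needs_higher_order)
```

Only the rows flagged in needs_higher_order would be candidates for higher-order terms under this reading of the text.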
A tail diagnostic investigating Edgeworth expansion (EE) tail behavior is performed for each relevant feature (row of data); if the EE of a particular order is not determined to be helpful, the p-value of the previous order is provided in its place.
For better second-order performance, using unbiased.mom = TRUE is
recommended (default). For the variance estimate, the posterior variance is used for
the moderated t-statistic and the unbiased/pooled variance for the ordinary t-statistic.
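The two variance estimates named above can be sketched in base R for the two-sample case. The pooled variance is the standard equal-variance estimate; the posterior variance follows the usual empirical Bayes shrinkage form, with prior values d0 and s0sq that are hypothetical here (in practice they would be estimated, for example by limma). All variable names are illustrative assumptions.

```r
set.seed(2)
nx <- 10; ny <- 12; m <- 500
dat <- cbind(matrix(rnorm(m * ny), nrow = m),
             matrix(rnorm(m * nx), nrow = m))
treat <- rep(0:1, c(ny, nx))   # treatment vector, one entry per column

# Unbiased variance of each row of a matrix
row_var <- function(x) rowSums((x - rowMeans(x))^2) / (ncol(x) - 1)
v0 <- row_var(dat[, treat == 0])
v1 <- row_var(dat[, treat == 1])

# Pooled variance used with the ordinary two-sample t-statistic
df_resid   <- nx + ny - 2
pooled_var <- ((ny - 1) * v0 + (nx - 1) * v1) / df_resid

# Posterior variance used with the moderated t-statistic.
# d0 and s0sq are hypothetical prior values chosen for illustration only.
d0 <- 4; s0sq <- 1
post_var <- (d0 * s0sq + df_resid * pooled_var) / (d0 + df_resid)
head(cbind(pooled_var, post_var), 3)
```

Each posterior variance lies between the prior value s0sq and the corresponding pooled variance, which is the shrinkage effect of the moderated statistic.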
A matrix with the same number of rows as dat, each row
providing p-values for five orders of Edgeworth expansions (0 - 4-term
expansions) for the corresponding feature (row of data). Where applicable,
p-values are provided for both ordinary and moderated t-statistics (10
columns, five orders each); for the Welch t-test the matrix has five
columns, and if the prior degrees of freedom is Inf, only first-order
p-values are returned for the moderated t-statistic (six columns); note that
the variance adjustment r^2 is 1 in that case.
tailDiag for tail diagnostic, makeFx, Ftshort, and Ftgen for calculating Edgeworth
expansions of orders 1 to 5, and smpStats for extracting
statistics needed to calculate EE from a sample.
# simulate a data set
nx <- 10 # sample size
m <- 1e4 # number of tests
ns <- 0.05*m # number of significant features
dat <- matrix(rgamma(m*nx, shape = 3) - 3, nrow = m)
shifts <- runif(ns, 1, 5)
dat[1:ns, ] <- dat[1:ns, ] - shifts
# run
res <- empEdge(dat)
head(res, 3)
# one test (data not high-dimensional)
empEdge(dat[1, ], side = "left", unbiased.mom = FALSE, alpha = 0.1)
# Welch test
ny <- 12
dat2 <- cbind(matrix(rnorm(m*ny), nrow = m), dat)
treat <- rep(0:1, c(ny, nx))
res <- empEdge(dat2, treat, type = "Welch", ncheck = 50, lim = c(1, 10))
head(res, 3)
# prior degrees of freedom not finite
if (require(limma)) {
  d0 <- 0
  # regenerate data until limma estimates a non-finite prior df
  while (is.finite(d0)) {
    dat <- matrix(rnorm(m*nx), nrow = m)
    dat[1:ns, ] <- dat[1:ns, ] + shifts
    fit <- lmFit(dat, rep(1, nx))
    d0 <- ebayes(fit)$df.prior
  }
}
res <- empEdge(dat, side = "right")
head(res, 3)