smpStats: Sample statistics for Edgeworth expansions


smpStats R Documentation

Sample statistics for Edgeworth expansions

Description

Calculate sample statistics needed for Edgeworth expansions.

Usage

smpStats(
  smp,
  a = NULL,
  type = NULL,
  unbiased.mom = TRUE,
  moder = FALSE,
  d0 = NULL,
  s20 = NULL,
  varpost = NULL
)

Arguments

smp

sample.

a

vector of the same length as smp specifying categories of observations (should contain two unique values). Treatment code is assumed to have a higher numeric value than control (relevant for type = "Welch").

type

type of the test, with possible values "one-sample", "two-sample", and "Welch". For regular one- and two-sample tests the value is inferred from a, but for the Welch t-test it must be specified explicitly.

unbiased.mom

logical value indicating if unbiased estimators for third through sixth central moments should be used.

moder

logical value indicating if Edgeworth expansions for a moderated t-statistic will be used. If TRUE, prior information (d0 and s20) and posterior variance (varpost) should be provided.

d0

prior degrees of freedom (needed if moder = TRUE).

s20

prior value for variance (needed if moder = TRUE).

varpost

posterior variance (needed if moder = TRUE).
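
The Examples section takes d0, s20, and varpost from a limma empirical Bayes fit (df.prior, s2.prior, and s2.post). As a rough base-R sketch of how these three quantities relate, assuming the usual limma shrinkage formula and a one-sample design with n - 1 residual degrees of freedom, the posterior variance is a weighted average of the prior and sample variances. The prior values below are made up for illustration, not estimated from data.

```r
# Hypothetical prior values; in practice they would come from limma::eBayes()
d0  <- 4      # prior degrees of freedom (assumed for illustration)
s20 <- 1.5    # prior variance (assumed for illustration)

set.seed(1)
n   <- 10
smp <- rgamma(n, shape = 3) - 3

# Residual degrees of freedom and ordinary sample variance
d  <- n - 1
s2 <- var(smp)

# Posterior variance: weighted average of prior and sample variances
varpost <- (d0 * s20 + d * s2) / (d0 + d)

# The posterior variance lies between the prior and sample variances
c(s2 = s2, s20 = s20, varpost = varpost)
```

With the package loaded, these values would be passed on as smpStats(smp, moder = TRUE, d0 = d0, s20 = s20, varpost = varpost).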

Value

A named vector of sample statistics to be used in Edgeworth expansions. The calculated statistics and corresponding names are:

  • for ordinary one-sample t-statistic: scaled cumulants named "lam3", "lam4", "lam5", "lam6";

  • for moderated one-sample t-statistic: central moment estimates named "mu2", "mu3", "mu4", "mu5", "mu6", A, B, and prior degrees of freedom named "d0";

  • for ordinary two-sample t-statistic: central moment estimates, repeated twice because the same distribution is assumed for both groups, named "mu_x2", "mu_x3", "mu_x4", "mu_x5", "mu_x6" and "mu_y2", "mu_y3", "mu_y4", "mu_y5", "mu_y6", as well as A, B_x, B_y, b_x, and b_y;

  • for moderated two-sample t-statistic: estimates of the same quantities as for ordinary t (with different estimators); additionally, prior degrees of freedom named "d0" is included;

  • for Welch t-test: estimates of the same quantities as for ordinary t-statistic (with different estimators). In this case, central moment estimates for treatment and control groups are different.
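
To make the one-sample output more concrete, the sketch below computes standardized cumulants of orders 3 to 6 from sample central moments in base R. It assumes that "scaled cumulants" means cumulants divided by the matching power of the standard deviation, and it uses simple plug-in (biased) moment estimators, so its numbers need not agree with what smpStats() returns. The helper names central_moment and scaled_cumulants are hypothetical and not part of the package.

```r
# Hypothetical helper: j-th sample central moment (biased, divides by n)
central_moment <- function(x, j) mean((x - mean(x))^j)

# Hypothetical helper: standardized cumulants of orders 3 to 6
scaled_cumulants <- function(x) {
  mu <- sapply(2:6, function(j) central_moment(x, j))
  names(mu) <- paste0("mu", 2:6)
  # Cumulants expressed through central moments
  k3 <- mu[["mu3"]]
  k4 <- mu[["mu4"]] - 3 * mu[["mu2"]]^2
  k5 <- mu[["mu5"]] - 10 * mu[["mu3"]] * mu[["mu2"]]
  k6 <- mu[["mu6"]] - 15 * mu[["mu4"]] * mu[["mu2"]] -
    10 * mu[["mu3"]]^2 + 30 * mu[["mu2"]]^3
  # Scale each cumulant by the corresponding power of the standard deviation
  sigma <- sqrt(mu[["mu2"]])
  c(lam3 = k3 / sigma^3, lam4 = k4 / sigma^4,
    lam5 = k5 / sigma^5, lam6 = k6 / sigma^6)
}

set.seed(1)
lam <- scaled_cumulants(rlnorm(10, sdlog = 0.6))
lam
```

For a symmetric sample the odd-order values vanish, which gives a quick sanity check of the formulas.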

See Also

tailDiag, makeFx, and makeQx for functions that require a stats argument corresponding to the output of smpStats().

Examples

# simulate sample - one-sample test
n <- 10
smp <- rlnorm(n, sdlog = 0.6)
stats <- smpStats(smp)
stats
# ordinary one-sample t-statistic for testing a zero mean
t <- sqrt(n)*mean(smp)/sd(smp)
tailDiag(stats, n)
Ft <- makeFx(stats, n, base = "t")
Ft(t)

# two-sample test
n2 <- 8
smp2 <- c(smp, rnorm(n2))
a <- rep(0:1, c(n, n2))
smpStats(smp2, a, unbiased.mom = FALSE)
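
# Welch t-test (a sketch reusing smp2 and a from above): the type cannot be
# inferred from a, so it is set explicitly; the group coded 1 has the higher
# numeric value and is therefore taken as treatment
smpStats(smp2, a, type = "Welch")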

# moderated t-statistic
if (requireNamespace("limma", quietly = TRUE)) {
  # simulate high-dimensional data
  m  <- 1e4          # number of tests
  ns <- 0.05*m       # number of significant features
  dat <- matrix(rgamma(m*n, shape = 3) - 3, nrow = m)
  shifts <- runif(ns, 1, 5)
  dat[1:ns, ] <- dat[1:ns, ] - shifts
  # estimate prior information
  fit <- limma::lmFit(dat, rep(1, n))
  fbay <- limma::eBayes(fit)
  # look at one feature (row of data)
  i <- 625
  smpStats(dat[i, ], moder = TRUE, d0 = fbay$df.prior, s20 = fbay$s2.prior, 
           varpost = fbay$s2.post[i])
}
  

innager/edgee documentation built on April 24, 2024, 8:14 p.m.