rotateStat: Generate data rotations and calculate statistics on it

View source: R/randRot.R

rotateStatR Documentation

Generate data rotations and calculate statistics on it

Description

This function generates rotations of data and calculates the provided statistic on each rotation and the non-rotated (original) data. This is the central function of the package.

Usage

rotateStat(
  initialised.obj,
  R = 10,
  statistic,
  ...,
  parallel = FALSE,
  BPPARAM = BiocParallel::bpparam()
)

Arguments

initialised.obj

An initialised random rotation object as returned by initRandrot and initBatchRandrot.

R

The number of resamples/rotations. Single numeric larger than 1.

statistic

A function which takes a data matrix (same dimensions as Y - see also initRandrot) as first argument and returns a statistic of interest. Any further arguments are passed to it with the ... argument. We highly recommend using pivotal quantities as statistic if possible (see also Details in pFdr). Note that pFdr considers larger values of statistics as more significant, so one-tailed tests may require reversal of the sign and two-tailed tests may require taking absolute values, see Examples. The results of statistic for each resample are finally combined with as.matrix and cbind, so ensure that statistic returns either a vector or a matrix. Results with multiple columns are possible and handled adequately in subsequent functions (e.g. pFdr). Note that statistic must not necessarily be of the same length as nrow(Y), but can also be e.g. a summary statistic of genes (like in gene set testing).

...

Further named arguments for statistic which are passed unchanged each time it is called. Avoid partial matching to arguments of rotateStat. See also the Examples.

parallel

logical if parallel computation should be performed, see details for use of parallel computing.

BPPARAM

An optional BiocParallelParam instance, see documentation of BiocParallel package of Bioconductor.

Details

The function takes an initialised randrot object (initRandrot) and a function that calculates a statistic on the data. The statistic function thereby takes the a matrix Y as first argument. Any further arguments are passed to it by ....

Together with pFdr, this function implements the workflow described in \insertCiteHettegger2021randRotation.

Be aware that only data is rotated (see also randrot), so any additional information including weights, X etc. need to be provided to statistic. See also package vignette and Examples.

Parallel processing is implemented with the BiocParallel package of Bioconductor. The default argument BiocParallel::bpparam() for BPPARAM returns the registered default backend. See package documentation for further information and usage options. If parallel = TRUE the function calls in statistic need to be called explicitly with package name and "::". So e.g. calling lmFit from the limma package is done with limma::lmFit(...), see also the examples in the package vignette.

Value

An object of class rotateStat.

Author(s)

Peter Hettegger

References

\insertAllCited

Examples

#set.seed(0)

# Dataframe of phenotype data (sample information)
# We simulate 2 sample classes processed in 3 batches
pdata <- data.frame(batch = rep(1:3, c(10,10,10)),
                   phenotype = rep(c("Control", "Cancer"), c(5,5)))
features <- 100

# Matrix with random gene expression data
edata <- matrix(rnorm(features * nrow(pdata)), features)
rownames(edata) <- paste("feature", 1:nrow(edata))

mod1 <- model.matrix(~phenotype, pdata)

# Initialisation of the random rotation class
init1 <- initBatchRandrot(Y = edata, X = mod1, coef.h = 2, batch = pdata$batch)
init1

# Definition of the batch effect correction procedure with subsequent calculation
# of two-sided test statistics
statistic <- function(., batch, mod, coef){

  # The "capture.output" and "suppressMessages" simply suppress any output
  capture.output(suppressMessages(
    Y.tmp <- sva::ComBat(., batch = batch, mod)
  ))

  fit1 <- lm.fit(mod, t(Y.tmp))
  abs(coef(fit1)[coef,])
}

# We calculate test statistics for the second coefficient

res1 <- rotateStat(initialised.obj = init1,
                    R = 10,
                    statistic = statistic,
                    batch = pdata$batch, mod = mod1, coef = 2)

hist(pFdr(res1))

phettegger/randRotation documentation built on April 10, 2023, 7:25 p.m.