ccdf_testing: Main function to perform complex hypothesis testing using...

View source: R/ccdf_testing.R

Description

Main function to perform complex hypothesis testing using (un)conditional independence tests.

Usage

ccdf_testing(
  exprmat = NULL,
  variable2test = NULL,
  covariate = NULL,
  distance = c("L2", "L1", "L_sup"),
  test = c("asymptotic", "permutations", "dist_permutations"),
  method = c("linear regression", "logistic regression", "RF"),
  fast = TRUE,
  n_perm = 100,
  n_perm_adaptive = c(100, 150, 250, 500),
  thresholds = c(0.1, 0.05, 0.01),
  parallel = TRUE,
  n_cpus = NULL,
  adaptive = FALSE,
  space_y = FALSE,
  number_y = ncol(exprmat)
)

Arguments

exprmat

a data frame of size G x n containing the preprocessed expressions from n samples (or cells) for G genes. Default is NULL.

variable2test

a data frame of numeric or factor vector(s) of size n containing the variable(s) to be tested (the condition(s))

covariate

a data frame of numeric or factor vector(s) of size n containing the covariate(s)

distance

a character string indicating which distance to use to compute the test, either 'L2', 'L1' or 'L_sup'. Only used when test is 'dist_permutations'. Default is 'L2'.
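As a rough illustration of what the three distance options measure, the base R sketch below compares two empirical CDFs on a shared grid. The helper `cdf_distance` and its unnormalised definitions of the distances are assumptions made for illustration only; they are not taken from the ccdf source, which may scale or evaluate the distances differently.

```r
# Hypothetical helper (not part of ccdf): distance between two empirical CDFs
# evaluated on a common grid of y values.
cdf_distance <- function(y1, y2, distance = c("L2", "L1", "L_sup")) {
  distance <- match.arg(distance)        # first choice, "L2", is the default
  grid <- sort(unique(c(y1, y2)))        # common evaluation points
  d <- abs(ecdf(y1)(grid) - ecdf(y2)(grid))
  switch(distance,
    L2    = sqrt(sum(d^2)),  # Euclidean norm of the pointwise differences
    L1    = sum(d),          # sum of the absolute differences
    L_sup = max(d)           # largest absolute difference
  )
}

y1 <- c(1, 2, 3, 4)
y2 <- c(3, 4, 5, 6)
cdf_distance(y1, y2, "L_sup")
cdf_distance(y1, y2, "L1")
```

'L_sup' reacts only to the single largest gap between the two curves, whereas 'L1' and 'L2' accumulate differences over the whole grid.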

test

a character string indicating which method to use to compute the test, either 'asymptotic', 'permutations' or 'dist_permutations'. 'dist_permutations' computes the distance between the CDF and the CCDF, or between two CCDFs. Default is 'asymptotic'.

method

a character string indicating which method to use to compute the CCDF, either 'linear regression', 'logistic regression' or 'RF' for Random Forests. Default is 'linear regression' since it is the method used in the test.

fast

a logical flag indicating whether the fast implementation of logistic regression should be used. Only used when test is 'dist_permutations'. Default is TRUE.

n_perm

the number of permutations. Default is 100.

n_perm_adaptive

a vector of increasing numbers of permutations used when adaptive is TRUE. length(n_perm_adaptive) should be equal to length(thresholds)+1. Default is c(100, 150, 250, 500).

thresholds

a vector of decreasing thresholds used to compute adaptive permutations when adaptive is TRUE. length(thresholds) should be equal to length(n_perm_adaptive)-1. Default is c(0.1, 0.05, 0.01).
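The length constraint between n_perm_adaptive and thresholds can be made concrete with a small base R sketch. It assumes one plausible reading of the adaptive scheme: each entry of n_perm_adaptive is the size of an additional batch of permutations, and a further batch is run only while the current p-value stays below the next threshold. The function `adaptive_perm_pvalue` and the generator `null_draws` are hypothetical names; the actual ccdf implementation may differ.

```r
# Hypothetical sketch (not the ccdf implementation) of an adaptive
# permutation schedule for a single gene.
adaptive_perm_pvalue <- function(stat_obs, perm_stat,
                                 n_perm_adaptive = c(100, 150, 250, 500),
                                 thresholds = c(0.1, 0.05, 0.01)) {
  # One batch to start with, plus one extra batch per threshold.
  stopifnot(length(n_perm_adaptive) == length(thresholds) + 1)

  stats <- perm_stat(n_perm_adaptive[1])                 # first batch
  pval <- (sum(stats >= stat_obs) + 1) / (length(stats) + 1)

  for (k in seq_along(thresholds)) {
    if (pval >= thresholds[k]) break                     # not promising: stop early
    stats <- c(stats, perm_stat(n_perm_adaptive[k + 1])) # add a further batch
    pval <- (sum(stats >= stat_obs) + 1) / (length(stats) + 1)
  }
  list(pval = pval, n_perm_used = length(stats))
}

set.seed(1)
# Toy generator standing in for "statistics computed on permuted data".
null_draws <- function(n) rnorm(n)
adaptive_perm_pvalue(stat_obs = 0, perm_stat = null_draws)
adaptive_perm_pvalue(stat_obs = 10, perm_stat = null_draws)
```

Under this reading, a gene with a clearly non-significant statistic stops after the first batch, while a gene with an extreme statistic goes through every batch and gets a finer-grained p-value.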

parallel

a logical flag indicating whether parallel computation should be enabled. Default is TRUE.

n_cpus

an integer indicating the number of cores to be used when parallel is TRUE. Default is NULL, in which case parallel::detectCores() - 1 is used.

adaptive

a logical flag indicating whether adaptive permutations should be performed. Default is FALSE.

space_y

a logical flag indicating whether the y thresholds are spaced. When space_y is TRUE, a regular sequence between the minimum and the maximum of the observations is used. Default is FALSE.

number_y

an integer value indicating the number of y thresholds (and therefore the number of regressions) to perform the test. Default is ncol(exprmat).
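The interplay of space_y and number_y can be sketched in base R as follows. The helper `y_thresholds` is hypothetical. Two points are assumptions rather than documented behaviour: that the observed values themselves serve as thresholds when space_y is FALSE, and that each threshold yields one binary indicator to be regressed.

```r
# Hypothetical helper (not part of ccdf): choose the y thresholds for one gene.
y_thresholds <- function(y, space_y = FALSE, number_y = length(y)) {
  if (space_y) {
    # Regular sequence between the minimum and the maximum of the observations.
    seq(min(y), max(y), length.out = number_y)
  } else {
    # Assumption: without spacing, the distinct observed values are the thresholds.
    sort(unique(y))
  }
}

y <- c(0, 2.5, 1, 4, 0)   # expression of one gene across 5 samples
thr <- y_thresholds(y, space_y = TRUE, number_y = 5)
thr

# Assumption: one indicator 1{y <= threshold} per threshold, hence one
# regression per threshold. Rows are samples, columns are thresholds.
indicators <- sapply(thr, function(t) as.integer(y <= t))
dim(indicators)
```

A smaller number_y means fewer indicator columns and therefore fewer regressions, which trades resolution along the expression axis for speed.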

Value

A list with the following elements:

References

Gauthier M, Agniel D, Thiébaut R & Hejblum BP (2019). Distribution-free complex hypothesis testing for single-cell RNA-seq differential expression analysis, *bioRxiv* 445165. [DOI: 10.1101/2021.05.21.445165](https://doi.org/10.1101/2021.05.21.445165).

Examples

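library(ccdf)
# Binary condition for n = 100 samples
X <- as.factor(rbinom(n = 100, size = 1, prob = 0.5))
# 10 genes (rows) x 100 samples (columns); the mean shifts by 0.5 when X == 0
Y <- t(replicate(10, ((X == 1) * rnorm(n = 100, 0, 1)) + ((X == 0) * rnorm(n = 100, 0.5, 1))))
# Asymptotic test on a single core; keep only the p-values
res_asymp <- ccdf_testing(exprmat = data.frame(Y = Y),
                          variable2test = data.frame(X = X),
                          test = "asymptotic", n_cpus = 1)$pvals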

ccdf documentation built on Sept. 24, 2021, 9:07 a.m.