smoothFDR: FDR Smoothing


View source: R/smoothFDR.R

Description

This function implements Tansey et al.'s (2018) FDR smoothing algorithm.

Usage

smoothFDR(
  dat,
  probe = "probe",
  z = "z",
  pos = "pos",
  chr = "chr",
  nulltype = "empirical",
  nlambda = 30,
  tol = 1e-06,
  maxit = 100,
  parallel = TRUE
)

Arguments

dat

Data frame with columns for probe name, z-statistics, chromosomal position, and chromosome.

probe

String denoting column name for probes (CpGs, SNPs, etc.).

z

String denoting column name for z-scores.

pos

String denoting column name for chromosomal positions.

chr

String denoting column name for chromosomes.

nulltype

How should the null distribution be estimated? Choose "empirical" for Efron's central-matching method (default). Choose "theoretical" for a standard normal null.

nlambda

Length of the lambda sequence for the fused lasso subroutine.

tol

Convergence tolerance for the expectation maximization (EM) algorithm.

maxit

Maximum number of iterations for the EM algorithm.

parallel

Process in parallel? Only relevant if the data span multiple chromosomes. If TRUE, a parallel backend must be registered beforehand.
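To make the expected input concrete, here is a minimal sketch of a data frame that uses the default column names from the Usage section. The values are simulated and the object name toy is hypothetical. Whether chromosomes should be stored as character or numeric is not stated on this page, so the character labels below are an assumption.

```r
# Hypothetical toy input; all values are simulated for illustration only.
set.seed(1)
n <- 200

toy <- data.frame(
  probe = paste0("probe", seq_len(n)),           # probe identifiers
  z     = rnorm(n),                              # z-statistics
  pos   = rep(seq(1000, by = 500, length.out = n / 2), times = 2),  # positions
  chr   = rep(c("chr1", "chr2"), each = n / 2),  # chromosome labels (assumed format)
  stringsAsFactors = FALSE
)

# Sort by chromosome and position so neighbouring rows are spatial neighbours.
toy <- toy[order(toy$chr, toy$pos), ]

# With the default column names no extra arguments should be needed
# (requires the smoothFDR package, so the call is left commented out):
# res <- smoothFDR(toy, parallel = FALSE)
```

If the columns are named differently, the probe, z, pos, and chr arguments map them, as in the DNAm example where probe = 'cpg'.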

Details

FDR smoothing is an empirical Bayes method for exploiting spatial structure in large multiple-testing problems. The method automatically finds spatially localized regions of significant test statistics. It then relaxes the threshold of statistical significance within these regions and tightens it elsewhere, in a manner that controls the overall false discovery rate at a given level. This results in increased power and cleaner spatial separation of signals from noise. The approach requires solving a nonstandard high-dimensional optimization problem, for which an efficient augmented-Lagrangian algorithm is implemented. See Tansey et al. (2018) for details.
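One way to see why a region-specific threshold arises is the two-groups model, in which each statistic is drawn from a null density with probability 1 - c and from an alternative density with probability c. The sketch below is not the package's implementation. It assumes a standard normal null, a hypothetical N(0, 3^2) alternative, and prior probabilities fixed by hand rather than estimated by the fused lasso. The function name posterior_signal is illustrative.

```r
# Two-groups model: z ~ prior * f1(z) + (1 - prior) * f0(z).
# Assumptions (illustrative only): f0 is N(0, 1), f1 is N(0, alt_sd^2),
# and the prior signal probability is supplied by hand, not estimated.
posterior_signal <- function(z, prior, alt_sd = 3) {
  f0 <- dnorm(z, mean = 0, sd = 1)       # null density
  f1 <- dnorm(z, mean = 0, sd = alt_sd)  # alternative density
  prior * f1 / (prior * f1 + (1 - prior) * f0)
}

z <- 2.5

# Same z-score, two different local priors:
post_background <- posterior_signal(z, prior = 0.05)  # site outside any signal region
post_region     <- posterior_signal(z, prior = 0.50)  # site inside a signal region
```

The same z-score receives a higher posterior probability of being a signal where the local prior is higher, which is the sense in which the significance threshold is relaxed inside signal regions and tightened elsewhere.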

References

Efron, B. (2004). Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis. JASA, 99(465), 96-104.

Newton, M.A. (2002). On a Nonparametric Recursive Estimator of the Mixing Distribution. Sankhyā: The Indian Journal of Statistics, Series A, 64(2), 306-322.

Tansey, W., Koyejo, O., Poldrack, R.A., & Scott, J.G. (2018). False Discovery Rate Smoothing. JASA, 113(523), 1156-1171.

Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., & Knight, K. (2005). Sparsity and Smoothness via the Fused Lasso. J. R. Statist. Soc. B, 67(1), 91-108.

Examples

# Import data
data('DNAm')

# Set seed
set.seed(123)

# Run FDR smoothing
res <- smoothFDR(DNAm, probe = 'cpg', parallel = FALSE)

# Compare q-values to Benjamini-Hochberg estimates
sum(res$BH_q.value <= 0.05)
sum(res$q.value <= 0.05)
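For reference, Benjamini-Hochberg q-values can be computed from z-scores in base R as sketched below. The simulated z-scores and the object names are hypothetical, and the two-sided p-value conversion is an assumption; this page does not state how the BH_q.value column is derived.

```r
# Simulated z-scores: 900 nulls and 100 shifted signals (illustrative only).
set.seed(123)
z_sim <- c(rnorm(900), rnorm(100, mean = 3))

# Two-sided p-values under a standard normal null (assumed conversion).
p_sim <- 2 * pnorm(-abs(z_sim))

# Benjamini-Hochberg adjusted p-values (q-values).
bh_q <- p.adjust(p_sim, method = "BH")

# Number of discoveries at the 5% FDR level.
n_bh <- sum(bh_q <= 0.05)
```

Unlike FDR smoothing, this baseline ignores position and chromosome entirely, so every test faces the same threshold.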

dswatson/smoothFDR documentation built on March 4, 2020, 3:36 a.m.