BBUM_corr | R Documentation |
Fits the BBUM model on the dataset, and transforms raw p values into values corrected for false discovery rate (FDR) of null and secondary signal according to the BBUM model, as multiple testing correction. Optionally, it automatically detects extreme value outliers among the background set, and resolves correction issues by trimming the outliers.
BBUM_corr(
pvals,
signal_set,
add_starts = list(),
only_start = FALSE,
limits = list(),
pBBUM.alpha = 0.05,
auto_outliers = TRUE,
rthres = 1,
rtrimmax = 0.05,
atrimmax = 10,
two_tailed = FALSE,
quiet = FALSE
)
pvals |
A vector of all numerical p values, including both signal and background sets. |
signal_set |
A vector of booleans signifying which values among pvals belong to the signal set. |
add_starts |
List of named vectors for additional starts of fitting algorithm beyond the default set. |
only_start |
Whether the algorithm should use only the given starts (add_starts) instead of the default set. |
limits |
Named list of custom limits for specific parameters. Parameters not mentioned keep their default limits. |
pBBUM.alpha |
Cutoff level of BBUM-FDR-adjusted p values for significance testing. Only used here to generate appropriate default limits. |
auto_outliers |
Toggle automatic outlier trimming. |
rthres |
Threshold value of the fitted r parameter above which outliers are flagged in the background set. |
rtrimmax |
Maximum fraction of data points allowed to be outliers in the background set of data (to be trimmed). |
atrimmax |
Maximum absolute number of data points allowed to be outliers in the background set of data (to be trimmed). |
two_tailed |
Toggle the "two-tailed" case of BBUM correction, if the background assumption is weak and bona fide hits in the background class are relevant. See Details. Default behavior is off. |
quiet |
Suppress printed messages and warnings. |
pBBUM represents the expected overall FDR level if the cutoff
were set at that particular p value. This is similar to the interpretation
of p values corrected through the typical p.adjust(method = "fdr").
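For comparison, the conventional adjustment mentioned above can be computed with base R alone. This is a minimal sketch that does not use BBUM; the p values are made up for illustration.

```r
# Illustrative p values (made up for this sketch)
p <- c(0.001, 0.008, 0.039, 0.041, 0.27, 0.60)

# Benjamini-Hochberg adjustment; "fdr" is an alias for "BH" in p.adjust
p_fdr <- p.adjust(p, method = "fdr")

# Points that would be called at an overall FDR cutoff of 0.05
hits <- p_fdr < 0.05
```

Each adjusted value is read the same way as a pBBUM value: the expected overall FDR if the cutoff were placed at that point.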
pBBUM values are designed for the signal set p values only.
Values for the background set are given but are not valid as a significance
testing adjustment, and so should not be used to call any hits. They
are provided primarily so that the equivalent transformation can be compared
against the signal set to assess the adjustment strategy.
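One way to respect this restriction when calling hits is to mask the background set explicitly. The helper below is hypothetical and not part of BBUM; it only assumes a result list with a pBBUMs vector, as described under the return value. The mock list stands in for real output and its numbers are placeholders.

```r
# Hypothetical helper (not part of BBUM): call hits from a BBUM_corr-style result.
# `res` is assumed to be a list containing a `pBBUMs` vector.
call_signal_hits <- function(res, signal_set, alpha = 0.05) {
  stopifnot(length(res$pBBUMs) == length(signal_set))
  # Only signal-set points are eligible; background points are never called
  signal_set & !is.na(res$pBBUMs) & res$pBBUMs < alpha
}

# Placeholder list standing in for real output; these numbers are made up
mock_res <- list(pBBUMs = c(0.01, 0.02, 0.30, 0.04))
mock_set <- c(TRUE, FALSE, TRUE, TRUE)
hits <- call_signal_hits(mock_res, mock_set)
```

The second point is below the cutoff but belongs to the background set, so it is not called.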
BBUM_corr functions best when poor-quality data points have been
filtered out of the p values beforehand. Such points tend to have high
p values and may disrupt the uniform null distribution.
Default starts for BBUM fitting are implemented. If additional
starts should be included, or only custom starts should be considered, use
the add_starts and/or only_start arguments.
If more than one start achieved the identical likelihood, a random start is chosen among them.
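The tie-breaking rule can be illustrated with base R. This is a sketch of the idea only, not the package's internal code, and the log-likelihood values are made up.

```r
# Made-up log-likelihoods for four starts; starts 2 and 3 tie for the maximum
ll <- c(-12.4, -10.1, -10.1, -15.0)

# Indices of all starts that reach the maximum
best <- which(ll == max(ll))

# Pick one at random; indexing via sample.int avoids the sample(x) pitfall
# where a length-1 numeric x is treated as 1:x
chosen <- best[sample.int(length(best), 1)]
```

Because the choice is random, results that depend on it are reproducible only after a call to set.seed().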
Automatic outlier detection relies on the model fitting a value of
r > 1. Such a result suggests that a stronger signal (presumably
outliers) exists in the background set than in the signal set, which
violates the assumptions of the model. This is a conservative strategy.
The ideal way to deal with outliers is to identify and handle them before
any statistical analyses. For benchmarking of the trimming strategy, see
Wang & Bartel, 2022.
Adding too many starts or allowing too much outlier trimming can increase computation time.
If the background assumption is weak, such that a small number
of bona fide hits are anticipated among the data points designated
"background class" and are relevant to the hypothesis at hand, the FDR can
be made to include the background class. This is akin to a two-tailed test
(despite a one-tailed assumption to begin with). It allows the
generation of genuine FDR-corrected p values for the background class
points as well. Toggle this using the two_tailed argument.
Due to the asymptotic behavior of the function when any
p values = 0, any p values < .Machine$double.xmin*10 are
constrained to .Machine$double.xmin*10.
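The effect of this lower bound can be reproduced in base R. This is a sketch of the described behavior, not the package's internal code; the input values are illustrative.

```r
# Lower bound described above: ten times the smallest positive normalized double
p_floor <- .Machine$double.xmin * 10

# Illustrative p values, including an exact zero and a value below the floor
p <- c(0, 1e-320, 1e-5, 0.2)

# Raise anything below the floor up to the floor; leave the rest unchanged
p_clamped <- pmax(p, p_floor)
```

Only the first two values change; ordinary p values are far above the floor and pass through untouched.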
A named list with the following items:
pvals: Vector of input p values.
pBBUMs: Vector of p values corrected for FDR by BBUM modeling.
estim: A named list of fitted parameter values.
LL: Value of the maximized log-likelihood.
convergence: Convergence code from optim.
outlier_trim: Number of outliers trimmed in the background set.
r.passed: Boolean for whether the fitted r value was under the threshold for flagging outliers.
BBUM_corr(
pvals = c(0.501, 0.203, 0.109, 0.071, 0.019, 0.031, 0.001,
0.000021, 0.00010, 0.03910,
0.0001,
0.11, 0.27, 0.36, 0.43, 0.50, 0.61, 0.77, 0.87, 0.91,
0.13, 0.21, 0.38, 0.42, 0.52, 0.60, 0.73, 0.81, 0.97),
signal_set = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE,
TRUE, TRUE, TRUE,
FALSE,
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,
TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
add_starts = list(c(lambda = 0.9, a = 0.6, theta = 0.1, r = 0.1)),
limits = list(a = c(0.1,0.7))
)