#' Area under the precision-recall curve
#'
#' Calculates the Precision-Recall (PR) Area Under the Curve (AUC) given a vector of p-values and the true classes (causal (alternative) vs non-causal (null)).
#' This is a wrapper around [PRROC::pr.curve()], which actually calculates the AUC (see that for details).
#'
#' @param pvals The vector of association p-values to analyze.
#' `NA` values are allowed in input, are internally set to 1 (worst score) prior to AUC calculation (to prevent methods to get good AUCs by setting more cases to `NA`).
#' Non-`NA` values outside of \[0,1\] will trigger an error.
#' @param causal_indexes The vector of causal indexes, defining the true classes used for AUC calculation.
#' Values of `causal_indexes` as returned by `sim_trait` work.
#' There must be at least one causal index and at least one non-causal case.
#' @param curve If `FALSE` (default), only scalar AUC is returned.
#' If `TRUE`, then `curve = TRUE` is passed to [PRROC::pr.curve()] and the full object (class `PRROC`) is returned (see below).
#'
#' @return If `curve = FALSE`, returns the PR AUC scalar value.
#' If `curve = TRUE`, returns the `PRROC` object as returned by [PRROC::pr.curve()], which can be plotted directly, and which contains the AUC under the named value `auc.integral`.
#'
#' However, if the input `pvals` is `NULL` (taken for case of singular association test, which is rare but may happen), then the returned value is `NA`.
#'
#' @examples
#' # simulate truly null p-values, which should be uniform
#' pvals <- runif(10)
#' # for toy example, take the first two p-values to be truly causal
#' causal_indexes <- 1:2
#' # calculate desired measure
#' pval_aucpr( pvals, causal_indexes )
#'
#' @seealso
#' [PRROC::pr.curve()], which is used internally by this function.
#'
#' [pval_power_calib()] for calibrated power estimates.
#'
#' @export
pval_aucpr <- function(pvals, causal_indexes, curve = FALSE) {
if ( missing( pvals ) )
stop( '`pvals` is required!' )
if ( missing( causal_indexes ) )
stop( '`causal_indexes` is required!' )
if ( length( causal_indexes ) == 0 )
stop( '`causal_indexes` must have at least one index!' )
# in some cases there is nothing to do (LMM has singular information matrix)
# NA is best value to return in that case (scalar)
if ( is.null( pvals ) )
return( NA )
# sometimes p-values are missing (GCATest!), treat as worst possible value (p=1)
if ( anyNA( pvals ) )
pvals[ is.na( pvals ) ] <- 1
# check range of data here, to complain if it was bad
if ( any( pvals < 0 ) )
stop( 'Input p-values included negative values!' )
if ( any( pvals > 1 ) )
stop( 'Input p-values included values exceeding 1!' )
# turn p-values into "scores" (for some reason this is needed to have the correct PR curve; is it just the order flip?)
scores <- - log( pvals )
# separate scores for both classes
scores_alt <- scores[ causal_indexes ]
scores_nul <- scores[ -causal_indexes ]
# make sure both lists are non-empty
# scores_alt was tested through testing causal_indexes
# just test the other one
if ( length( scores_nul ) == 0 )
stop( 'There were no null (non-causal) cases!' )
# generate data, skip curve (default)
pr <- PRROC::pr.curve( scores_alt, scores_nul, curve = curve )
# return either the full object for plotting, or the AUC only (default)
return( if (curve) pr else pr$auc.integral )
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.