calculate_performance: Calculate performance measures

calculate_performance    R Documentation

Calculate performance measures

Description

Calculate performance measures from a given collection of p-values, adjusted p-values and scores provided in a COBRAData object.

Usage

calculate_performance(
  cobradata,
  binary_truth = NULL,
  cont_truth = NULL,
  aspects = c("fdrtpr", "fdrtprcurve", "fdrnbr", "fdrnbrcurve", "tpr", "fpr", "roc",
    "fpc", "overlap", "corr", "scatter", "deviation", "fsrnbr", "fsrnbrcurve"),
  thrs = c(0.01, 0.05, 0.1),
  svalthrs = c(0.01, 0.05, 0.1),
  splv = "none",
  maxsplit = 3,
  onlyshared = FALSE,
  thr_venn = 0.05,
  type_venn = "adjp",
  topn_venn = 100,
  rank_by_abs = TRUE,
  prefer_pval = TRUE
)

Arguments

cobradata

A COBRAData object.

binary_truth

A character string giving the name of the column of truth(cobradata) that contains the binary truth (true assignment of variables into two classes, represented by 0/1).

cont_truth

A character string giving the name of the column of truth(cobradata) that contains the continuous truth (a continuous value that the observations can be compared to).

aspects

A character vector giving the types of performance measures to calculate. Must be a subset of c("fdrtpr", "fdrtprcurve", "fdrnbr", "fdrnbrcurve", "tpr", "fpr", "roc", "fpc", "overlap", "corr", "scatter", "deviation", "fsrnbr", "fsrnbrcurve").

thrs

A numeric vector of adjusted p-value thresholds for which to calculate the performance measures. Affects "fdrtpr", "fdrnbr", "tpr" and "fpr".

svalthrs

A numeric vector of s-value thresholds for which to calculate the FSR. Affects "fsrnbr".

splv

A character string giving the name of the column of truth(cobradata) that will be used to stratify the results. The default value is "none", indicating no stratification.

maxsplit

A numeric value giving the maximal number of categories to keep in the stratification. The largest categories containing both positive and negative features will be retained. By setting this argument to 'Inf' or 'NA_integer_', all categories (as well as the order of categories) will be retained.

onlyshared

A logical value indicating whether to consider only features for which both the true assignment and a result (p-value, adjusted p-value or score) are given. If FALSE, all features contained in the truth table are used.

thr_venn

A numeric value giving the adjusted p-value threshold used to create Venn diagrams (if type_venn is "adjp").

type_venn

Either "adjp" or "rank", indicating whether Venn diagrams should be constructed based on features with adjusted p-values below a certain threshold, or based on the same number of top-ranked features by different methods.

topn_venn

A numeric value giving the number of top-ranked features to compare between methods (if type_venn is "rank").

rank_by_abs

A logical value indicating whether to take the absolute value of the score before using it to rank the variables for ROC, FPC, FDR/NBR and FDR/TPR curves.

prefer_pval

Whether to preferentially rank variables by p-values or adjusted p-values rather than score for ROC and FPC calculations. From version 1.5.5, this is the default behaviour. To obtain the behaviour of previous versions, set to FALSE.
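To illustrate what the thrs argument controls, the following base R sketch computes threshold-based measures on made-up data. It is not iCOBRA's implementation: the helper measures_at, the toy vectors and the rule that a feature is called significant when its adjusted p-value is less than or equal to the threshold are all assumptions made for the example.

```r
# Toy data (made up for illustration): adjusted p-values and 0/1 truth.
padj  <- c(0.001, 0.02, 0.03, 0.04, 0.08, 0.2, 0.5, 0.9)
truth <- c(1,     1,    0,    1,    1,    0,   0,   0)
thrs  <- c(0.01, 0.05, 0.1)

# Hypothetical helper: threshold-based measures at a single cutoff.
# Assumption: a feature is called significant when padj <= thr.
measures_at <- function(thr, padj, truth) {
  called <- padj <= thr
  tp <- sum(called & truth == 1)  # true positives among called features
  fp <- sum(called & truth == 0)  # false positives among called features
  data.frame(
    thr = thr,
    NBR = sum(called),                                # number of called features
    TPR = tp / sum(truth == 1),                       # true positive rate
    FPR = fp / sum(truth == 0),                       # false positive rate
    FDR = if (sum(called) > 0) fp / sum(called) else 0  # observed false discovery rate
  )
}

# One row per threshold, analogous to passing several values in thrs.
toy_perf <- do.call(rbind, lapply(thrs, measures_at, padj = padj, truth = truth))
print(toy_perf)
```

Each element of thrs yields one row, so a longer thrs vector gives a finer picture of how the measures change with the cutoff.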

Details

The observation type used for each performance measure depends on which observations (p-values, adjusted p-values, scores) are available for a given method.

For the fpr, tpr, fdrtpr, fdrnbr and overlap aspects, results are calculated only for methods with adjusted p-values in the COBRAData object, since these calculations rely on specific adjusted p-value cutoffs.

For the fdrtprcurve and fdrnbrcurve aspects, the scores are used preferentially, provided that they are monotonically associated with the adjusted p-values (if provided). If no score is provided, the nominal p-values are used, again provided that they are monotonically associated with the adjusted p-values (if provided). In all other cases, the adjusted p-values are used for these aspects as well.

For roc and fpc, the choice depends on prefer_pval. If prefer_pval is FALSE (the behaviour before version 1.5.5), the scores are used if they are provided, otherwise the p-values and, as a last resort, the adjusted p-values. If prefer_pval is TRUE (the default), p-values or adjusted p-values are preferred over scores.

Finally, for the fsrnbr, corr, scatter and deviation aspects, the scores are used if they are provided; otherwise no results are calculated.
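The selection rules above can be summarised as a small lookup. The sketch below is a simplified illustration in base R, not iCOBRA's internal code: choose_observation is a hypothetical helper, the monotonicity check for the curve aspects is skipped, and the order used for roc and fpc when prefer_pval is TRUE (p-values, then adjusted p-values, then scores) is an assumption based on the description of the prefer_pval argument.

```r
# Hypothetical helper (not part of iCOBRA): returns which observation type
# ("padj", "pval" or "score") would be used for an aspect, or NA if none.
# 'available' is a character vector of the observation types a method has.
choose_observation <- function(aspect, available, prefer_pval = TRUE) {
  # Return the first entry of 'preference' that is available, else NA.
  first_available <- function(preference) {
    hit <- preference[preference %in% available]
    if (length(hit) > 0) hit[1] else NA_character_
  }

  if (aspect %in% c("fpr", "tpr", "fdrtpr", "fdrnbr", "overlap")) {
    # Threshold-based aspects need adjusted p-values.
    first_available("padj")
  } else if (aspect %in% c("fdrtprcurve", "fdrnbrcurve")) {
    # Simplification: the monotonicity requirement is not checked here.
    first_available(c("score", "pval", "padj"))
  } else if (aspect %in% c("roc", "fpc")) {
    if (prefer_pval) {
      # Assumed order when p-values are preferred over scores.
      first_available(c("pval", "padj", "score"))
    } else {
      first_available(c("score", "pval", "padj"))
    }
  } else if (aspect %in% c("fsrnbr", "corr", "scatter", "deviation")) {
    first_available("score")
  } else {
    # Aspects not covered by the text above are left undecided.
    NA_character_
  }
}

# A method that provides only nominal p-values and scores:
obs <- c("pval", "score")
choose_observation("tpr", obs)                       # no adjusted p-values available
choose_observation("roc", obs)                       # p-values preferred
choose_observation("roc", obs, prefer_pval = FALSE)  # score preferred
```

In this sketch, a method without adjusted p-values gets no threshold-based results, which mirrors the rule stated for the fpr, tpr, fdrtpr, fdrnbr and overlap aspects.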

Value

A COBRAPerformance object.

Author(s)

Charlotte Soneson

Examples

library(iCOBRA)
data(cobradata_example)
cobraperf <- calculate_performance(cobradata_example,
                                   binary_truth = "status",
                                   aspects = c("fdrtpr", "fdrtprcurve",
                                               "tpr", "roc"),
                                   thrs = c(0.01, 0.05, 0.1), splv = "none")
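Once the performance object has been calculated, the individual results can be inspected or passed on for plotting. The following is a minimal sketch, assuming that the iCOBRA package is installed and that the accessor functions named after the calculated aspects (tpr, fdrtpr, roc) as well as prepare_data_for_plot and plot_tpr behave as described in the package documentation; the colour scheme is an arbitrary choice for the example.

```r
library(iCOBRA)
data(cobradata_example)

# Calculate a small set of aspects, as in the example above.
cobraperf <- calculate_performance(cobradata_example,
                                   binary_truth = "status",
                                   aspects = c("fdrtpr", "tpr", "roc"),
                                   thrs = c(0.01, 0.05, 0.1), splv = "none")

# Each calculated aspect is stored as a data frame in the returned object.
head(tpr(cobraperf))
head(fdrtpr(cobraperf))
head(roc(cobraperf))

# Prepare the results for plotting and draw the TPR plot.
cobraplot <- prepare_data_for_plot(cobraperf, colorscheme = "Dark2",
                                   facetted = TRUE)
plot_tpr(cobraplot)
```

Aspects that were not requested in the aspects argument are not calculated, so only the requested ones should be expected to contain results.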

markrobinsonuzh/iCOBRA documentation built on March 28, 2024, 2:01 p.m.