calc_one_v_rest_auc: Calculating area under Precision-Recall curve (PRC) and...
In c7rishi/hidgenclassifier: Functions for Bayesian hierarchical hidden genome classifier

calc_one_v_rest_auc

R Documentation

Calculating area under Precision-Recall curve (PRC) and Receiver-Operator characteristic curve (ROC) for all one-vs-rest comparisons in the fitted model

Description

Calculates the area under the precision-recall curve (PRC) and the receiver operating characteristic (ROC) curve for every one-vs-rest comparison in the fitted model.

Usage

calc_one_v_rest_auc(
  fit = NULL,
  Xnew = NULL,
  Ynew = NULL,
  normalize_rows = NULL,
  measure = c("PRC", "ROC"),
  fitted_prob = NULL,
  include_baseline = TRUE,
  ...
)

Arguments

`fit`	fitted hidden genome classifier object. Experimental: can be NULL, in which case `fitted_prob` and `Ynew` must be provided.
`Xnew`, `Ynew`	New predictor design matrix and corresponding cancer site labels. If provided, the trained hidden genome model (supplied through `fit`) is used to obtain predicted probabilities based on `Xnew`, and the resulting probabilities are used as `fitted_prob`, along with `Ynew`, to calculate the AUCs. If `Xnew` is supplied, then `Ynew` must also be supplied. If `fitted_prob` is supplied, then `Xnew` is ignored.
`normalize_rows`	vector of length `nrow(Xnew)` to be used to normalize the rows of `Xnew`. If NULL (default), no normalization is performed.
`measure`	Type of curve to use. Options are "PRC" (precision-recall curve) and "ROC" (receiver operating characteristic curve). Can be a vector containing both.
`fitted_prob`	an n_tumor x n_cancer matrix of predicted classification probabilities to use for calculating ROC/PRC AUCs, where n_tumor denotes the number of tumor/sample units, and n_cancer is the number of cancer sites in the fitted hidden genome model (supplied through `fit`). The probabilities are compared against the "true" class labels provided in `Ynew`, if supplied, or otherwise against the original training Y labels stored in the trained model. Row names and column names must be identical to the tumor/sample names and cancer labels in `Ynew` (if supplied) or as used in the fitted model. If `NULL` (default), the fitted probabilities are obtained from the model itself, either by extracting pre-validated predictive probabilities (only available for mlogit models) or by simply using the fitted model to make predictions on the training set.
`include_baseline`	logical. Should the null baseline value(s) be returned along with the observed value(s) of the measure(s)? Here, the null baseline is the expected value of the corresponding measure for a "baseline" classifier that assigns class labels to the sample units uniformly at random.
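The layout that `fitted_prob` and `Ynew` are described as needing can be sketched with toy data. The tumor ids, site names, and probabilities below are made up for illustration, and the requirement that each row sums to one is an assumption about typical classifier output rather than something stated above.

```r
# Hypothetical toy data: 6 tumors and 3 cancer sites (all names are made up).
tumor_ids <- paste0("tumor_", 1:6)
sites <- c("BREAST", "COLON", "LUNG")

# True labels as a named character vector (names are the tumor ids).
Ynew <- setNames(
  c("BREAST", "COLON", "LUNG", "BREAST", "COLON", "LUNG"),
  tumor_ids
)

# n_tumor x n_cancer matrix of predicted probabilities; each row is
# rescaled to sum to one.
set.seed(1)
raw <- matrix(
  runif(length(tumor_ids) * length(sites)),
  nrow = length(tumor_ids)
)
fitted_prob <- raw / rowSums(raw)
dimnames(fitted_prob) <- list(tumor_ids, sites)

# Checks mirroring the documented requirements on row and column names.
stopifnot(
  identical(rownames(fitted_prob), names(Ynew)),
  setequal(colnames(fitted_prob), unique(Ynew)),
  all(abs(rowSums(fitted_prob) - 1) < 1e-8)
)

# With the package installed, the documented (experimental) `fit = NULL`
# path would then be called as:
# calc_one_v_rest_auc(fit = NULL, Ynew = Ynew, fitted_prob = fitted_prob)
```

In practice `fitted_prob` would come from cross-validated predictions rather than random numbers; the random matrix here only demonstrates the expected shape and naming.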

Details

Under the hood, the function uses several functions from the R package precrec to compute the performance metrics. The argument fitted_prob, when supplied, should ideally contain predictive probabilities for training set tumors evaluated under a cross-validation framework. If it is not supplied, pre-validated prediction probabilities are extracted for mlogit models; for other models, overoptimistic prediction probabilities (obtained by simply applying the fitted model to the training data) are used.
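To make the one-vs-rest idea concrete, here is a minimal base R sketch that computes a ROC AUC for each class against all others and then a macro average. It is not the precrec implementation used by the package: the helper names `roc_auc_binary` and `one_v_rest_roc` are hypothetical, the AUC is computed through the Mann-Whitney rank statistic, and the PRC case is left out.

```r
# ROC AUC for a binary problem via the Mann-Whitney rank statistic.
# Tied scores receive average ranks, so a tie counts as half a win.
roc_auc_binary <- function(score, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# One-vs-rest AUC per class, plus a final "Macro" (unweighted mean) row.
one_v_rest_roc <- function(prob, y) {
  classes <- colnames(prob)
  auc <- vapply(
    classes,
    function(k) roc_auc_binary(prob[, k], y == k),
    numeric(1)
  )
  data.frame(
    Class = c(classes, "Macro"),
    ROC = c(unname(auc), mean(auc)),
    stringsAsFactors = FALSE
  )
}

# Made-up labels and predicted probabilities for three classes.
y <- c("A", "A", "B", "B", "C", "C")
prob <- rbind(
  c(0.70, 0.20, 0.10),
  c(0.50, 0.30, 0.20),
  c(0.20, 0.60, 0.20),
  c(0.45, 0.30, 0.25),
  c(0.10, 0.20, 0.70),
  c(0.30, 0.30, 0.40)
)
colnames(prob) <- c("A", "B", "C")

res <- one_v_rest_roc(prob, y)
print(res)
```

The result has one row per class and a final macro row, which is the same row layout the function's return value is described as having. A scorer that is independent of the labels has an expected ROC AUC of 0.5; how the package derives its own baseline values is not shown here.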

Value

Returns a data.table with length(measure) + 1 columns ("Class" plus one column per measure), or 2 * length(measure) + 1 columns if include_baseline = TRUE, and n_class + 1 rows, where n_class denotes the number of cancer types present in the fitted model. The final row provides the macro (average) metrics.

Note

The function uses package precrec under the hood to compute the AUCs. Please install precrec before using calc_one_v_rest_auc.

Examples

data("impact")
top_v <- variant_screen_mi(
  maf = impact,
  variant_col = "Variant",
  cancer_col = "CANCER_SITE",
  sample_id_col = "patient_id",
  mi_rank_thresh = 50,
  return_prob_mi = FALSE
)
var_design <- extract_design(
  maf = impact,
  variant_col = "Variant",
  sample_id_col = "patient_id",
  variant_subset = top_v
)

canc_resp <- extract_cancer_response(
  maf = impact,
  cancer_col = "CANCER_SITE",
  sample_id_col = "patient_id"
)
pid <- names(canc_resp)
# create five stratified random folds
# based on the response cancer categories
set.seed(42)
folds <- data.table::data.table(
  resp = canc_resp
)[,
  foldid := sample(rep(1:5, length.out = .N)),
  by = resp
]$foldid

# 80%-20% stratified separation of training and
# test set tumors
idx_train <- pid[folds != 5]
idx_test <- pid[folds == 5]

# train a classifier on the training set
# using only variants (will have low accuracy
# -- no meta-feature information used)
fit0 <- fit_mlogit(
  X = var_design[idx_train, ],
  Y = canc_resp[idx_train]
)

calc_one_v_rest_auc(fit0)
calc_one_v_rest_auc(fit0, measure = "PRC")
calc_one_v_rest_auc(fit0, measure = "ROC")
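
The calls above evaluate the model on its own training data. As a sketch of the `Xnew`/`Ynew` path described under Arguments, the held-out tumors in `idx_test` could be scored as follows. This continues the example and is not self-contained: it reuses `fit0`, `var_design`, `canc_resp`, and `idx_test` from above, and it assumes that `Xnew` should have the same columns as the training design matrix.

```r
# Continuation of the example above; requires hidgenclassifier and the
# objects fit0, var_design, canc_resp, and idx_test defined earlier.
# Test-set AUCs: predictions are made from Xnew using fit0 and then
# compared against the labels in Ynew.
calc_one_v_rest_auc(
  fit = fit0,
  Xnew = var_design[idx_test, ],
  Ynew = canc_resp[idx_test]
)
```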

c7rishi/hidgenclassifier documentation built on June 14, 2024, 11:10 a.m.
