auroc: Area Under the Curve (AUC) and Receiver Operating...

View source: R/auroc.R

aurocR Documentation

Area Under the Curve (AUC) and Receiver Operating Characteristic (ROC) curves for supervised classification

Description

Calculates the AUC and plots ROC for supervised models from s/plsda, mint.s/plsda and block.plsda, block.splsda or wrapper.sgccda functions.

Usage

auroc(object, ...)

## S3 method for class 'mixo_plsda'
auroc(
  object,
  newdata = object$input.X,
  outcome.test = as.factor(object$Y),
  multilevel = NULL,
  plot = TRUE,
  roc.comp = NULL,
  title = NULL,
  print = TRUE,
  ...
)

## S3 method for class 'mixo_splsda'
auroc(
  object,
  newdata = object$input.X,
  outcome.test = as.factor(object$Y),
  multilevel = NULL,
  plot = TRUE,
  roc.comp = NULL,
  title = NULL,
  print = TRUE,
  ...
)

## S3 method for class 'list'
auroc(object, plot = TRUE, roc.comp = NULL, title = NULL, print = TRUE, ...)

## S3 method for class 'mint.plsda'
auroc(
  object,
  newdata = object$X,
  outcome.test = as.factor(object$Y),
  study.test = object$study,
  multilevel = NULL,
  plot = TRUE,
  roc.comp = NULL,
  roc.study = "global",
  title = NULL,
  print = TRUE,
  ...
)

## S3 method for class 'mint.splsda'
auroc(
  object,
  newdata = object$X,
  outcome.test = as.factor(object$Y),
  study.test = object$study,
  multilevel = NULL,
  plot = TRUE,
  roc.comp = NULL,
  roc.study = "global",
  title = NULL,
  print = TRUE,
  ...
)

## S3 method for class 'sgccda'
auroc(
  object,
  newdata = object$X,
  outcome.test = as.factor(object$Y),
  multilevel = NULL,
  plot = TRUE,
  roc.block = 1L,
  roc.comp = NULL,
  title = NULL,
  print = TRUE,
  ...
)

## S3 method for class 'mint.block.plsda'
auroc(
  object,
  newdata = object$X,
  study.test = object$study,
  outcome.test = as.factor(object$Y),
  multilevel = NULL,
  plot = TRUE,
  roc.block = 1,
  roc.comp = NULL,
  title = NULL,
  print = TRUE,
  ...
)

## S3 method for class 'mint.block.splsda'
auroc(
  object,
  newdata = object$X,
  study.test = object$study,
  outcome.test = as.factor(object$Y),
  multilevel = NULL,
  plot = TRUE,
  roc.block = 1,
  roc.comp = NULL,
  title = NULL,
  print = TRUE,
  ...
)

Arguments

object

Object of class inherited from one of the following supervised analysis function: "plsda", "splsda", "mint.plsda", "mint.splsda", "block.splsda" or "wrapper.sgccda". Alternatively, this can be a named list of plsda and splsda objects if multiple models are to be compared. Note that these multiple models need to have used the same levels in the response variable.

...

external optional arguments for plotting - line.col for custom colors and legend.title for custom legend title

newdata

numeric matrix of predictors, by default set to the training data set (see details).

outcome.test

Either a factor or a class vector for the discrete outcome, by default set to the outcome vector from the training set (see details).

multilevel

Sample information when a newdata matrix is input and when multilevel decomposition for repeated measurements is required. A numeric matrix or data frame indicating the repeated measures on each individual, i.e. the individuals ID. See examples in splsda.

plot

Whether the ROC curves should be plotted, by default set to TRUE (see details).

roc.comp

Specify the component (integer) up to which the ROC will be calculated and plotted from the multivariate model, default to 1.

title

Character, specifies the title of the plot.

print

Logical, specifies whether the output should be printed.

study.test

For MINT objects, grouping factor indicating which samples of newdata are from the same study. Overlap with object$study are allowed.

roc.study

Specify the study for which the ROC will be plotted for a mint.plsda or mint.splsda object, default to "global".

roc.block

Specify the block number (integer) or the name of the block (set of characters) for which the ROC will be plotted for a block.plsda or block.splsda object, default to 1.

Details

For more than two classes in the categorical outcome Y, the AUC is calculated as one class vs. the other and the ROC curves one class vs. the others are output.

The ROC and AUC are calculated based on the predicted scores obtained from the predict function applied to the multivariate methods (predict(object)$predict). Our multivariate supervised methods already use a prediction threshold based on distances (see predict) that optimally determine class membership of the samples tested. As such AUC and ROC are not needed to estimate the performance of the model (see perf, tune that report classification error rates). We provide those outputs as complementary performance measures.

The pvalue is from a Wilcoxon test between the predicted scores between one class vs the others.

External independent data set (newdata) and outcome (outcome.test) can be input to calculate AUROC. The external data set must have the same variables as the training data set (object$X).

If object is a named list of multiple plsda and splsda objects, ensure that these models each have a response variable with the same levels. Additionally, newdata and outcome.test cannot be passed to this form of auroc.

If newdata is not provided, AUROC is calculated from the training data set, and may result in overfitting (too optimistic results).

Note that for mint.plsda and mint.splsda objects, if roc.study is different from "global", then newdata), outcome.test and sstudy.test are not used.

Value

Depending on the type of object used, a list that contains: The AUC and Wilcoxon test pvalue for each 'one vs other' classes comparison performed, either per component (splsda, plsda, mint.plsda, mint.splsda), or per block and per component (wrapper.sgccda, block.plsda, blocksplsda).

Author(s)

Benoit Gautier, Francois Bartolo, Florian Rohart, Al J Abadi

See Also

tune, perf, and http://www.mixOmics.org for more details.

Examples

## example with PLSDA, 2 classes
# ----------------
data(breast.tumors)
X <- breast.tumors$gene.exp
Y <- breast.tumors$sample$treatment

plsda.breast <- plsda(X, Y, ncomp = 2)
auc.plsda.breast = auroc(plsda.breast, roc.comp = 1)
auc.plsda.breast = auroc(plsda.breast, roc.comp = 2)

## Not run: 
## example with sPLSDA
# -----------------
splsda.breast <- splsda(X, Y, ncomp = 2, keepX = c(25, 25))
auroc(plsda.breast, plot = FALSE)


## example with sPLSDA with 4 classes
# -----------------
data(liver.toxicity)
X <- as.matrix(liver.toxicity$gene)
# Y will be transformed as a factor in the function,
# but we set it as a factor to set up the colors.
Y <- as.factor(liver.toxicity$treatment[, 4])

splsda.liver <- splsda(X, Y, ncomp = 2, keepX = c(20, 20))
auc.splsda.liver = auroc(splsda.liver, roc.comp = 2)


## example with mint.plsda
# -----------------
data(stemcells)

res = mint.plsda(X = stemcells$gene, Y = stemcells$celltype, ncomp = 3,
study = stemcells$study)
auc.mint.pslda = auroc(res, plot = FALSE)

## example with mint.splsda
# -----------------
res = mint.splsda(X = stemcells$gene, Y = stemcells$celltype, ncomp = 3, keepX = c(10, 5, 15),
study = stemcells$study)
auc.mint.spslda = auroc(res, plot = TRUE, roc.comp = 3)


## example with block.plsda
# ------------------
data(nutrimouse)
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)
# with this design, all blocks are connected
design = matrix(c(0,1,1,0), ncol = 2, nrow = 2,
byrow = TRUE, dimnames = list(names(data), names(data)))

block.plsda.nutri = block.plsda(X = data, Y = nutrimouse$diet)
auc.block.plsda.nutri = auroc(block.plsda.nutri, roc.block = 'lipid')

## example with block.splsda
# ---------------
list.keepX = list(gene = rep(10, 2), lipid = rep(5,2))
block.splsda.nutri = block.splsda(X = data, Y = nutrimouse$diet, keepX = list.keepX)
auc.block.splsda.nutri = auroc(block.splsda.nutri, roc.block = 1)

## End(Not run)

mixOmicsTeam/mixOmics documentation built on Dec. 3, 2024, 11:15 p.m.