prediction: Class prediction on OOB set

View source: R/prediction.R

prediction    R Documentation

Description

Functions to predict class labels on the Out-Of-Bag (test) set for different classifiers.

Usage

prediction(
  mod,
  data,
  class = NULL,
  test.id = NULL,
  train.id = NULL,
  threshold = 0,
  standardize = FALSE,
  ...
)

## Default S3 method:
prediction(
  mod,
  data,
  class = NULL,
  test.id = NULL,
  train.id = NULL,
  threshold = 0,
  standardize = FALSE,
  ...
)

## S3 method for class 'pamrtrained'
prediction(
  mod,
  data,
  class = NULL,
  test.id = NULL,
  train.id = NULL,
  threshold = 0,
  standardize = FALSE,
  ...
)

## S3 method for class 'knn'
prediction(
  mod,
  data,
  class = NULL,
  test.id = NULL,
  train.id = NULL,
  threshold = 0,
  standardize = FALSE,
  ...
)

Arguments

mod

model object from classification()

data

data frame with rows as samples, columns as features

class

true/reference class vector used for supervised learning

test.id

integer vector of indices for test set. If NULL (default), all samples are used.

train.id

integer vector of indices for training set. If NULL (default), all samples are used.

threshold

a number between 0 and 1. A sample whose maximum class probability falls below this value is left unclassified.

standardize

logical; if TRUE, each feature in the training set is standardized to have mean zero and unit variance. The test set is standardized using the centers and standard deviations computed from the corresponding training set.

...

additional arguments to be passed to or from methods
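
The standardize behaviour described above can be illustrated in base R. This is a minimal sketch on made-up data, not the package's internal code: the matrix x and the index vectors are hypothetical, and scale() is used here only because it exposes the centers and standard deviations as attributes.

```r
# Toy data: rows are samples, columns are features (hypothetical values).
set.seed(1)
x <- matrix(rnorm(40, mean = 5, sd = 2), nrow = 10,
            dimnames = list(NULL, paste0("f", 1:4)))
train.id <- 1:7
test.id <- 8:10

# Standardize the training set and keep its centers and standard deviations.
train.std <- scale(x[train.id, ])
centers <- attr(train.std, "scaled:center")
sds <- attr(train.std, "scaled:scale")

# Apply the training centers and standard deviations to the test set.
test.std <- scale(x[test.id, ], center = centers, scale = sds)

# Training features now have mean zero (up to floating-point error);
# test features generally do not, because they reuse training statistics.
round(colMeans(train.std), 10)
colMeans(test.std)
```

Reusing the training statistics keeps information from the test set out of the preprocessing step.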

Details

The knn and pamr prediction methods use the train.id and class arguments for additional modelling steps before prediction. For knn, modelling and prediction are performed in a single step, so the function takes both training and test set identifiers. For pamr, the classifier must first be cross-validated on the training set to find the shrinkage threshold with the minimum CV error, which is then used for prediction on the test set. All other classifiers use the default method.
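
To show why the knn method needs both sets of identifiers, here is a minimal sketch that calls class::knn() directly on the built-in iris data. It does not reproduce how prediction() invokes knn internally; the data set and the choice of k = 5 are assumptions made for illustration.

```r
library(class)  # provides knn(); a recommended package shipped with R

data(iris)
x <- iris[, 1:4]
cl <- iris$Species

set.seed(1)
train.id <- sample(seq_len(nrow(x)), replace = TRUE)  # bootstrap sample
test.id <- setdiff(seq_len(nrow(x)), train.id)        # out-of-bag samples

# knn() has no separate fitting step: it takes the training data, the test
# data, and the training labels in one call and returns the test predictions.
pred <- knn(train = x[train.id, ], test = x[test.id, ],
            cl = cl[train.id], k = 5)

table(true = cl[test.id], pred)
```

Because no fitted model object carries the training data, the training rows and their labels have to be supplied again at prediction time.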

Value

A factor of predicted classes, with labels in the same order as the true class. If mod is a "pamr" classifier, the return value is a list of length 2 containing the predicted classes and the threshold value.
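
The effect of the threshold argument on the returned factor can be sketched in base R. The probability matrix below is invented, and the label "unclassified" is an assumption made for this sketch rather than a documented return value.

```r
# Hypothetical class probabilities: one row per sample, one column per class.
prob <- matrix(c(0.80, 0.15, 0.05,
                 0.40, 0.35, 0.25,
                 0.10, 0.30, 0.60),
               nrow = 3, byrow = TRUE,
               dimnames = list(NULL, c("A", "B", "C")))
threshold <- 0.5

# Pick the most probable class for each sample.
best <- colnames(prob)[max.col(prob, ties.method = "first")]

# Relabel samples whose maximum probability is below the threshold.
# The label "unclassified" is an assumption made for this sketch.
best[apply(prob, 1, max) < threshold] <- "unclassified"
pred <- factor(best, levels = c(colnames(prob), "unclassified"))
pred
```

With the default threshold = 0, no maximum probability can fall below the cutoff, so every sample keeps its most probable class.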

Author(s)

Derek Chiu

Examples

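library(splendid)
data(hgsc)
class <- attr(hgsc, "class.true")
set.seed(1)
# Bootstrap sample for training; samples never drawn form the OOB test set
training.id <- sample(seq_along(class), replace = TRUE)
test.id <- which(!seq_along(class) %in% training.id)
# Fit a classifier on the training set ("slda" uses the default method)
mod <- classification(hgsc[training.id, ], class[training.id], "slda")
# Predict on the OOB samples and compare with the true classes
pred <- prediction(mod, hgsc, class, test.id)
table(true = class[test.id], pred)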

AlineTalhouk/splendid documentation built on Feb. 23, 2024, 9:37 p.m.