classificationReport: Prediction evaluation report of a classification model

View source: R/SEMhelp.R

classificationReport    R Documentation

Prediction evaluation report of a classification model

Description

This function builds a report showing the main classification metrics. It provides an overview of key evaluation metrics, i.e. precision, recall, F1-score, accuracy, Matthews correlation coefficient (mcc) and support (testing size) for each class in the dataset, together with their averages (macro or weighted) across all classes.

Usage

classificationReport(yobs, yhat, CM = NULL, verbose = FALSE, ...)

Arguments

yobs

A vector with the true target variable values.

yhat

A matrix with the predicted target variable values (one column per class).

CM

An optional (external) C x C confusion matrix.

verbose

A logical value (default = FALSE). If TRUE, the confusion matrix is printed on the screen and, if C = 2, density plots of the predicted probability for each group are also shown.

...

Currently ignored.

Details

Given a vector with the true target variable labels and a matrix with the predicted target variable values for each class, a series of classification metrics is computed. For example, consider a 2x2 table with the notation:

                       Predicted
  Observed         Yes Event   No Event
  Yes Event            A           C
  No Event             B           D

The formulas used here for the label = "Yes Event" are:

pre = A/(A+B)

rec = A/(A+C)

F1 = (2*pre*rec)/(pre+rec)

acc = (A+D)/(A+B+C+D)

mcc = (A*D-B*C)/sqrt((A+B)*(C+D)*(A+C)*(B+D))

Metrics analogous to those above are calculated for the label "No Event"; the weighted average (the support-weighted mean of the per-label metrics) and the macro average (the unweighted mean of the per-label metrics) are also provided.
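
As an illustration, the sketch below (plain base R, not a call to classificationReport; the cell counts A, B, C, D are hypothetical) reproduces these metrics and their averages:

A <- 40; B <- 10; C <- 5; D <- 45   # hypothetical 2x2 cell counts

# "Yes Event" metrics, following the formulas above
pre <- A/(A+B)
rec <- A/(A+C)
F1  <- (2*pre*rec)/(pre+rec)
acc <- (A+D)/(A+B+C+D)
mcc <- (A*D-B*C)/sqrt((A+B)*(C+D)*(A+C)*(B+D))

# "No Event" metrics (D is now the correctly predicted cell)
pre2 <- D/(D+C)
rec2 <- D/(D+B)
F12  <- (2*pre2*rec2)/(pre2+rec2)

# Macro average: unweighted mean of the per-label F1 scores
macroF1 <- mean(c(F1, F12))
# Weighted average: per-label F1 scores weighted by support (A+C and B+D)
wgtF1 <- weighted.mean(c(F1, F12), w = c(A+C, B+D))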

Value

A list of 3 objects:

  1. "CM", the confusion matrix between observed and predicted counts.

  2. "stats", a data.frame with the classification evaluation statistics.

  3. "cls", a data.frame with the predicted probabilities, predicted labels and true labels of the categorical target variable.

Author(s)

Barbara Tarantino barbara.tarantino@unipv.it

References

Sammut, C. & Webb, G. I. (eds.) (2017). Encyclopedia of Machine Learning and Data Mining. New York: Springer. ISBN: 978-1-4899-7685-7

Examples



# Load Sachs data (pkc)
ig<- sachs$graph
data<- sachs$pkc
data<- transformData(data)$data
group<- sachs$group

#...with train-test (0.5-0.5) samples
set.seed(123)
train<- sample(1:nrow(data), 0.5*nrow(data))

#...with a categorical (as.factor) variable (C=2)
outcome<- factor(ifelse(group == 0, "control", "case"))
res<- SEMml(ig, data[train, ], outcome[train], algo="rf")
pred<- predict(res, data[-train, ], outcome[-train], verbose=TRUE)

yobs<- outcome[-train]
yhat<- pred$Yhat[ ,levels(outcome)]
cls<- classificationReport(yobs, yhat)
cls$CM
cls$stats
head(cls$cls)

#...with predicted probability density plots, if C=2
cls<- classificationReport(yobs, yhat, verbose=TRUE)

#...with a categorical (as.factor) variable (C=3)
group[1:400]<- 2; table(group)
outcome<- factor(ifelse(group == 0, "control",
				ifelse(group == 1, "case1", "case2")))
res<- SEMml(ig, data[train, ], outcome[train], algo="rf")
pred<- predict(res, data[-train, ], outcome[-train], verbose=TRUE)

yobs<- outcome[-train]
yhat<- pred$Yhat[ ,levels(outcome)]
cls<- classificationReport(yobs, yhat)
cls$CM
cls$stats
head(cls$cls)


