knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The metrica
package was developed to visualize and compute the level of agreement between observed ground-truth values and model-derived (e.g., mechanistic or empirical) predicted.
This package is intended to fit into the following workflow:
metrica
package is used to compute and evaluate the classification model based on observed and predicted values metrica
package is used to visualize model fit and selected fit metrics This vignette introduces the functionality of the metrica
package applied to observed and model-predicted values of a binary land cover classification scenario, where the two classes are vegetation (1) and non-vegetation (0)).
Let's begin by loading the packages needed.
library(metrica) library(dplyr) library(purrr) library(tidyr)
Figure 1. This binary classification dataset corresponds to a Random Forest model using a 70:30 training:testing split to predict vegetation vs. other land coverage. This exercise focused on pixel level classification. The image is showing the classification map where yellow areas are associated to non-vegetation pixels (other), and the green areas to those classified as vegetation.
Now we load the binary land_cover
data set already included with the metrica
package. This data set contains two columns:
predicted
: model-predicted (random forest) land cover, being vegetation = 1 and other = 0,
actual
: ground-truth observed land cover, being 0 = vegetation and 1 = other
# Load binary_landCover <- metrica::land_cover # Printing first observations head(binary_landCover)
Figure 2. This multiclass classification dataset corresponds to a Random Forest model using a 70:30 training:testing split to predict maize vegetation vs. other land coverage. This exercise focused on field level classification. The image is showing, in dark grey shapes, the fields used as the ground-truth locations to develop the model.
Now we load the multinomial maize_phenology
data set, which is also already included with the metrica
package. This multiclass data set presents 16 different classes corresponding to phenological stages of the maize (Zea Mays (L.)) crop.
# Load multi_maize_phen <- metrica::maize_phenology # Printing first observations head(multi_maize_phen)
The simplest way to visually assess agreement between observed and predicted classes is with a confusion matrix.
We can use the function confusion_matrix()
from the metrica package.
The function requires specifying:
the data frame object name (data
argument)
the name of the column containing observed values (obs
argument)
the name of the column containing predicted values (pred
argument)
The output of the confusion_matrix()
function is either a table (plot = FALSE
) or
a ggplot2
object (plot = TRUE
) that can be further customized:
# a. Print binary_landCover %>% confusion_matrix(obs = actual, pred = predicted, plot = FALSE, unit = "count") # b. Plot binary_landCover %>% confusion_matrix(obs = actual, pred = predicted, plot = TRUE, colors = c(low="#ffe8d6" , high="#892b64"), unit = "count") # c. Unit = proportion binary_landCover %>% confusion_matrix(obs = actual, pred = predicted, plot = TRUE, colors = c(low="#f9dbbd" , high="#892b64"), unit = "proportion")
# a. Print multi_maize_phen %>% confusion_matrix(obs = actual, pred = predicted, plot = FALSE, unit = "count") # b. Plot multi_maize_phen %>% confusion_matrix(obs = actual, pred = predicted, plot = TRUE, colors = c(low="grey85" , high="steelblue"), unit = "count")
The metrica package contains functions for 26 scoring rules to assess the agreement between observed and predicted values for classification data.
A list with all the the classification metrics including their name, definition, details, formula, and function name, please check here.
All of the metric functions take at least three arguments:
the data frame object name (data
argument, optional)
the name of the column containing observed values (obs
argument)
the name of the column containing predicted values (pred
argument)
an integer (1 or 2) indicating the alphanumerical order of the positive event (pos_level
argument, Default = 2)
a TRUE/FALSE indicating to estimate metrics for each single class (atom
argument, Default = FALSE). This argument is only functional for multiclass datasets.
a TRUE/FALSE indicating to store the numeric result as a list (tidy
argument, Default = FALSE), or as a data frame (tidy = TRUE).
The user can choose to calculate a single metric, or to calculate all metrics at once.
To calculate a single metric, the metric function can be called. For example, to calculate $accuracy$, we can use the accuracy()
function:
# Binary binary_landCover %>% accuracy(data = ., obs = actual, pred = predicted, tidy = TRUE) # Multiclass maize_phenology %>% accuracy(data = ., obs = actual, pred = predicted, tidy = TRUE)
Or considering imbalanced observations across classes we can call the balacc()
function for balanced accuracy:
# Binary binary_landCover %>% balacc(data = ., obs = actual, pred = predicted, tidy = TRUE) # Multiclass maize_phenology %>% balacc(data = ., obs = actual, pred = predicted, tidy = TRUE)
Similarly, to calculate precision, we can use the precision()
function:
# Binary binary_landCover %>% precision(data = ., obs = actual, pred = predicted, tidy = TRUE) # Multiclass maize_phenology %>% precision(data = ., obs = actual, pred = predicted, tidy = TRUE)
The user can also calculate all metrics at once using the function metrics_summary()
:
``` {r metrics_summary}
binary_landCover %>% metrics_summary(data = ., obs = actual, pred = predicted, type = "classification")
multi_maize_phen %>% metrics_summary(data = ., obs = actual, pred = predicted, type = "classification")
Alternatively, if the user is only looking for specific metrics, within the same function `metrics_summary()`, the user can pass a list of desired metrics using the argument "metrics_list" as follows: <br/> ``` {r metrics_summary_selected} # Get a selected list at once with metrics_summary() selected_class_metrics <- c("accuracy", "precision", "recall", "fscore") # Binary bin_sum <- binary_landCover %>% metrics_summary(data = ., obs = actual, pred = predicted, type = "classification", metrics_list = selected_class_metrics, pos_level = 1) # Multiclass multi_maize_phen %>% metrics_summary(data = ., obs = actual, pred = predicted, type = "classification", metrics_list = selected_class_metrics)
atom
= TRUE)For multiclass cases, most of the classification metrics can (and should) be estimated
at the class level and not simply as an overall average across classes. With the
exceptions of kappa (khat), mcc, fmi, and AUC_roc, all classification metrics can be
estimated at the class level using the argument atom = TRUE
, as follows:
# Precision maize_phenology %>% metrica::precision(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # Recall maize_phenology %>% metrica::recall(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # Specificity maize_phenology %>% metrica::specificity(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # atom = TRUE available for more functions available (remove #) # F-score # maize_phenology %>% metrica::fscore(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Adjusted F-score # maize_phenology %>% metrica::agf(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # G-mean # maize_phenology %>% metrica::gmean(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Negative predictive value # maize_phenology %>% metrica::npv(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Prevalence # maize_phenology %>% metrica::preval(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Prevalence threshold # maize_phenology %>% metrica::preval_t(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # False omission rate # maize_phenology %>% metrica::FOR(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # False detection rate # maize_phenology %>% metrica::FDR(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # False positive rate # maize_phenology %>% metrica::FPR(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Falase negative rate # maize_phenology %>% metrica::FNR(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Delta-p # maize_phenology %>% metrica::deltap(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Critical Success Index # maize_phenology %>% metrica::csi(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Bookmaker Informedness # maize_phenology %>% metrica::bmi(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Positive likelihood ratio # maize_phenology %>% metrica::posLr(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Negative likelihood ratio # maize_phenology %>% metrica::negLr(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE) # # Diagnostic odds ratio # maize_phenology %>% metrica::dor(obs = actual, pred = predicted, atom = TRUE, tidy = TRUE)
In some cases, multiple runs of a model are available to compare vs. observed values (e.g. cross-validation folds). Thus, we can also fit the agreement analysis for several datasets as follows:
set.seed(15) # Let's simulated two extra runs of the same model for Land Cover fold_2 <- data.frame(actual = sample(c(0,1), 285, replace = TRUE), predicted = sample(c(0,1), 285, replace = TRUE)) fold_3 <- data.frame(actual = sample(c(0,1), 285, replace = TRUE), predicted = sample(c(0,1), 285, replace = TRUE)) # a. Create nested df with the folds binary_nested_folds <- bind_rows(list(fold_1 = binary_landCover, fold_2 = fold_2, fold_3 = fold_3), .id = "id") %>% dplyr::group_by(id) %>% tidyr::nest() head(binary_nested_folds %>% group_by(id) %>% dplyr::slice_head(n=2)) # b. Run binary_folds_summary <- binary_nested_folds %>% # Store metrics in new.column "performance" dplyr::mutate(performance = purrr::map(data, ~metrica::metrics_summary(data = ., obs = actual, pred = predicted, type = "classification"))) %>% dplyr::select(-data) %>% tidyr::unnest(cols = performance) %>% dplyr::arrange(Metric) head(binary_folds_summary)
group_map()
non_nested_folds <- binary_nested_folds %>% unnest(cols = "data") # Using group_map() binary_folds_summary_2 <- non_nested_folds %>% dplyr::group_by(id) %>% dplyr::group_map(~metrics_summary(data = ., obs = actual, pred = predicted, type = "classification")) binary_folds_summary_2
summarise()
# Using summarise() binary_folds_summary_3 <- non_nested_folds %>% dplyr::group_by(id) %>% dplyr::summarise(metrics_summary(obs = actual, pred = predicted, type = "classification")) %>% dplyr::arrange(Metric) binary_folds_summary_3
To print the metrics on the confusion_matrix()
, just use print.metrics = TRUE. Warning: do not forget to specify your 'metrics.list' and choice wisely:
selected_metrics <- c("accuracy", "precision", "recall", "khat", "mcc", "fscore", "agf", "npv", "FPR", "FNR") binary_matrix_metrics <- binary_landCover %>% confusion_matrix(obs = actual, pred = predicted, plot = TRUE, colors = c(low="#ffe8d6" , high="#892b64"), unit = "count", # Print metrics_summary print_metrics = TRUE, # List of performance metrics metrics_list = selected_metrics, # Position (bottom or top) position_metrics = "bottom") binary_matrix_metrics multinomial_matrix_metrics <- maize_phenology %>% confusion_matrix(obs = actual, pred = predicted, plot = TRUE, colors = c(low="grey85" , high="steelblue"), unit = "count", # Print metrics_summary print_metrics = TRUE, # List of performance metrics metrics_list = selected_metrics, # Position (bottom or top) position_metrics = "bottom") multinomial_matrix_metrics
Also, as a ggplot element, outputs are flexible of further edition:
binary_matrix_metrics + # Modify labels ggplot2::labs(x = "Observed Vegetation", y = "Predicted Vegetation", title = "Binary Confusion Matrix") multinomial_matrix_metrics + # Modify labels ggplot2::labs(x = "Observed Corn Phenology", y = "Predicted Corn Phenology", title = "Multinomial Confusion Matrix")+ # Modify theme ggplot2::theme_light()
To export the metrics summary table, the user can simply write it to file with the function write.csv()
:
metrics_summary(data = binary_landCover, obs = obs, pred = pred, type = "classification") %>% write.csv("binary_landcover_metrics_summary.csv")
Similarly, to export a plot, the user can simply write it to file with the function ggsave()
:
ggsave(plot = multinomial_matrix_metrics, "multinomial_matrix_metrics.png", width = 8, height = 7)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.