View source: R/plot_probabilities.R
plot_probabilities | R Documentation
Creates a ggplot2 line plot object with the probabilities of either the target classes or the predicted classes.
The observations are ordered by the highest probability.
TODO line geom: average probability per observation
TODO points geom: actual probabilities per observation
The meaning of the horizontal lines depends on the settings.
These are either recall scores, precision scores, or accuracy scores, depending on the `probability_of` and `apply_facet` arguments.
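The recall, precision, and accuracy scores mentioned above can all be computed from a column of target classes and a column of predicted classes. The following is a minimal base R sketch of the standard definitions, using made-up toy vectors (`targets` and `predictions` are hypothetical names). It is not taken from the package internals and does not show which score is drawn for which setting.

```r
# Toy data (hypothetical values, for illustration only)
targets     <- c("A", "A", "B", "B", "C", "C")
predictions <- c("A", "B", "B", "B", "C", "A")

classes <- sort(unique(targets))

# Accuracy: share of rows where the prediction equals the target
accuracy <- mean(predictions == targets)

# Recall per class: among rows with this target class,
# the share that was predicted as that class
recall <- sapply(classes, function(cl) {
  mean(predictions[targets == cl] == cl)
})

# Precision per class: among rows predicted as this class,
# the share whose target really is that class
precision <- sapply(classes, function(cl) {
  mean(targets[predictions == cl] == cl)
})

print(accuracy)
print(recall)
print(precision)
```

Recall and precision are per-class quantities, while accuracy is a single number for all classes.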
plot_probabilities(
  data,
  target_col,
  probability_cols,
  predicted_class_col = NULL,
  obs_id_col = NULL,
  group_col = NULL,
  probability_of = "target",
  positive = 2,
  order = "centered",
  theme_fn = ggplot2::theme_minimal,
  color_scale = ggplot2::scale_colour_brewer(palette = "Dark2"),
  apply_facet = length(probability_cols) > 1,
  smoothe = FALSE,
  add_points = !is.null(obs_id_col),
  add_hlines = TRUE,
  add_caption = TRUE,
  show_x_scale = FALSE,
  line_settings = list(),
  smoothe_settings = list(),
  point_settings = list(),
  hline_settings = list(),
  facet_settings = list(),
  ylim = c(0, 1)
)
data
Example for binary classification:
Example for multiclass classification:
You can have multiple rows per observation ID per group. If, for instance, we have run repeated cross-validation of 3 classifiers, we would have one predicted probability per fold column per classifier.
As created with the various validation functions in
target_col
Name of column with target levels.
probability_cols
Names of columns with predicted probabilities.
For binary classification, this should be one column with the probability of the second class (alphabetically).
For multiclass classification, this should be one column per class.
These probabilities must sum to
predicted_class_col
Name of column with predicted classes. This is required when
obs_id_col
Name of column with observation identifiers for grouping the x-axis.
Use case: when you have multiple predicted probabilities per observation by a classifier (e.g. from repeated cross-validation).
Can also be a grouping variable that you wish to aggregate.
group_col
Name of column with groups. The plot elements are split by these groups and can be identified by their color.
E.g. the classifier responsible for the prediction.
N.B. With more than
probability_of
Whether to plot the probabilities of the target classes ("target") or the predicted classes ("prediction").
For each row, we extract the probability of either the target class or the predicted class. Both are useful to plot, as they show the behavior of the classifier in a way a confusion matrix doesn't. One classifier might be very certain in its predictions (whether wrong or right), whereas another might be less certain.
positive
TODO
order
How to order the probabilities. (Character) One of:
theme_fn
The
color_scale
E.g. the output of
N.B. The number of colors in the object's palette should be at least the same as the number of groups in the
apply_facet
Whether to use
By default, faceting is applied when there is more than one probability column (multiclass).
smoothe
Whether to use
Settings can be passed via the
add_points
Add a point for each predicted probability.
These are grouped on the x-axis by the
add_hlines
Add horizontal lines. (Logical)
The meaning of these lines depends on the
add_caption
Whether to add a caption explaining the plot. This is dynamically generated and intended as a starting point. (Logical)
You can overwrite the text with
show_x_scale
TODO
line_settings
Named list of arguments for
The
Any argument not in the list will use its default value.
Default:
N.B. Ignored when
smoothe_settings
Named list of arguments for
The
Any argument not in the list will use its default value.
Default:
N.B. Only used when
point_settings
Named list of arguments for
The
Any argument not in the list will use its default value.
Default:
hline_settings
Named list of arguments for
The
Any argument not in the list will use its default value.
Default:
facet_settings
Named list of arguments for
The
Any argument not in the list will use its default value.
Commonly set arguments are
ylim
Limits for the y-scale.
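The `probability_of` and `probability_cols` descriptions above say that one probability is extracted per row: either that of the target class or that of the predicted class. The sketch below illustrates that idea in base R on made-up toy data. All column names and values are hypothetical, the 0.5 cutoff for the binary case is an assumption borrowed from the examples section, and the code is not the package's own implementation.

```r
# --- Multiclass: one probability column per class (hypothetical toy data) ---
multiclass <- data.frame(
  Target = c("A", "B", "C", "B"),
  A = c(0.7, 0.2, 0.3, 0.5),
  B = c(0.2, 0.5, 0.3, 0.4),
  C = c(0.1, 0.3, 0.4, 0.1),
  stringsAsFactors = FALSE
)
probability_cols <- c("A", "B", "C")
probs <- as.matrix(multiclass[, probability_cols])

# Predicted class: the column with the highest probability in each row
multiclass$Predicted <- probability_cols[max.col(probs, ties.method = "first")]

# Index the matrix with (row, column) pairs to pick one probability per row
rows <- seq_len(nrow(multiclass))
target_idx <- match(multiclass$Target, probability_cols)
predicted_idx <- match(multiclass$Predicted, probability_cols)
multiclass$prob_of_target <- probs[cbind(rows, target_idx)]
multiclass$prob_of_prediction <- probs[cbind(rows, predicted_idx)]

# --- Binary: a single column with the probability of the second class ---
binary <- data.frame(
  Target = c("A", "B", "B", "A"),
  Probability = c(0.2, 0.9, 0.4, 0.6),
  stringsAsFactors = FALSE
)
class_levels <- sort(unique(binary$Target))
first_class <- class_levels[1]
second_class <- class_levels[2]

# Assumed cutoff of 0.5 for turning probabilities into predicted classes
binary$Predicted <- ifelse(binary$Probability > 0.5, second_class, first_class)

# The probability of the first class is 1 minus that of the second class
binary$prob_of_target <- ifelse(
  binary$Target == second_class, binary$Probability, 1 - binary$Probability
)
binary$prob_of_prediction <- ifelse(
  binary$Predicted == second_class, binary$Probability, 1 - binary$Probability
)

print(multiclass)
print(binary)
```

With a 0.5 cutoff, the probability of the predicted class in the binary case never falls below 0.5, which may be why the last example in this page narrows `ylim` to `c(0.5, 1)`.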
TODO
A ggplot2 object with a faceted line plot. TODO
Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk
Other plotting functions: font(), plot_confusion_matrix(), plot_metric_density(), plot_probabilities_ecdf(), sum_tile_settings()
# Attach packages
library(cvms)
library(ggplot2)
library(dplyr)
#
# Multiclass
#

# Plot probabilities of target classes
# From repeated cross-validation of three classifiers
# plot_probabilities(
#   data = predicted.musicians,
#   target_col = "Target",
#   probability_cols = c("A", "B", "C", "D"),
#   predicted_class_col = "Predicted Class",
#   group_col = "Classifier",
#   obs_id_col = "ID",
#   probability_of = "target"
# )

# Plot probabilities of predicted classes
# From repeated cross-validation of three classifiers
# plot_probabilities(
#   data = predicted.musicians,
#   target_col = "Target",
#   probability_cols = c("A", "B", "C", "D"),
#   predicted_class_col = "Predicted Class",
#   group_col = "Classifier",
#   obs_id_col = "ID",
#   probability_of = "prediction"
# )

# Center probabilities
# plot_probabilities(
#   data = predicted.musicians,
#   target_col = "Target",
#   probability_cols = c("A", "B", "C", "D"),
#   predicted_class_col = "Predicted Class",
#   group_col = "Classifier",
#   obs_id_col = "ID",
#   probability_of = "prediction",
#   order = "centered"
# )
#
# Binary
#

# Filter the predicted.musicians dataset
# binom_data <- predicted.musicians %>%
#   dplyr::filter(
#     Target %in% c("A", "B")
#   ) %>%
#   # "B" is the second class alphabetically
#   dplyr::rename(Probability = B) %>%
#   dplyr::mutate(`Predicted Class` = ifelse(
#     Probability > 0.5, "B", "A")) %>%
#   dplyr::select(-dplyr::all_of(c("A", "C", "D")))

# Plot probabilities of target classes
# From repeated cross-validation of three classifiers
# plot_probabilities(
#   data = binom_data,
#   target_col = "Target",
#   probability_cols = "Probability",
#   predicted_class_col = "Predicted Class",
#   group_col = "Classifier",
#   obs_id_col = "ID",
#   probability_of = "target"
# )

# Plot probabilities of predicted classes
# plot_probabilities(
#   data = binom_data,
#   target_col = "Target",
#   probability_cols = "Probability",
#   predicted_class_col = "Predicted Class",
#   group_col = "Classifier",
#   obs_id_col = "ID",
#   probability_of = "prediction",
#   ylim = c(0.5, 1)
# )