Description Usage Arguments Details Value Author(s) Examples

Evaluate your model's predictions on a set of evaluation metrics.

Create ID-aggregated evaluations by multiple methods.

Currently supports regression and classification
(binary and multiclass). See `type`.


`data`
Data frame with predictions, targets and (optionally) an ID column.
Can be grouped with `dplyr::group_by()`.

**Multinomial**: When `type` is `"multinomial"`, the predictions should be passed as one column per class, containing the predicted probabilities of that class. The columns should be named as the classes in the target column.

**Binomial**: When `type` is `"binomial"`, the predictions should be passed as one column with the predicted probabilities.

**Gaussian**: When `type` is `"gaussian"`, the predictions should be passed as one column with the predicted values.
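
As an illustration of these expected layouts, a minimal sketch with hypothetical column names (the multinomial probability columns must be named after the classes):

```
# Binomial: one column with predicted probabilities
binomial_data <- data.frame(
  diagnosis = c("a", "b", "a"),   # target column
  prediction = c(0.2, 0.8, 0.4)   # predicted probability
)

# Multinomial: one probability column per class,
# named as the classes appear in the target column
multinomial_data <- data.frame(
  target  = c("class_1", "class_2", "class_3"),
  class_1 = c(0.7, 0.2, 0.1),
  class_2 = c(0.2, 0.6, 0.2),
  class_3 = c(0.1, 0.2, 0.7)
)
```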

`target_col`
Name of the column with the true classes/values in `data`.

`prediction_cols`
Name(s) of column(s) with the predictions. When evaluating a classification task, the column(s) should contain the predicted probabilities.

`type`
Type of evaluation to perform: `"gaussian"`, `"binomial"`, or `"multinomial"`.

`id_col`
Name of ID column to aggregate predictions by.

N.B. Current methods assume that the target class/value is constant within the IDs.

N.B. When aggregating by ID, some metrics (such as those from model objects) are excluded.

`id_method`
Method to use when aggregating predictions by ID: either `"mean"` or `"majority"`.

**mean**: The average prediction (value or probability) is calculated per ID and evaluated. This method assumes that the target class/value is constant within the IDs.

**majority**: The most predicted class per ID is found and evaluated. In case of a tie, the winning classes share the probability.
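
A rough sketch of what the `"mean"` method does conceptually (not the actual implementation): the predictions are averaged within each ID before evaluation.

```
library(dplyr)

# Hypothetical per-observation predictions with an ID column
preds <- tibble::tibble(
  id = c(1, 1, 2, 2),
  target = c("a", "a", "b", "b"),
  probability = c(0.3, 0.5, 0.8, 0.6)
)

# "mean": average the probability within each ID, then evaluate one row per ID
preds %>%
  group_by(id, target) %>%
  summarise(probability = mean(probability), .groups = "drop")
```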

`models`
Unnamed list of fitted model(s) for calculating R^2 metrics and information criterion metrics. May only work for some types of models. When only passing one model, remember to pass it in a list (e.g. `list(model)`).

N.B. When aggregating by ID (i.e. when `id_col` is specified), the model-based metrics are excluded.

`apply_softmax`
Whether to apply the softmax function to the prediction columns when `type` is `"multinomial"`.
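
The softmax rescales each row of prediction values to positive numbers summing to 1. A minimal sketch of the function itself:

```
# Softmax for a single row of prediction values
softmax <- function(x) exp(x) / sum(exp(x))

softmax(c(1, 2, 3))
#> 0.090 0.245 0.665 (approximately)
```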

`cutoff`
Threshold for predicted classes. (Numeric)

N.B. Binomial models only.
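
A minimal sketch of how a cutoff turns predicted probabilities into predicted classes (class labels are hypothetical):

```
probabilities <- c(0.1, 0.4, 0.6, 0.9)
cutoff <- 0.5

# Probabilities above the cutoff are assigned to the positive class
predicted_classes <- ifelse(probabilities > cutoff, "1", "0")
predicted_classes
#> "0" "0" "1" "1"
```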

`positive`
Level from dependent variable to predict. Either as character or level index (`1` or `2` - alphabetically). E.g. if we have the levels `"cat"` and `"dog"` and we want `"dog"` to be the positive class, we can provide either `"dog"` or `2`, as `"dog"` comes after `"cat"` alphabetically.

Used when calculating confusion matrix metrics and creating ROC curves.

N.B. Only affects the evaluation metrics.
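
A usage sketch with the binomial example data from the Examples section below, where `diagnosis` has the levels `0` and `1`:

```
# Specify the positive class by name
evaluate(data, target_col = "diagnosis",
         prediction_cols = "binomial_predictions",
         positive = "1", type = "binomial")

# Or by its alphabetical level index (2 -> "1")
evaluate(data, target_col = "diagnosis",
         prediction_cols = "binomial_predictions",
         positive = 2, type = "binomial")
```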

`metrics`
List for enabling/disabling metrics. E.g. `list("Accuracy" = TRUE)` would add the regular accuracy metric to the classification results, while `list("RMSE" = FALSE)` would remove `RMSE` from the regression results. Also accepts the string `"all"`.

N.B. Currently, disabled metrics are still computed.
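
A usage sketch, enabling the regular accuracy metric for the binomial example from the Examples section below:

```
evaluate(data, target_col = "diagnosis",
         prediction_cols = "binomial_predictions",
         type = "binomial",
         metrics = list("Accuracy" = TRUE))
```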

`include_predictions`
Whether to include the predictions in the output as a nested tibble. (Logical)

`parallel`
Whether to run evaluations in parallel, when `data` is grouped with `dplyr::group_by()`.

Packages used:

**Gaussian**:

r2m : `MuMIn::r.squaredGLMM`

r2c : `MuMIn::r.squaredGLMM`

AIC : `stats::AIC`

AICc : `MuMIn::AICc`

BIC : `stats::BIC`

**Binomial** and **Multinomial**:

Confusion matrix and related metrics:
`caret::confusionMatrix`

ROC and related metrics: `pROC::roc`

MCC: `mltools::mcc`
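
To illustrate how these underlying functions relate to the reported metrics, here is a rough, simplified sketch of calling a few of them directly (the actual evaluation additionally handles factor levels, the `positive` class, etc.):

```
# Hypothetical binary targets and predictions
targets <- c(0, 1, 1, 0, 1)
predicted_classes <- c(0, 1, 0, 0, 1)
predicted_probabilities <- c(0.2, 0.8, 0.3, 0.1, 0.9)

# Confusion matrix and related metrics (Sensitivity, Specificity, Kappa, ...)
caret::confusionMatrix(
  data = factor(predicted_classes, levels = c(0, 1)),
  reference = factor(targets, levels = c(0, 1)),
  positive = "1"
)

# Matthews correlation coefficient
mltools::mcc(preds = predicted_classes, actuals = targets)

# ROC curve and AUC
pROC::roc(response = targets, predictor = predicted_probabilities)
```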

—————————————————————-

Gaussian Results

—————————————————————-

Tibble containing the following metrics by default:

Average **RMSE**, **MAE**, **r2m**,
**r2c**, **AIC**, **AICc**, and **BIC**.

N.B. Some of the metrics will only be returned if model objects were passed, and will be `NA` if they could not be extracted from the passed model objects.

Also includes:

A nested tibble with the **Predictions** and targets.

A nested tibble with the model **Coefficients**. The coefficients are extracted from the model object with `broom::tidy()` or `coef()` (with some restrictions on the output). If these attempts fail, a default coefficients tibble filled with `NA`s is returned.
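
A sketch of digging into these nested tibbles, using the Gaussian example from the Examples section below and assuming the output columns are named `Predictions` and `Coefficients` as described above:

```
gaussian_results <- evaluate(
  data = data, target_col = "age",
  prediction_cols = "gaussian_predictions",
  models = list(gaussian_model),
  type = "gaussian"
)

# Nested predictions and targets
gaussian_results$Predictions[[1]]

# Nested model coefficients
gaussian_results$Coefficients[[1]]
```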

—————————————————————-

Binomial Results

—————————————————————-

Tibble with the following evaluation metrics, based on a confusion matrix and a ROC curve fitted to the predictions:

ROC:

**AUC**, **Lower CI**, and **Upper CI**

Confusion Matrix:

**Balanced Accuracy**,
**F1**,
**Sensitivity**,
**Specificity**,
**Positive Prediction Value**,
**Negative Prediction Value**,
**Kappa**,
**Detection Rate**,
**Detection Prevalence**,
**Prevalence**, and
**MCC** (Matthews correlation coefficient).

Other available metrics (disabled by default, see `metrics`):
**Accuracy**.

Also includes:

A nested tibble with the **predictions** and targets.

A nested tibble with the sensitivities and specificities from the **ROC** curve.

A nested tibble with the **confusion matrix**.
The `Pos_` columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class, i.e. the level you wish to predict.

—————————————————————-

Multinomial Results

—————————————————————-

For each class, a *one-vs-all* binomial evaluation is performed. This creates
a **Class Level Results** tibble containing the same metrics as the binomial results
described above, along with the **Support** metric, which is simply a
count of the class in the target column. These metrics are used to calculate the macro metrics
in the output tibble. The nested class level results tibble is also included in the output tibble,
and would usually be reported along with the macro and overall metrics.

The output tibble contains the macro and overall metrics.
The metrics that share their name with the metrics in the nested
class level results tibble are averages of those metrics
(note: does not remove `NA`s before averaging).
In addition to these, it also includes the **Overall Accuracy** metric.

Other available metrics (disabled by default, see `metrics`):
**Accuracy**, **Weighted Balanced Accuracy**, **Weighted Accuracy**,
**Weighted F1**, **Weighted Sensitivity**,
**Weighted Specificity**, **Weighted Pos Pred Value**,
**Weighted Neg Pred Value**, **Weighted AUC**, **Weighted Lower CI**,
**Weighted Upper CI**, **Weighted Kappa**, **Weighted MCC**,
**Weighted Detection Rate**, **Weighted Detection Prevalence**, and
**Weighted Prevalence**.

Note that the "Weighted" metrics are weighted averages, weighted by the `Support`.

Also includes:

A nested tibble with the **Predictions** and targets.

A nested tibble with the multiclass **Confusion Matrix**.

**Class Level Results**

Besides the binomial evaluation metrics and the `Support` metric, the nested class level results tibble also contains:

A nested tibble with the sensitivities and specificities from the **ROC** curve.

A nested tibble with the **Confusion Matrix** from the one-vs-all evaluation.
The `Pos_` columns tell you whether a row is a True Positive (TP), True Negative (TN), False Positive (FP), or False Negative (FN), depending on which level is the "positive" class. In our case, `1` is the current class and `0` represents all the other classes together.
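
A sketch of accessing the class level results and the multiclass confusion matrix, using the multinomial example from the Examples section below; the column names are assumptions based on the description above:

```
multinomial_results <- evaluate(
  data = data_mc, target_col = "target",
  prediction_cols = class_names,
  type = "multinomial"
)

# One-vs-all results per class (incl. Support)
multinomial_results[["Class Level Results"]][[1]]

# Multiclass confusion matrix
multinomial_results[["Confusion Matrix"]][[1]]
```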

Ludvig Renbo Olsen, [email protected]

```
# Attach packages
library(cvms)
library(dplyr)
# Load data
data <- participant.scores
# Fit models
gaussian_model <- lm(age ~ diagnosis, data = data)
binomial_model <- glm(diagnosis ~ score, data = data, family = "binomial")
# Add predictions
data[["gaussian_predictions"]] <- predict(gaussian_model, data,
type = "response",
allow.new.levels = TRUE)
data[["binomial_predictions"]] <- predict(binomial_model, data,
allow.new.levels = TRUE)
# Gaussian evaluation
evaluate(data = data, target_col = "age",
prediction_cols = "gaussian_predictions",
models = list(gaussian_model),
type = "gaussian")
# Binomial evaluation
evaluate(data = data, target_col = "diagnosis",
prediction_cols = "binomial_predictions",
type = "binomial")
# Multinomial
# Create a tibble with predicted probabilities
data_mc <- multiclass_probability_tibble(
num_classes = 3, num_observations = 30,
apply_softmax = TRUE, FUN = runif,
class_name = "class_")
# Add targets
class_names <- paste0("class_", c(1,2,3))
data_mc[["target"]] <- sample(x = class_names,
size = 30, replace = TRUE)
# Multinomial evaluation
evaluate(data = data_mc, target_col = "target",
prediction_cols = class_names,
type = "multinomial")
# ID evaluation
# Gaussian ID evaluation
# Note that 'age' is the same for all observations
# of a participant
evaluate(data = data, target_col = "age",
prediction_cols = "gaussian_predictions",
id_col = "participant",
type = "gaussian")
# Binomial ID evaluation
evaluate(data = data, target_col = "diagnosis",
prediction_cols = "binomial_predictions",
id_col = "participant",
id_method = "mean", # alternatively: "majority"
type = "binomial")
# Multinomial ID evaluation
# Add IDs and new targets (must be constant within IDs)
data_mc[["target"]] <- NULL
data_mc[["id"]] <- rep(1:6, each = 5)
id_classes <- tibble::tibble(
"id" = 1:6,
target = sample(x = class_names, size = 6, replace = TRUE)
)
data_mc <- data_mc %>%
dplyr::left_join(id_classes, by = "id")
# Perform ID evaluation
evaluate(data = data_mc, target_col = "target",
prediction_cols = class_names,
id_col = "id",
id_method = "mean", # alternatively: "majority"
type = "multinomial")
# Training and evaluating a multinomial model with nnet
# Create a data frame with some predictors and a target column
class_names <- paste0("class_", 1:4)
data_for_nnet <- multiclass_probability_tibble(
num_classes = 3, # Here, number of predictors
num_observations = 30,
apply_softmax = FALSE,
FUN = rnorm,
class_name = "predictor_") %>%
dplyr::mutate(class = sample(
class_names,
size = 30,
replace = TRUE))
# Train multinomial model using the nnet package
mn_model <- nnet::multinom(
"class ~ predictor_1 + predictor_2 + predictor_3",
data = data_for_nnet)
# Predict the targets in the dataset
# (we would usually use a test set instead)
predictions <- predict(mn_model, data_for_nnet,
type = "probs") %>%
dplyr::as_tibble()
# Add the targets
predictions[["target"]] <- data_for_nnet[["class"]]
# Evaluate predictions
evaluate(data = predictions, target_col = "target",
prediction_cols = class_names,
type = "multinomial")
```
