mean_squared_error: Accuracy Measures for Ordered Probability Predictions

View source: R/evaluation-metrics.R

mean_squared_error    R Documentation

Accuracy Measures for Ordered Probability Predictions

Description

Accuracy measures for evaluating ordered probability predictions.

Usage

mean_squared_error(y, predictions, use.true = FALSE)

mean_absolute_error(y, predictions, use.true = FALSE)

mean_ranked_score(y, predictions, use.true = FALSE)

classification_error(y, predictions)

Arguments

y

Either the observed outcome vector or a matrix of true probabilities.

predictions

Predictions. A matrix of predicted class probabilities for mean_squared_error, mean_absolute_error, and mean_ranked_score, or a vector of predicted class labels for classification_error (see Details).

use.true

If TRUE, y is treated as a matrix of true class probabilities (useful for simulation studies).

Details

MSE, MAE, and RPS

When calling one of mean_squared_error, mean_absolute_error, or mean_ranked_score, predictions must be a matrix of predicted class probabilities, with as many rows as observations in y and as many columns as classes of y.

If use.true == FALSE, the mean squared error (MSE), the mean absolute error (MAE), and the mean ranked probability score (RPS) are computed as follows:

MSE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M (1 (Y_i = m) - \hat{p}_m (x))^2

MAE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M |1 (Y_i = m) - \hat{p}_m (x)|

RPS = \frac{1}{n} \sum_{i = 1}^n \frac{1}{M - 1} \sum_{m = 1}^M (1 (Y_i \leq m) - \hat{p}_m^* (x))^2
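
For illustration, these formulas can be reproduced in a few lines of base R. This is a minimal sketch with simulated inputs; the object names are illustrative and this is not the package's internal implementation:

set.seed(1)
n <- 5; M <- 3
y_obs <- sample(1:M, n, replace = TRUE)           # observed class labels
p_hat <- matrix(runif(n * M), n, M)
p_hat <- p_hat / rowSums(p_hat)                   # predicted class probabilities, n x M

ind_eq  <- outer(y_obs, 1:M, "==") * 1            # 1(Y_i = m)
ind_leq <- outer(y_obs, 1:M, "<=") * 1            # 1(Y_i <= m)
p_hat_cum <- t(apply(p_hat, 1, cumsum))           # hat{p}_m^*(x), cumulative predicted probabilities

mse <- mean(rowSums((ind_eq - p_hat)^2))
mae <- mean(rowSums(abs(ind_eq - p_hat)))
rps <- mean(rowSums((ind_leq - p_hat_cum)^2) / (M - 1))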

If use.true == TRUE, the MSE, the MAE, and the RPS are computed as follows (useful for simulation studies):

MSE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M (p_m (x) - \hat{p}_m (x))^2

MAE = \frac{1}{n} \sum_{i = 1}^n \sum_{m = 1}^M |p_m (x) - \hat{p}_m (x)|

RPS = \frac{1}{n} \sum_{i = 1}^n \frac{1}{M - 1} \sum_{m = 1}^M (p_m^* (x) - \hat{p}_m^* (x))^2

where:

p_m (x) = P(Y_i = m | X_i = x)

p_m^* (x) = P(Y_i \leq m | X_i = x)
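
A corresponding base-R sketch for the use.true == TRUE case, where true and predicted probabilities are both n x M matrices (simulated here; names are illustrative, not part of the package):

set.seed(2)
n <- 5; M <- 3
norm_rows <- function(A) A / rowSums(A)
p_true <- norm_rows(matrix(runif(n * M), n, M))   # p_m(x), true class probabilities
p_hat  <- norm_rows(matrix(runif(n * M), n, M))   # hat{p}_m(x), predicted class probabilities
cum_rows <- function(A) t(apply(A, 1, cumsum))    # cumulative probabilities, p_m^*(x)

mse <- mean(rowSums((p_true - p_hat)^2))
mae <- mean(rowSums(abs(p_true - p_hat)))
rps <- mean(rowSums((cum_rows(p_true) - cum_rows(p_hat))^2) / (M - 1))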

Classification error

When calling classification_error, predictions must be a vector of predicted class labels.

Classification error (CE) is computed as follows:

CE = \frac{1}{n} \sum_{i = 1}^n 1 (Y_i \neq \hat{Y}_i)

where Y_i are the observed class labels and \hat{Y}_i the predicted class labels.
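
For example (a minimal sketch with made-up label vectors):

y_obs  <- c(1, 2, 3, 2, 1)                        # observed class labels
y_pred <- c(1, 3, 3, 2, 2)                        # predicted class labels
mean(y_obs != y_pred)                             # CE = 2/5 = 0.4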

Value

The MSE, the MAE, the RPS, or the CE of the method.

Author(s)

Riccardo Di Francesco

See Also

mean_ranked_score

Examples

## Generate synthetic data.
set.seed(1986)

data <- generate_ordered_data(100)
sample <- data$sample
Y <- sample$Y
X <- sample[, -1]

## Training-test split.
train_idx <- sample(seq_len(length(Y)), floor(length(Y) * 0.5))

Y_tr <- Y[train_idx]
X_tr <- X[train_idx, ]

Y_test <- Y[-train_idx]
X_test <- X[-train_idx, ]

## Fit ocf on training sample.
forests <- ocf(Y_tr, X_tr)

## Accuracy measures on test sample.
predictions <- predict(forests, X_test)

mean_squared_error(Y_test, predictions$probabilities)
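mean_absolute_error(Y_test, predictions$probabilities)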
mean_ranked_score(Y_test, predictions$probabilities)
classification_error(Y_test, predictions$classification)
