error_calculator_comparison: Error calculation and validation metrics for topolow...

View source: R/error_metrics.R

error_calculator_comparison    R Documentation

Error calculation and validation metrics for topolow: Calculate Comprehensive Error Metrics

Description

Computes a comprehensive set of error metrics (in-sample, out-of-sample, completeness) between predicted and true dissimilarities for model evaluation.

Usage

error_calculator_comparison(
  predicted_dissimilarities,
  true_dissimilarities,
  input_dissimilarities = NULL
)

Arguments

predicted_dissimilarities

Matrix of predicted dissimilarities from the model.

true_dissimilarities

Matrix of true, ground-truth dissimilarities.

input_dissimilarities

Matrix of input dissimilarities, which may contain NAs; the pattern of NAs identifies the held-out entries used for out-of-sample error calculation. Optional: if not provided, defaults to true_dissimilarities (no holdout set).

Details

Input requirements and constraints:

  • All input matrices must have matching dimensions.

  • Row and column names must be consistent across matrices.

  • NAs are allowed; NAs in input_dissimilarities mark the entries held out from training.

  • Threshold indicators (< or >) in the input matrix are processed correctly.

When input_dissimilarities is provided, it represents the training data where some values have been set to NA to create a holdout set. This allows calculation of:

  • In-sample errors: for data available during training

  • Out-of-sample errors: for data held out during training

When input_dissimilarities is NULL (default), all errors are treated as in-sample since no data was held out.
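
The in-sample/out-of-sample split described above can be sketched in base R. This is only an illustration, assuming a pair counts as out-of-sample when it is NA in the input matrix but known in the true matrix. The names holdout_mask, in_sample_mask, and err are hypothetical, and the code is not necessarily how topolow computes these errors internally.

```r
# Illustrative sketch only; not topolow's internal implementation.
true_mat <- matrix(c(0, 1, 2,
                     1, 0, 3,
                     2, 3, 0), 3, 3)
pred_mat <- true_mat + 0.1   # deterministic stand-in for model predictions

# Training input: hide one symmetric pair to create a holdout set
input_mat <- true_mat
input_mat[1, 3] <- input_mat[3, 1] <- NA

# Assumption: held out = missing in the input but known in the ground truth
holdout_mask   <- is.na(input_mat) & !is.na(true_mat)
in_sample_mask <- !is.na(input_mat)

# Signed prediction error, split by mask (NA where the mask is FALSE)
err <- pred_mat - true_mat
in_sample_error  <- ifelse(in_sample_mask, err, NA)
out_sample_error <- ifelse(holdout_mask, err, NA)
```

With input_dissimilarities left as NULL, the holdout mask in this sketch would be empty, so every error would fall into the in-sample group.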

Value

A list containing:

report_df

A data.frame with detailed error metrics for each point-pair, including InSampleError, OutSampleError, and their percentage-based counterparts.

Completeness

A single numeric value representing the completeness statistic, which is the fraction of validation points for which a prediction could be made.
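
As a rough illustration of the completeness statistic, the fraction can be computed as below. The sketch assumes that "validation points" are held-out pairs with a known true value, and that a prediction "could be made" when the predicted entry is not NA. Both readings are assumptions, and validation_mask and completeness are hypothetical names rather than topolow internals.

```r
# Illustrative sketch only; the exact definition used by topolow may differ.
true_mat <- matrix(c(0, 1, 2,
                     1, 0, 3,
                     2, 3, 0), 3, 3)

# Hold out two symmetric pairs from the training input
input_mat <- true_mat
input_mat[1, 3] <- input_mat[3, 1] <- NA
input_mat[2, 3] <- input_mat[3, 2] <- NA

# Pretend the model could not produce a prediction for one held-out pair
pred_mat <- true_mat
pred_mat[2, 3] <- pred_mat[3, 2] <- NA

# Assumption: validation points = held-out entries with a known true value
validation_mask <- is.na(input_mat) & !is.na(true_mat)

# Fraction of validation points that received a (non-NA) prediction
completeness <- sum(validation_mask & !is.na(pred_mat)) / sum(validation_mask)
```

Here two of the four held-out entries have predictions, so completeness is 0.5 in this sketch.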

Examples

# Example 1: Normal evaluation (no cross-validation)
set.seed(42)  # Make the random noise reproducible
true_mat <- matrix(c(0, 1, 2, 1, 0, 3, 2, 3, 0), 3, 3)
pred_mat <- true_mat + rnorm(9, 0, 0.1)  # Add some noise

# Evaluate all predictions (input_dissimilarities defaults to true_dissimilarities)
errors1 <- error_calculator_comparison(pred_mat, true_mat)

# Example 2: Cross-validation evaluation
input_mat <- true_mat
input_mat[1, 3] <- input_mat[3, 1] <- NA  # Create holdout set

# Evaluate with train/test split
errors2 <- error_calculator_comparison(pred_mat, true_mat, input_mat)


topolow documentation built on Aug. 31, 2025, 1:07 a.m.