calculate_tv_distance_empirical: Calculate Total Variation (TV) Distance Empirically

Description

This function calculates the Total Variation (TV) distance between the empirical cumulative distribution functions (ECDFs) of two datasets, the original data and the generated data. Here the TV distance is computed as half the sum of the absolute differences between the two ECDFs, evaluated at each point in the domain.
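The computation described above can be sketched in a few lines of base R. This is a minimal illustration, not the covalchemy source: the function name tv_distance_ecdf_sketch is hypothetical, and it assumes that "each point in the domain" means the pooled set of distinct observed values from both samples.

```r
# Hypothetical sketch of the computation described above; not the package's code.
tv_distance_ecdf_sketch <- function(original_data, generated_data) {
  stopifnot(is.numeric(original_data), is.numeric(generated_data))

  # Empirical CDFs as step functions (stats::ecdf ships with base R)
  ecdf_original <- stats::ecdf(original_data)
  ecdf_generated <- stats::ecdf(generated_data)

  # Assumption: evaluate both ECDFs on the pooled distinct observations
  grid <- sort(unique(c(original_data, generated_data)))

  # Half the sum of the absolute ECDF differences over the grid
  sum(abs(ecdf_original(grid) - ecdf_generated(grid))) / 2
}

set.seed(42)
x <- rnorm(1000)
same <- tv_distance_ecdf_sketch(x, rnorm(1000))                       # same distribution
shifted <- tv_distance_ecdf_sketch(x, rnorm(1000, mean = 5, sd = 2))  # different distribution
```

Under this reading the result is an unnormalized sum over grid points, so it grows with the number of evaluation points and is not confined to [0, 1]. It is therefore most useful for comparing samples of the same size.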

Usage

calculate_tv_distance_empirical(original_data, generated_data)

Arguments

original_data

A numeric vector of the original data.

generated_data

A numeric vector of the generated data.

Value

A single non-negative numeric value representing the Total Variation distance between the empirical CDFs of the original and generated data.

Examples

library(covalchemy)
set.seed(123)  # Fix the random seed so the results are reproducible

# Test Case 1: Data from the same distribution
original_data <- rnorm(1000, mean = 0, sd = 1)  # Normal distribution (mean = 0, sd = 1)
generated_data <- rnorm(1000, mean = 0, sd = 1)  # Independent sample from the same distribution
tv_distance <- calculate_tv_distance_empirical(original_data, generated_data)
print(tv_distance)  # Expected to be small, as both samples come from the same distribution

# Test Case 2: Data from different distributions
original_data <- rnorm(1000, mean = 0, sd = 1)  # Normal distribution (mean = 0, sd = 1)
generated_data <- rnorm(1000, mean = 5, sd = 2)  # Normal distribution with shifted mean and larger spread
tv_distance <- calculate_tv_distance_empirical(original_data, generated_data)
print(tv_distance)  # Expected to be larger than in Test Case 1, as the distributions differ


covalchemy documentation built on April 12, 2025, 2:15 a.m.