make_euclidean: Force a Pairwise Squared Distance Matrix to Euclidean Form


make_euclidean    R Documentation

Force a Pairwise Squared Distance Matrix to Euclidean Form

Description

Given a pairwise squared distance matrix D (where D[i,j] = d(i,j)^2), this function checks whether D is a valid squared Euclidean distance matrix and, if it is not, returns a corrected version that is. The check and the correction are based on the weighted Gram matrix G_w = -\frac{1}{2} J_w D J_w^\top, where J_w = I_n - \mathbf{1} w^\top is the centering matrix defined by the weight vector w (normalized to sum to 1).
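The Gram matrix in the description can be computed directly in base R. The following is a minimal sketch, not the package's own code: the helper name weighted_gram and the toy data are assumptions made for illustration only.

```r
# Hypothetical helper (not part of dbrobust): weighted Gram matrix of D
weighted_gram <- function(D, w) {
  n <- nrow(D)
  w <- w / sum(w)                            # normalize weights to sum to 1
  J <- diag(n) - matrix(1, n, 1) %*% t(w)    # J_w = I_n - 1 w^T
  -0.5 * J %*% D %*% t(J)                    # G_w = -1/2 J_w D J_w^T
}

# Toy data: three points on a line, so D is exactly Euclidean
x <- c(0, 1, 3)
D <- as.matrix(dist(x))^2
G <- weighted_gram(D, rep(1, 3))

# All eigenvalues should be non-negative up to rounding error
gram_eigvals <- eigen(G, symmetric = TRUE, only.values = TRUE)$values
gram_eigvals
```

For a genuinely Euclidean configuration with uniform weights, G_w equals the matrix of inner products of the mean-centered coordinates, which is why none of its eigenvalues is negative.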

Usage

make_euclidean(D, w, tol = 1e-10)

Arguments

D

Numeric square matrix (n x n) of pairwise squared distances. Must be symmetric with zeros on the diagonal.

w

Numeric vector of weights (length n). Internally normalized to sum to 1.

tol

Numeric tolerance for detecting negative eigenvalues (default: 1e-10).
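The documented input requirements can be checked before calling the function. The sketch below is illustrative only: check_inputs is a hypothetical helper, and the requirement that weights be non-negative with a positive sum is an assumption not stated in this page.

```r
# Hypothetical pre-checks mirroring the documented argument requirements
check_inputs <- function(D, w) {
  stopifnot(
    is.matrix(D), is.numeric(D), nrow(D) == ncol(D),  # square numeric matrix
    isSymmetric(unname(D)),                           # symmetric
    all(diag(D) == 0),                                # zeros on the diagonal
    is.numeric(w), length(w) == nrow(D),              # one weight per object
    all(w >= 0), sum(w) > 0                           # assumption: usable weights
  )
  w / sum(w)   # return the weights normalized to sum to 1
}

D_toy <- as.matrix(dist(c(0, 1, 3)))^2
w_norm <- check_inputs(D_toy, c(2, 1, 1))
w_norm
```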

Details

If the smallest eigenvalue \lambda_{\min} of G_w is below the negative tolerance -tol, the function corrects D by adding a constant shift to its off-diagonal entries so that the Gram matrix becomes positive semi-definite, following the approach of Lingoes (1971) and Mardia (1978):

D_{\text{new}} = D + 2 c \mathbf{1} \mathbf{1}^\top - 2 c I_n,

where c = |\lambda_{\min}|.
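The shift can be illustrated with a small re-implementation in base R. This is a sketch under stated assumptions, not the dbrobust source: shift_correction and its return fields are hypothetical names, and the toy input uses uniform weights.

```r
# Hypothetical re-implementation of the documented shift (not dbrobust source)
shift_correction <- function(D, w, tol = 1e-10) {
  n <- nrow(D)
  w <- w / sum(w)
  J <- diag(n) - matrix(1, n, 1) %*% t(w)      # J_w = I_n - 1 w^T
  gram_eig <- function(M) {
    G <- -0.5 * J %*% M %*% t(J)               # G_w = -1/2 J_w M J_w^T
    eigen((G + t(G)) / 2, symmetric = TRUE, only.values = TRUE)$values
  }
  lambda_min <- min(gram_eig(D))
  if (lambda_min >= -tol) {
    return(list(D = D, shifted = FALSE))       # already Euclidean within tol
  }
  c_shift <- abs(lambda_min)                   # c = |lambda_min|
  # D_new = D + 2c 1 1^T - 2c I_n: adds 2c off the diagonal, keeps zeros on it
  D_new <- D + 2 * c_shift * (matrix(1, n, n) - diag(n))
  list(D = D_new, shifted = TRUE, c = c_shift,
       min_before = lambda_min, min_after = min(gram_eig(D_new)))
}

# Toy non-Euclidean input: distances 1, 1, 3 violate the triangle inequality
d <- matrix(c(0, 1, 3,
              1, 0, 1,
              3, 1, 0), nrow = 3, byrow = TRUE)
out <- shift_correction(d^2, rep(1, 3))
out$shifted
c(before = out$min_before, after = out$min_after)
```

Because J_w \mathbf{1} = 0 once w sums to 1, the shift changes the Gram matrix by c J_w J_w^\top. With uniform weights J_w is a projection, so every eigenvalue outside the direction of \mathbf{1} is raised by exactly c.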

Value

A list with components:

D_euc

Corrected pairwise squared Euclidean distance matrix (n x n).

eigvals_before

Eigenvalues of the weighted Gram matrix before correction.

eigvals_after

Eigenvalues of the weighted Gram matrix after correction.

transformed

Logical, TRUE if correction was applied, FALSE otherwise.

References

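Lingoes, J. C. (1971). Some boundary conditions for a monotone analysis of symmetric matrices. Psychometrika, 36(2), 195-203.

Mardia, K. V. (1978). Some properties of classical multi-dimensional scaling. Communications in Statistics - Theory and Methods, 7(13), 1233-1241.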

See Also

dist, eigen, cmdscale

Examples

# Load the package and the example dataset
library(dbrobust)
data("Data_HC_contamination")

# Reduce dataset to first 50 rows
Data_small <- Data_HC_contamination[1:50, ]

# Select only continuous variables
cont_vars <- names(Data_small)[1:4]
Data_cont <- Data_small[, cont_vars]

# Compute squared Euclidean distance matrix
dist_mat <- as.matrix(dist(Data_cont))^2

# Introduce a small non-Euclidean distortion
dist_mat[1, 2] <- dist_mat[1, 2] * 0.5
dist_mat[2, 1] <- dist_mat[1, 2]

# Uniform weights
weights <- rep(1, nrow(Data_cont))

# Apply Euclidean correction
res <- make_euclidean(dist_mat, weights)

# Check results (minimum eigenvalues before/after)
res$transformed
min(res$eigvals_before)
min(res$eigvals_after)

# First 5x5 block of corrected matrix
round(res$D_euc[1:5, 1:5], 4)


dbrobust documentation built on Nov. 5, 2025, 6:24 p.m.