View source: R/robust_distances.R
| robust_distances | R Documentation |
Computes a weighted, robust squared distance matrix for datasets
containing continuous, binary, and categorical variables. Continuous
variables are handled via a robust Mahalanobis distance, and binary
and categorical variables are transformed via similarity coefficients.
The output is suitable for Euclidean correction with make_euclidean.
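The two ingredients named above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper names `pairwise_mahalanobis_sq` and `matching_dissimilarity` are hypothetical, the classical covariance from `stats::cov` stands in for a robust estimate, and the simple matching coefficient is an assumed choice, since this page does not say which similarity coefficients are used or how the blocks are combined.

```r
# Hypothetical helper (not part of dbrobust): pairwise squared Mahalanobis
# distances for a continuous block X, given a covariance matrix S.
pairwise_mahalanobis_sq <- function(X, S) {
  X <- as.matrix(X)
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    # stats::mahalanobis() returns SQUARED distances of every row to `center`
    D[i, ] <- stats::mahalanobis(X, center = X[i, ], cov = S)
  }
  D
}

# Hypothetical helper: 1 minus the simple matching coefficient, i.e. the
# share of categorical variables on which two observations disagree.
matching_dissimilarity <- function(Z) {
  Z <- as.matrix(Z)
  n <- nrow(Z)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      D[i, j] <- 1 - mean(Z[i, ] == Z[j, ])
    }
  }
  D
}

set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)
S <- stats::cov(X)  # assumption: classical covariance as a stand-in
D_cont <- pairwise_mahalanobis_sq(X, S)

Z <- matrix(c("a", "b",
              "a", "a",
              "b", "b"), ncol = 2, byrow = TRUE)
D_cat <- matching_dissimilarity(Z)
```

Replacing `S` with a robust covariance estimate changes only the input to the first helper, which is the role the robust_cov argument appears to play.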
robust_distances(
  data = NULL,
  cont_vars = NULL,
  bin_vars = NULL,
  cat_vars = NULL,
  w = NULL,
  p = NULL,
  method = c("ggower", "relms"),
  robust_cov = NULL,
  alpha = 0.1,
  return_dist = FALSE
)
| data | Data frame or numeric matrix containing the observations. |
| cont_vars | Character vector of column names for continuous variables. |
| bin_vars | Character vector of column names for binary variables. |
| cat_vars | Character vector of column names for categorical variables. |
| w | Numeric vector of observation weights. If NULL, uniform weights are used. |
| p | Integer vector of length 3: |
| method | Character string: either "ggower" or "relms". |
| robust_cov | Optional. Precomputed robust covariance matrix for continuous variables. If NULL, it is estimated internally using the trimming proportion alpha. |
| alpha | Numeric trimming proportion for the robust covariance of continuous variables. Defaults to 0.1. |
| return_dist | Logical. If TRUE, returns an object of class dist. Defaults to FALSE. |
A numeric matrix of squared robust distances (n x n) or a dist object if return_dist = TRUE.
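As a hedged illustration of the two return shapes, the sketch below builds a small squared distance matrix by hand and converts it with base R's `as.dist`. It assumes the returned matrix behaves like an ordinary symmetric numeric matrix with a zero diagonal; the toy points and the variable names are made up for the example.

```r
# Three points on a line through the origin: (0,0), (3,4), (6,8).
pts <- matrix(c(0, 0,
                3, 4,
                6, 8), ncol = 2, byrow = TRUE)

# Full n x n matrix of SQUARED Euclidean distances (the matrix shape).
D_sq <- as.matrix(dist(pts))^2

# Compact lower-triangle representation (the "dist" shape).
d_sq <- as.dist(D_sq)

# Entries are squared distances, so take the square root of the matrix
# before handing it to methods that expect ordinary distances.
d <- as.dist(sqrt(D_sq))
```

Functions such as `hclust` accept a dist object directly, which is one reason to request that shape; whether squared or unsquared distances are appropriate depends on the downstream method.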
# Example: Robust Squared Distances for Mixed Data
# Load example data and subset
data("Data_HC_contamination", package = "dbrobust")
Data_small <- Data_HC_contamination[1:50, ]
# Define variable types
cont_vars <- c("V1", "V2", "V3", "V4") # continuous
cat_vars <- c("V5", "V6", "V7") # categorical
bin_vars <- c("V8", "V9") # binary
# Use column w_loop as weights
w <- Data_small$w_loop
# -------------------------------
# Method 1: Gower distances
# -------------------------------
dist_sq_ggower <- robust_distances(
  data = Data_small,
  cont_vars = cont_vars,
  bin_vars = bin_vars,
  cat_vars = cat_vars,
  w = w,
  alpha = 0.10,
  method = "ggower"
)
# Apply Euclidean correction if needed
res_ggower <- make_euclidean(dist_sq_ggower, w)
# Show first 5x5 block of original and corrected distances
cat("GGower original squared distances (5x5 block):\n")
print(round(dist_sq_ggower[1:5, 1:5], 4))
cat("\nGGower corrected squared distances (5x5 block):\n")
print(round(res_ggower$D_euc[1:5, 1:5], 4))
# -------------------------------
# Method 2: RelMS distances
# -------------------------------
dist_sq_relms <- robust_distances(
  data = Data_small,
  cont_vars = cont_vars,
  bin_vars = bin_vars,
  cat_vars = cat_vars,
  w = w,
  alpha = 0.10,
  method = "relms"
)
# Apply Euclidean correction if needed
res_relms <- make_euclidean(dist_sq_relms, w)
# Show first 5x5 block of original and corrected distances
cat("RelMS original squared distances (5x5 block):\n")
print(round(dist_sq_relms[1:5, 1:5], 4))
cat("\nRelMS corrected squared distances (5x5 block):\n")
print(round(res_relms$D_euc[1:5, 1:5], 4))
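The comments above say the Euclidean correction is applied "if needed". One standard way to check is sketched below in base R: double-center the squared distance matrix and inspect its eigenvalues, as in classical multidimensional scaling. The helper `is_euclidean_sq` is hypothetical and uses unweighted centering; it is an assumption that make_euclidean relies on a comparable, possibly weighted, criterion.

```r
# Hypothetical helper: TRUE if a squared distance matrix can be embedded in
# Euclidean space, i.e. the double-centered matrix B = -1/2 * J D J has no
# meaningfully negative eigenvalue.
is_euclidean_sq <- function(D_sq, tol = 1e-8) {
  D_sq <- unname(as.matrix(D_sq))
  n <- nrow(D_sq)
  J <- diag(n) - matrix(1 / n, n, n)   # unweighted centering matrix
  B <- -0.5 * J %*% D_sq %*% J
  B <- (B + t(B)) / 2                  # remove floating-point asymmetry
  ev <- eigen(B, symmetric = TRUE, only.values = TRUE)$values
  min(ev) >= -tol * max(abs(ev))
}

# Squared Euclidean distances between random points pass the check.
set.seed(42)
pts <- matrix(rnorm(12), nrow = 6, ncol = 2)
D_ok <- as.matrix(dist(pts))^2

# Distances 1, 1, 3 violate the triangle inequality, so their squares
# (1, 1, 9) cannot come from any Euclidean configuration.
D_bad <- matrix(c(0, 1, 1,
                  1, 0, 9,
                  1, 9, 0), nrow = 3, byrow = TRUE)

ok  <- is_euclidean_sq(D_ok)
bad <- is_euclidean_sq(D_bad)
```

Under the stated assumption, a call such as `is_euclidean_sq(dist_sq_ggower)` would indicate whether a correction is likely to change anything.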