Euclidify | R Documentation |
A user-friendly wrapper function that automatically optimizes parameters and performs Euclidean embedding on a dissimilarity matrix. This function handles the entire workflow from parameter optimization to final embedding.
Euclidify(
dissimilarity_matrix,
output_dir,
ndim_range = c(2, 10),
k0_range = c(0.1, 20),
cooling_rate_range = c(1e-04, 0.1),
c_repulsion_range = c(1e-04, 1),
n_initial_samples = 50,
n_adaptive_samples = 150,
max_cores = NULL,
folds = 20,
mapping_max_iter = 500,
clean_intermediate = TRUE,
verbose = "standard",
fallback_to_defaults = FALSE,
save_results = FALSE
)
dissimilarity_matrix |
Square symmetric dissimilarity matrix. May contain NA for missing measurements, as well as threshold-coded entries prefixed with < or > (for example "<1" or ">12"). |
output_dir |
Character. Directory for saving optimization files and results. Required - no default. |
ndim_range |
Integer vector of length 2. Range for number of dimensions (minimum, maximum). Default: c(2, 10) |
k0_range |
Numeric vector of length 2. Range for initial spring constant (minimum, maximum). Default: c(0.1, 20) |
cooling_rate_range |
Numeric vector of length 2. Range for cooling rate (minimum, maximum). Default: c(1e-04, 0.1) |
c_repulsion_range |
Numeric vector of length 2. Range for repulsion constant (minimum, maximum). Default: c(1e-04, 1) |
n_initial_samples |
Integer. Number of samples for initial parameter optimization. Default: 50 |
n_adaptive_samples |
Integer. Number of samples for adaptive refinement. Default: 150 |
max_cores |
Integer. Maximum number of cores to use. Default: NULL (auto-detect) |
folds |
Integer. Number of cross-validation folds. Default: 20 |
mapping_max_iter |
Integer. Maximum iterations for final embedding. Half this value is used for parameter search. Default: 500 |
clean_intermediate |
Logical. Whether to remove intermediate files. Default: TRUE |
verbose |
Character. Verbosity level: "off" (no output), "standard" (progress updates), or "full" (detailed output including from internal functions). Default: "standard" |
fallback_to_defaults |
Logical. Whether to use default parameters if optimization fails. Default: FALSE |
save_results |
Logical. Whether to save the final positions as CSV. Default: FALSE |
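Because dissimilarity_matrix is documented as square and symmetric, it can be worth checking the input before starting a long optimization. The following is a minimal base-R sketch; is_square_symmetric is a hypothetical helper, not part of the package, and it compares entries exactly (including NA positions) rather than with a numeric tolerance.

```r
# Hypothetical helper, not part of the package: TRUE if `m` is a square
# matrix whose entries mirror exactly across the diagonal (NA included).
is_square_symmetric <- function(m) {
  if (!is.matrix(m) || nrow(m) != ncol(m)) {
    return(FALSE)
  }
  m <- unname(m)  # ignore row/column names when comparing
  identical(m, t(m))
}

ok <- matrix(c(0, 1, NA,
               1, 0, 2,
               NA, 2, 0), nrow = 3, byrow = TRUE)
bad <- ok
bad[1, 2] <- 5  # breaks symmetry: [1, 2] no longer equals [2, 1]

is_square_symmetric(ok)
is_square_symmetric(bad)
```

Exact comparison also works for character matrices carrying threshold strings, but it may be too strict for numeric matrices that differ only by floating-point rounding.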
A list containing:
positions |
Matrix of optimized coordinates |
est_distances |
Matrix of estimated distances |
mae |
Mean absolute error |
optimal_params |
List of optimal parameters found, including cross-validation MAE during optimization |
optimization_summary |
Summary of the optimization process |
data_characteristics |
Summary of input data characteristics |
runtime |
Total runtime in seconds |
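To make the relationship between positions, est_distances and mae concrete, here is a small base-R sketch on made-up data. It assumes est_distances holds pairwise Euclidean distances between rows of positions and that mae is the mean absolute difference from the observed values over non-missing pairs; the package may compute these differently, for example in how threshold entries are treated.

```r
# Made-up 2-D coordinates for three points (a stand-in for result$positions)
positions <- matrix(c(0, 0,
                      3, 0,
                      0, 4), ncol = 2, byrow = TRUE,
                    dimnames = list(c("A", "B", "C"), NULL))

# Made-up observed dissimilarities with one missing pair (A, C)
observed <- matrix(c(0, 3, NA,
                     3, 0, 6,
                     NA, 6, 0), nrow = 3, byrow = TRUE,
                   dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Pairwise Euclidean distances implied by the coordinates
est_distances <- as.matrix(dist(positions))

# Compare each pair once (upper triangle) and skip missing observations
upper <- upper.tri(observed)
mae <- mean(abs(est_distances[upper] - observed[upper]), na.rm = TRUE)
mae
```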
# Example 1: Basic usage with small matrix
test_data <- data.frame(
object = rep(paste0("Obj", 1:4), each = 4),
reference = rep(paste0("Ref", 1:4), 4),
score = sample(c(1, 2, 4, 8, 16, 32, 64, "<1", ">12"), 16, replace = TRUE)
)
dist_mat <- list_to_matrix(
data = test_data, # Pass the data frame, not file path
object_col = "object",
reference_col = "reference",
value_col = "score",
is_similarity = TRUE
)
## Not run:
# Note: output_dir is required for actual use
result <- Euclidify(
dissimilarity_matrix = dist_mat,
output_dir = tempdir() # Use temp directory for example
)
coordinates <- result$positions
## End(Not run)
# Example 2: Reduced sample counts for a faster, quieter run
## Not run:
result <- Euclidify(
dissimilarity_matrix = dist_mat,
output_dir = tempdir(),
n_initial_samples = 10,
n_adaptive_samples = 7,
verbose = "off"
)
## End(Not run)
# Example 3: Handling missing data
dist_mat_missing <- dist_mat
dist_mat_missing[1, 3] <- dist_mat_missing[3, 1] <- NA
## Not run:
result <- Euclidify(
dissimilarity_matrix = dist_mat_missing,
output_dir = tempdir(),
n_initial_samples = 10,
n_adaptive_samples = 7,
verbose = "off"
)
## End(Not run)
# Example 4: Using threshold indicators
dist_mat_threshold <- dist_mat
dist_mat_threshold[1, 2] <- ">2"
dist_mat_threshold[2, 1] <- ">2"
## Not run:
result <- Euclidify(
dissimilarity_matrix = dist_mat_threshold,
output_dir = tempdir(),
n_initial_samples = 10,
n_adaptive_samples = 7,
verbose = "off"
)
## End(Not run)
# Example 5: Parallel processing with custom cores
## Not run:
result <- Euclidify(
dissimilarity_matrix = dist_mat,
output_dir = tempdir(),
max_cores = 4,
n_adaptive_samples = 100,
save_results = TRUE # Save positions to CSV
)
## End(Not run)
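A note on Example 4: in base R, assigning a string such as ">2" into a numeric matrix coerces every entry to character. The sketch below demonstrates that behavior on a made-up 2 x 2 matrix; whether dist_mat is already a character matrix at that point depends on what list_to_matrix returns, which is not stated here.

```r
# Made-up numeric matrix
m <- matrix(c(0, 4,
              4, 0), nrow = 2, byrow = TRUE)
was_numeric <- is.numeric(m)

# Assigning a threshold string to both mirrored cells keeps the matrix
# symmetric, but converts the whole matrix to character
m[1, 2] <- m[2, 1] <- ">2"

is_now_character <- is.character(m)
m
```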