View source: R/adaptive_sampling.R
initial_parameter_optimization | R Documentation |
Performs parameter optimization using Latin Hypercube Sampling (LHS) combined with k-fold cross-validation. Parameters are sampled from the specified ranges using a maximin LHS design to ensure good coverage of the parameter space. Each parameter set is evaluated using k-fold cross-validation to assess prediction accuracy. To calculate one negative log-likelihood (NLL) per parameter set, the function uses a pooled-errors approach: it combines the validation errors from all folds into one set and then calculates a single NLL. This approach has two main advantages:
1. It treats all validation errors equally, respecting the underlying error distribution assumption.
2. It properly accounts for the total number of validation points.
Note: As of version 2.0.0, this function returns log-transformed parameters directly, eliminating the need to call log_transform_parameters() separately.
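The pooled-errors calculation described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper name pooled_nll is hypothetical, and the Laplace error model (with the scale set to the pooled MAE) is an assumption, since this page does not name the error distribution.

```r
# Hypothetical helper: pooled NLL under an ASSUMED Laplace error model.
pooled_nll <- function(fold_errors) {
  # Pool the validation errors from every fold into one vector
  errors <- unlist(fold_errors, use.names = FALSE)
  errors <- errors[is.finite(errors)]
  n <- length(errors)
  # The MAE is the maximum-likelihood estimate of the Laplace scale b
  mae <- mean(abs(errors))
  # Laplace NLL: n * log(2b) + sum(|e|) / b, evaluated at b = MAE
  nll <- n * (1 + log(2 * mae))
  list(n = n, mae = mae, nll = nll)
}

# Toy validation errors from three folds of unequal size
fold_errors <- list(c(0.5, -1.0), c(2.0), c(-0.5, 1.0, -1.0))
pooled <- pooled_nll(fold_errors)
print(pooled)
```

Because the errors are pooled before the NLL is computed, a fold with one validation point contributes one error rather than one full fold-level score, which is the "treats all validation errors equally" property mentioned above.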
initial_parameter_optimization(
  dissimilarity_matrix,
  mapping_max_iter = 1000,
  relative_epsilon,
  convergence_counter,
  scenario_name,
  N_min,
  N_max,
  k0_min,
  k0_max,
  c_repulsion_min,
  c_repulsion_max,
  cooling_rate_min,
  cooling_rate_max,
  num_samples = 20,
  max_cores = NULL,
  folds = 20,
  verbose = FALSE,
  write_files = FALSE,
  output_dir
)
dissimilarity_matrix |
Matrix. Input dissimilarity matrix. Must be square and symmetric. |
mapping_max_iter |
Integer. Maximum number of optimization iterations for each map. |
relative_epsilon |
Numeric. Convergence threshold for relative change in error. |
convergence_counter |
Integer. Number of iterations below threshold before declaring convergence. |
scenario_name |
Character. Name for output files and job identification. |
N_min , N_max |
Integer. Range for the number of dimensions parameter. |
k0_min , k0_max |
Numeric. Range for the initial spring constant parameter. |
c_repulsion_min , c_repulsion_max |
Numeric. Range for the repulsion constant parameter. |
cooling_rate_min , cooling_rate_max |
Numeric. Range for the cooling rate parameter. |
num_samples |
Integer. Number of LHS samples to generate. Default: 20. |
max_cores |
Integer. Maximum number of cores for parallel processing. Default: NULL (uses all but one). |
folds |
Integer. Number of cross-validation folds. Default: 20. |
verbose |
Logical. Whether to print progress messages. Default: FALSE. |
write_files |
Logical. Whether to save results to a CSV file. Default: FALSE. |
output_dir |
Character. Directory for output files. Required if |
Initial Parameter Optimization using Latin Hypercube Sampling
The function performs these steps:
Generates LHS samples in the parameter space (original scale for sampling).
Creates k-fold splits of the input data.
For each parameter set, it trains the model on each fold's training set and evaluates on the validation set, calculating a pooled MAE and NLL across all folds.
Computations are run locally in parallel.
NEW: Automatically log-transforms the final results for direct use with adaptive sampling.
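The first two steps above can be sketched in base R. This is a rough illustration under stated assumptions, not the function's internal code: random_lhs is a hypothetical helper that draws a plain random LHS (the function itself is documented as using a maximin design, which lhs::maximinLHS() provides), the ranges are copied from the example call on this page, and splitting on the observed upper-triangle entries of the matrix is an assumed way to form the folds.

```r
set.seed(42)

# Hypothetical helper: plain random LHS on the unit cube (NOT maximin).
# Each column places exactly one point in each of the n equal-width strata.
random_lhs <- function(n, k) {
  u <- matrix(runif(n * k), nrow = n)
  perm <- replicate(k, sample(n))
  (perm - u) / n
}

# Parameter ranges, taken from the example call on this page
ranges <- data.frame(
  param = c("N", "k0", "c_repulsion", "cooling_rate"),
  min   = c(2, 1, 0.001, 0.001),
  max   = c(5, 10, 0.05, 0.02)
)

num_samples <- 4
unit <- random_lhs(num_samples, nrow(ranges))

# Rescale each unit-cube column to its [min, max] range (original scale)
samples <- sweep(unit, 2, ranges$max - ranges$min, "*")
samples <- sweep(samples, 2, ranges$min, "+")
colnames(samples) <- ranges$param
samples <- as.data.frame(samples)
samples$N <- round(samples$N)  # the number of dimensions is an integer

# Assumed k-fold split: assign each observed pair (upper triangle) to a fold
folds <- 5
d <- as.matrix(dist(matrix(rnorm(30), nrow = 10)))  # toy 10 x 10 matrix
idx <- which(upper.tri(d) & !is.na(d))
fold_id <- sample(rep_len(seq_len(folds), length(idx)))

# Training matrix for fold 1: mask the held-out pairs symmetrically
held <- arrayInd(idx[fold_id == 1], dim(d))
train <- d
train[held] <- NA
train[held[, 2:1, drop = FALSE]] <- NA
```

Each remaining fold would be handled the same way, with the masked entries serving as that fold's validation set.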
A data.frame containing the log-transformed parameter sets and their performance metrics. Columns include: log_N, log_k0, log_cooling_rate, log_c_repulsion, Holdout_MAE, and NLL.
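As a small illustration of working with a result in this shape, the sketch below selects the row with the lowest NLL and maps it back to the original scale with exp(). The data frame is a mock built only from the documented column names; its values are invented for illustration and are not output from the function.

```r
# Mock result using the documented column names (illustrative values only)
results <- data.frame(
  log_N            = log(c(2, 3, 5)),
  log_k0           = log(c(1.5, 4, 8)),
  log_cooling_rate = log(c(0.002, 0.01, 0.015)),
  log_c_repulsion  = log(c(0.005, 0.01, 0.04)),
  Holdout_MAE      = c(1.2, 0.8, 1.0),
  NLL              = c(310, 255, 280)
)

# Pick the parameter set with the lowest pooled NLL
best <- results[which.min(results$NLL), ]

# Back-transform the log-scale columns to the original scale
log_cols <- grep("^log_", names(best), value = TRUE)
best_original <- exp(best[log_cols])
names(best_original) <- sub("^log_", "", log_cols)
best_original$N <- round(best_original$N)  # dimensions must be an integer
print(best_original)
```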
Breaking Change in v2.0.0: This function now returns log-transformed parameters directly. The returned data frame has columns log_N, log_k0, log_cooling_rate, log_c_repulsion instead of the original scale parameters. This eliminates the need to call log_transform_parameters() separately before using run_adaptive_sampling().
Breaking Change in v2.0.0: The parameter distance_matrix has been renamed to dissimilarity_matrix. Please update your code accordingly.
euclidean_embedding for the core optimization algorithm.
# This example can exceed 5 seconds on some systems.
# 1. Create a simple synthetic dataset for the example
synth_coords <- matrix(rnorm(60), nrow = 20, ncol = 3)
dist_mat <- coordinates_to_matrix(synth_coords)
# 2. Run the optimization on the synthetic data
results <- initial_parameter_optimization(
  dissimilarity_matrix = dist_mat,
  mapping_max_iter = 100,
  relative_epsilon = 1e-3,
  convergence_counter = 2,
  scenario_name = "test_opt_synthetic",
  N_min = 2, N_max = 5,
  k0_min = 1, k0_max = 10,
  c_repulsion_min = 0.001, c_repulsion_max = 0.05,
  cooling_rate_min = 0.001, cooling_rate_max = 0.02,
  num_samples = 4,
  max_cores = 1,  # Avoid parallel processing in check environment
  verbose = FALSE
)