build_K: build_k
In mrecos/klrfome: Kernel Logistic Regression with Focal Mean Embeddings

build_K

R Documentation

build_k

Description

'build_K()' is a primary package function that takes the formatted list of site/background data and builds a similarity matrix suitable for computation with the 'KLR()' function or for direct study.

Usage

build_K(y1, y2 = y1, sigma, progress = TRUE, dist_metric = "euclidean")

Arguments

`y1`	- [list] List of site/background data formatted by 'format_site_data()'
`y2`	- [list] Typically left blank, in which case it defaults to y1.
`sigma`	- [scalar] Smoothing hyperparameter for the RBF kernel
`progress`	- [logical] FALSE = no progress bar; TRUE = show progress bar
`dist_metric`	- [character] One of the distance methods from rdist::cdist. Default = "euclidean". See ?rdist::cdist

Details

This function takes a list of training data, a scalar value for the 'sigma' hyperparameter, and a distance method, and computes a mean embedding similarity kernel. This kernel is a pair-wise (N x N) matrix of the mean similarity between the attributes describing each site location and background group. Optional inputs are 'progress' for a progress bar and 'dist_metric' for the distance computation. By default, the distance metric is Euclidean; it should likely stay that way unless you have explored other distances and know why and how you want to use them.
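
The computation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: 'rbf_block()' and 'mean_embedding_kernel()' are hypothetical names, the Gaussian form exp(-d^2 / (2 * sigma^2)) is an assumed RBF parameterization whose scaling may differ from what 'build_K()' uses, and the toy input (a list of numeric matrices, one per group) is an assumed stand-in for the output of 'format_site_data()'.

```r
# Gaussian RBF similarity between every row of a and every row of b.
# Assumed form: exp(-d^2 / (2 * sigma^2)), with Euclidean distance d.
rbf_block <- function(a, b, sigma) {
  a <- as.matrix(a)
  b <- as.matrix(b)
  # Squared Euclidean distances via |a|^2 + |b|^2 - 2 * a.b
  d2 <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  d2 <- pmax(d2, 0)  # guard against tiny negative values from rounding
  exp(-d2 / (2 * sigma^2))
}

# Hypothetical stand-in for build_K(): the mean pairwise similarity
# between the observations of each pair of groups.
mean_embedding_kernel <- function(groups, sigma) {
  n <- length(groups)
  K <- matrix(NA_real_, nrow = n, ncol = n)
  for (i in seq_len(n)) {
    for (j in i:n) {
      K[i, j] <- mean(rbf_block(groups[[i]], groups[[j]], sigma))
      K[j, i] <- K[i, j]  # fill the lower triangle by symmetry
    }
  }
  K
}

set.seed(1)
# Toy input: three groups, each a matrix of observations (rows) by two variables
groups <- list(
  site1       = matrix(rnorm(20, mean = 0),   ncol = 2),
  site2       = matrix(rnorm(30, mean = 0.5), ncol = 2),
  background1 = matrix(rnorm(40, mean = 3),   ncol = 2)
)
K_toy <- mean_embedding_kernel(groups, sigma = 1)
dim(K_toy)  # one row and one column per group
```

Only the upper triangle is computed and then mirrored, since the similarity between groups i and j equals that between j and i. In this toy data the two site groups overlap while the background group sits far away, so the site1/site2 entry is larger than the site1/background1 entry.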

Value

- [matrix] K, the pair-wise (N x N) similarity matrix
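
Before passing a kernel matrix to 'KLR()', it can be worth running a few basic checks on it. The sketch below is illustrative only: 'check_kernel()' is a hypothetical helper that is not part of klrfome, the [0, 1] range check assumes an RBF-style similarity, and the matrix being checked is a stand-in built from toy data rather than real 'build_K()' output.

```r
# Hypothetical helper, not part of klrfome: basic checks on a kernel matrix.
check_kernel <- function(K, tol = 1e-8) {
  stopifnot(is.matrix(K), is.numeric(K))
  c(
    square    = nrow(K) == ncol(K),
    symmetric = isSymmetric(unname(K), tol = tol),
    no_na     = !anyNA(K),
    in_range  = all(K >= 0 & K <= 1)  # assumes an RBF-style similarity
  )
}

# Stand-in matrix: a point-level Gaussian similarity on five toy observations
set.seed(42)
x <- matrix(rnorm(10), ncol = 2)
K_standin <- exp(-as.matrix(dist(x))^2 / 2)

checks <- check_kernel(K_standin)
checks
```

A FALSE entry in the result points at the property that failed, which is usually easier to act on than an error raised later during model fitting.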

Examples

## Not run: 
sim_data <- get_sim_data(site_samples = 800, N_site_bags = 75,
                         sites_var1_mean = 80,  sites_var1_sd = 10,
                         sites_var2_mean = 5,   sites_var2_sd = 2,
                         backg_var1_mean = 100, backg_var1_sd = 20,
                         backg_var2_mean = 6,   backg_var2_sd = 3)
formatted_data <- format_site_data(sim_data, N_sites=10, train_test_split=0.8,
                                   sample_fraction = 0.9, background_site_balance=1)
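train_data <- formatted_data[["train_data"]]
train_presence <- formatted_data[["train_presence"]]
test_data <- formatted_data[["test_data"]]
test_presence <- formatted_data[["test_presence"]]

##### Hyperparameters (illustrative values; tune for your data)
sigma <- 0.5
lambda <- 0.1
dist_metric <- "euclidean"

##### Logistic Mean Embedding KLR Model
#### Build Kernel Matrix
K <- build_K(train_data, sigma = sigma, dist_metric = dist_metric)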
#### Train
train_log_pred <- KLR(K, train_presence, lambda, 100, 0.001, verbose = 2)
#### Predict
test_log_pred <- KLR_predict(test_data, train_data, dist_metric = dist_metric,
                            train_log_pred[["alphas"]], sigma)

## End(Not run)

mrecos/klrfome documentation built on April 6, 2022, 8:02 p.m.