View source: R/search_neighbors.R
| search_neighbors | R Documentation |
Searches for the nearest neighbors of observations in a reference set or between two sets of observations.
search_neighbors(Xr, Xu = NULL,
                 diss_method = diss_pca(), Yr = NULL,
                 neighbors, spike = NULL,
                 return_dissimilarity = FALSE,
                 k, k_diss, k_range, pc_selection,
                 center, scale, documentation, ...
)
Xr |
A numeric matrix of reference observations (rows) and variables (columns) where the neighbor search is conducted. |
Xu |
Optional matrix of observations for which neighbors are to be
searched in Xr. |
diss_method |
A dissimilarity method object created by one of:
Default is diss_pca(). |
Yr |
Optional response matrix. Required for PLS methods and when using
|
neighbors |
A neighbor selection object created by neighbors_k() or neighbors_diss(). |
spike |
Optional integer vector indicating observations in |
return_dissimilarity |
Logical indicating whether to return the
dissimilarity matrix. Default is FALSE. |
k |
Deprecated. |
k_diss |
Deprecated. |
k_range |
Deprecated. |
pc_selection |
Deprecated. |
center |
Deprecated. |
scale |
Deprecated. |
documentation |
Deprecated. |
... |
Additional arguments (currently unused). |
This function is useful for reducing large reference sets by identifying
only relevant neighbors before running mbl.
If Xu is not provided, the function searches for neighbors within
Xr itself (excluding self-matches). If Xu is provided,
neighbors of each observation in Xu are searched in Xr.
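The two search modes described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it uses plain Euclidean distance, and the helper knn_indices() and the toy matrices are hypothetical names introduced here.

```r
# Toy data (hypothetical): 6 reference and 2 query observations, 3 variables
set.seed(1)
Xr_toy <- matrix(rnorm(18), nrow = 6)
Xu_toy <- matrix(rnorm(6), nrow = 2)

# Hypothetical helper: indices of the k nearest rows of `ref` for each row of
# `query`, based on Euclidean distance. One column per query observation.
knn_indices <- function(ref, query = NULL, k) {
  self_search <- is.null(query)
  if (self_search) query <- ref
  # Squared Euclidean distances: rows = reference, columns = query
  d <- outer(rowSums(ref^2), rowSums(query^2), "+") - 2 * tcrossprod(ref, query)
  # When searching within the reference set, exclude self-matches
  if (self_search) diag(d) <- Inf
  matrix(apply(d, 2, function(col) order(col)[seq_len(k)]), nrow = k)
}

within_xr <- knn_indices(Xr_toy, k = 3)         # neighbors within Xr itself
xu_in_xr <- knn_indices(Xr_toy, Xu_toy, k = 3)  # neighbors of Xu found in Xr
```

In both cases the result holds row indices of the reference matrix, with columns corresponding to query observations.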
The spike argument allows forcing specific observations into or out
of all neighborhoods. Positive indices are always included; negative indices
are always excluded.
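The effect of spike on a single neighborhood can be illustrated with a small base R sketch. The helper apply_spike() is hypothetical, and placing forced-in indices first and leaving the neighborhood size unadjusted are assumptions of this sketch, not documented behavior of search_neighbors().

```r
# Hypothetical helper: adjust one neighborhood (a vector of Xr indices sorted
# by dissimilarity) so that positive spike indices are present and the
# observations named by negative spike indices are absent.
apply_spike <- function(neighborhood, spike) {
  forced_in <- spike[spike > 0]
  forced_out <- abs(spike[spike < 0])
  # Drop excluded indices and avoid duplicating the forced-in ones
  kept <- setdiff(neighborhood, c(forced_in, forced_out))
  c(forced_in, kept)
}

nb <- c(12, 7, 3, 25, 9)
spiked <- apply_spike(nb, spike = c(3, 40, -7))
```

Here observations 3 and 40 end up in the neighborhood, and observation 7 is removed from it.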
A list containing:
Matrix of Xr indices for each query observation's
neighbors, sorted by dissimilarity (columns = query observations).
Matrix of dissimilarity scores corresponding to
neighbors.
Vector of unique Xr indices that appear
in any neighborhood.
If neighbors_diss() was used, a data.frame
with columns for observation index, number of neighbors found, and final
number after applying bounds.
If return_dissimilarity = TRUE, the full
dissimilarity object.
If the dissimilarity method includes
return_projection = TRUE, the projection object.
If the dissimilarity method includes gh = TRUE, the
GH distances.
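The "number of neighbors found" and "final number after applying bounds" reported for threshold-based selection can be pictured with the following base R sketch. It assumes that neighbors are those with dissimilarity at or below the threshold and that the count is then clamped to the range given by k_min and k_max; the helper count_with_bounds() is hypothetical and the package may differ in detail.

```r
# Hypothetical helper: neighbor counts for one query observation, given its
# dissimilarities `d` to all reference observations.
count_with_bounds <- function(d, threshold, k_min, k_max) {
  n_found <- sum(d <= threshold)                  # neighbors within threshold
  n_final <- min(max(n_found, k_min), k_max)      # clamp to [k_min, k_max]
  c(n_found = n_found, n_final = n_final)
}

d_toy <- c(0.1, 0.2, 0.3, 0.6, 0.7, 0.9)
counts <- count_with_bounds(d_toy, threshold = 0.5, k_min = 4, k_max = 5)
```

With these toy values, three observations fall within the threshold, so the lower bound k_min raises the final neighborhood size to four.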
Ramirez-Lopez, L., Behrens, T., Schmidt, K., Stevens, A., Dematte, J.A.M., Scholten, T. 2013a. The spectrum-based learner: A new local approach for modeling soil vis-NIR spectra of complex data sets. Geoderma 195-196, 268-279.
Ramirez-Lopez, L., Behrens, T., Schmidt, K., Viscarra Rossel, R., Dematte, J.A.M., Scholten, T. 2013b. Distance and similarity-search metrics for use with soil vis-NIR spectra. Geoderma 199, 43-53.
dissimilarity, mbl,
neighbors_k, neighbors_diss
library(prospectr)
data(NIRsoil)
# Split spectra (spc) and CEC into prediction (u) and reference (r) sets
Xu <- NIRsoil$spc[!as.logical(NIRsoil$train), ]
Yu <- NIRsoil$CEC[!as.logical(NIRsoil$train)]
Yr <- NIRsoil$CEC[as.logical(NIRsoil$train)]
Xr <- NIRsoil$spc[as.logical(NIRsoil$train), ]
# Drop observations with missing CEC values
Xu <- Xu[!is.na(Yu), ]
Yu <- Yu[!is.na(Yu)]
Xr <- Xr[!is.na(Yr), ]
Yr <- Yr[!is.na(Yr)]
# Correlation-based neighbor search with k neighbors
ex1 <- search_neighbors(
  Xr = Xr, Xu = Xu,
  diss_method = diss_correlation(),
  neighbors = neighbors_k(40)
)
# PCA-based with OPC selection
ex2 <- search_neighbors(
  Xr = Xr, Xu = Xu,
  diss_method = diss_pca(
    ncomp = ncomp_by_opc(40),
    scale = TRUE,
    return_projection = TRUE
  ),
  Yr = Yr,
  neighbors = neighbors_k(50)
)
# Reference observations (Xr rows) that do not appear in any neighborhood
setdiff(seq_len(nrow(Xr)), ex2$unique_neighbors)
# Dissimilarity threshold-based selection
ex3 <- search_neighbors(
  Xr = Xr, Xu = Xu,
  diss_method = diss_pls(
    ncomp = ncomp_by_opc(40),
    scale = TRUE
  ),
  Yr = Yr,
  neighbors = neighbors_diss(threshold = 0.5, k_min = 10, k_max = 100)
)