View source: R/search_neighbors.R
| search_neighbors | R Documentation |
This function searches a reference set for the neighbors of the observations provided in another set.
search_neighbors(Xr, Xu, diss_method = c("pca", "pca.nipals", "pls", "mpls",
                                         "cor", "euclid", "cosine", "sid"),
                 Yr = NULL, k, k_diss, k_range, spike = NULL,
                 pc_selection = list("var", 0.01),
                 return_projection = FALSE, return_dissimilarity = FALSE,
                 ws = NULL,
                 center = TRUE, scale = FALSE,
                 documentation = character(), ...)
Xr |
a matrix of reference (spectral) observations where the neighbor search is to be conducted. See details. |
Xu |
an optional matrix of (spectral) observations whose
neighbors are to be searched in Xr. |
diss_method |
a character string indicating the spectral dissimilarity metric to be used in the selection of the nearest neighbors of each observation.
|
Yr |
a numeric matrix of
|
k |
an integer value indicating the k-nearest neighbors of each
observation in |
k_diss |
an integer value indicating a dissimilarity threshold.
For each observation in |
k_range |
an integer vector of length 2 which specifies the minimum
(first value) and the maximum (second value) number of neighbors to be
retained when the |
spike |
a vector of integers (with positive and/or negative values)
indicating what observations in |
pc_selection |
a list of length 2 to be passed onto the
The default is Optionally, the |
return_projection |
a logical indicating if the projection(s) must be
returned. Projections are used if the |
return_dissimilarity |
a logical indicating if the dissimilarity matrix used for neighbor search must be returned. |
ws |
an odd integer value which specifies the window size, when
|
center |
a logical indicating if the |
scale |
a logical indicating if the |
documentation |
an optional character string that can be used to
describe anything related to the |
... |
further arguments to be passed to the |
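The interplay between k_diss and k_range can be illustrated with a minimal base R sketch. It is not the package's implementation: the dissimilarity values are made up, the comparison is assumed to be "less than or equal to" the threshold, and a data.frame stands in for the data.table returned by the function.

```r
# Hypothetical dissimilarities: 6 reference observations (rows) against
# 3 target observations (columns); the values are invented for illustration
d <- matrix(
  c(0.2, 0.4, 0.7, 1.3, 1.8, 2.5,
    0.6, 1.1, 1.4, 1.9, 2.2, 2.8,
    0.1, 0.2, 0.3, 0.5, 0.8, 1.6),
  nrow = 6
)
k_diss <- 1         # dissimilarity threshold
k_range <- c(2, 4)  # minimum and maximum number of neighbors to retain

# Number of reference observations within the threshold, per target
# (assumption: "<=" is used for the comparison)
n_k <- colSums(d <= k_diss)

# Final number of neighbors, bounded below and above by k_range
final_n_k <- pmin(pmax(n_k, k_range[1]), k_range[2])

# Summary analogous in spirit to the k_diss_info element
k_diss_info <- data.frame(Xu_index = seq_len(ncol(d)), n_k, final_n_k)
k_diss_info
```

In this sketch the second target has too few neighbors within the threshold and is raised to the minimum, while the third has too many and is capped at the maximum.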
This function may be especially useful when the reference set (Xr) is
very large. In some cases the number of observations in the reference set
can be reduced by removing irrelevant observations (i.e. observations that are not
neighbors of a particular target set). For example, this function can be
used to reduce the size of the reference set before running the
mbl function.
This function uses the dissimilarity function to compute the
dissimilarities between Xr and Xu. Arguments to
dissimilarity, as well as further arguments to the functions
used inside dissimilarity (i.e. ortho_diss,
cor_diss, f_diss and sid), can be passed to
those functions as additional arguments (i.e. through ...).
If no matrix is passed to Xu, the neighbors of each observation in Xr
are searched within Xr itself. If a matrix is
passed to Xu, the neighbors of the Xu observations are searched in the Xr
matrix.
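The two search modes can be sketched in base R with Euclidean distances. This is only an illustration under stated assumptions: the data are random, the helper nearest() is hypothetical, and excluding an observation from its own neighborhood in the Xr-only case is an assumption about the behavior rather than something stated above.

```r
set.seed(1)
Xr <- matrix(rnorm(8 * 5), nrow = 8)  # 8 hypothetical reference observations
Xu <- matrix(rnorm(3 * 5), nrow = 3)  # 3 hypothetical target observations
k <- 2

# Hypothetical helper: indices of the k smallest dissimilarities per column
nearest <- function(d, k) apply(d, 2, function(x) order(x)[seq_len(k)])

# Case 1: no Xu, so the neighbors of each Xr observation are searched in Xr
d_rr <- as.matrix(dist(Xr))
diag(d_rr) <- NA  # assumption: an observation is not its own neighbor
nn_rr <- nearest(d_rr, k)

# Case 2: Xu supplied, so the neighbors of each Xu observation are searched in Xr
n_r <- nrow(Xr)
d_all <- as.matrix(dist(rbind(Xr, Xu)))
d_ru <- d_all[seq_len(n_r), n_r + seq_len(nrow(Xu)), drop = FALSE]
nn_ru <- nearest(d_ru, k)

dim(nn_rr)  # k rows, one column per Xr observation
dim(nn_ru)  # k rows, one column per Xu observation
```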
a list containing the following elements:
neighbors_diss: a matrix of the Xr dissimilarity scores
corresponding to the neighbors of each Xr observation (or Xu
observation, in case Xu was supplied).
The neighbor dissimilarity scores are organized by columns and are sorted
in ascending order.
neighbors: a matrix of the Xr indices corresponding to
the neighbors of each observation in Xu (or in Xr, in case Xu was not
supplied). The neighbor indices are
organized by columns and are sorted in ascending order by their
dissimilarity score.
unique_neighbors: a vector of the indices in Xr
identified as neighbors of any observation in Xr (or in Xu,
in case it was supplied). This is obtained by
converting the neighbors matrix into a vector and applying the
unique function.
k_diss_info: a data.table that is returned only if the
k_diss argument was used. It comprises three columns: the first one
(Xr_index or Xu_index) indicates the index of the observations
in Xr (or in Xu, in case it was supplied),
the second column (n_k) indicates the number of neighbors found in
Xr and the third column (final_n_k) indicates the final number
of neighbors selected, bounded by the k_range argument.
dissimilarity: if return_dissimilarity = TRUE, the
dissimilarity object used (as computed by the dissimilarity
function).
projection: an ortho_projection object. Only output if
return_projection = TRUE and if diss_method = "pca",
diss_method = "pca.nipals" or diss_method = "pls".
This object contains the projection used to compute
the dissimilarity matrix. In case of local dissimilarity matrices,
the projection corresponds to the global projection used to select the
neighborhoods (see the ortho_diss function for further
details).
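The layout of the neighbors, neighbors_diss and unique_neighbors elements, and their use for reducing a reference set, can be sketched in base R as follows. The data are random, Euclidean distance is used only for simplicity, and the object names mirror the list elements purely for illustration; none of this is the package's own code.

```r
set.seed(42)
Xr <- matrix(rnorm(10 * 4), nrow = 10)  # hypothetical reference set
Xu <- matrix(rnorm(2 * 4), nrow = 2)    # hypothetical target set
k <- 3

# Euclidean dissimilarities: rows index Xr, columns index Xu
d <- sapply(
  seq_len(nrow(Xu)),
  function(j) sqrt(colSums((t(Xr) - Xu[j, ])^2))
)

# Each column holds the neighbors of one Xu observation, nearest first
neighbors <- apply(d, 2, function(x) order(x)[seq_len(k)])
neighbors_diss <- apply(d, 2, function(x) sort(x)[seq_len(k)])

# Indices in Xr that are neighbors of at least one Xu observation
unique_neighbors <- unique(as.vector(neighbors))

# Reduced reference set, keeping only observations that are someone's neighbor
Xr_reduced <- Xr[unique_neighbors, , drop = FALSE]
dim(Xr_reduced)
```

With a real result, the same subsetting step would use the unique_neighbors element of the returned list.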
Ramirez-Lopez, L., Behrens, T., Schmidt, K., Stevens, A., Dematte, J.A.M., Scholten, T. 2013a. The spectrum-based learner: A new local approach for modeling soil vis-NIR spectra of complex data sets. Geoderma 195-196, 268-279.
Ramirez-Lopez, L., Behrens, T., Schmidt, K., Viscarra Rossel, R., Dematte, J. A. M., Scholten, T. 2013b. Distance and similarity-search metrics for use with soil vis-NIR spectra. Geoderma 199, 43-53.
dissimilarity, ortho_diss, cor_diss, f_diss, sid, mbl
library(resemble)   # provides search_neighbors()
library(prospectr)  # provides the NIRsoil data set
data(NIRsoil)
Xu <- NIRsoil$spc[!as.logical(NIRsoil$train), ]
Yu <- NIRsoil$CEC[!as.logical(NIRsoil$train)]
Yr <- NIRsoil$CEC[as.logical(NIRsoil$train)]
Xr <- NIRsoil$spc[as.logical(NIRsoil$train), ]
Xu <- Xu[!is.na(Yu), ]
Yu <- Yu[!is.na(Yu)]
Xr <- Xr[!is.na(Yr), ]
Yr <- Yr[!is.na(Yr)]
# Identify the neighbor observations using the correlation dissimilarity and
# default parameters
# (here, the 40 nearest neighbors in Xr are retained for each observation in Xu)
ex1 <- search_neighbors(
  Xr = Xr, Xu = Xu,
  diss_method = "cor",
  k = 40
)
# Identify the neighbor observations using principal component (PC)
# and partial least squares (PLS) dissimilarities, and using the "opc"
# approach for selecting the number of components
ex2 <- search_neighbors(
  Xr = Xr, Xu = Xu,
  diss_method = "pca",
  Yr = Yr, k = 50,
  pc_selection = list("opc", 40),
  scale = TRUE
)
# Observations that do not belong to any neighborhood
setdiff(seq_len(nrow(Xr)), ex2$unique_neighbors)
ex3 <- search_neighbors(
  Xr = Xr, Xu = Xu,
  diss_method = "pls",
  Yr = Yr, k = 50,
  pc_selection = list("opc", 40),
  scale = TRUE
)
# Observations that do not belong to any neighborhood
setdiff(seq_len(nrow(Xr)), ex3$unique_neighbors)
# Identify the neighbor observations using local PLS dissimilarities
# Here, 150 neighbors are used to compute a local dissimilarity matrix
# and then this matrix is used to select 50 neighbors
ex4 <- search_neighbors(
  Xr = Xr, Xu = Xu,
  diss_method = "pls",
  Yr = Yr, k = 50,
  pc_selection = list("opc", 40),
  scale = TRUE,
  .local = TRUE,
  pre_k = 150
)