View source: R/dissimilarity.R
| dissimilarity | R Documentation |
Computes dissimilarity matrices between observations using various methods. This is the main interface for dissimilarity computation in the resemble package.
dissimilarity(Xr, Xu = NULL, diss_method = diss_pca(), Yr = NULL)
Xr |
A numeric matrix of reference observations (rows) and variables (columns). |
Xu |
Optional numeric matrix of additional observations (rows) with the same variables (columns) as Xr. |
diss_method |
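A dissimilarity method object created by one of the diss_*() constructors, e.g. diss_pca(), diss_pls(), diss_correlation(), diss_euclidean(), diss_mahalanobis() or diss_cosine().
Default is diss_pca(). |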
Yr |
Optional response matrix. Required for PLS methods and when using
|
The function dispatches to the appropriate internal computation based on the
class of diss_method. Each method constructor (e.g., diss_pca())
encapsulates all method-specific parameters including component selection,
centering, scaling, and whether to return projections.
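The constructor-plus-dispatch pattern described above can be sketched in base R with S3 classes. This is only an illustration of the idea, not the package's internal code: the names toy_euclidean(), toy_cosine() and toy_diss() are hypothetical, and the toy cosine measure (one minus cosine similarity) is an assumption made for the example.

```r
# Hypothetical constructors: each returns a classed list of parameters.
toy_euclidean <- function(center = TRUE, scale = FALSE) {
  structure(
    list(center = center, scale = scale),
    class = c("toy_euclidean", "toy_method")
  )
}
toy_cosine <- function() {
  structure(list(), class = c("toy_cosine", "toy_method"))
}

# Generic that dispatches on the class of `method`, not on `X`.
toy_diss <- function(X, method) UseMethod("toy_diss", method)

toy_diss.toy_euclidean <- function(X, method) {
  # Method-specific parameters travel inside the method object.
  X <- scale(X, center = method$center, scale = method$scale)
  as.matrix(dist(X))
}

toy_diss.toy_cosine <- function(X, method) {
  norms <- sqrt(rowSums(X^2))
  1 - tcrossprod(X) / (norms %o% norms)
}

set.seed(1)
X <- matrix(rnorm(20), nrow = 5)
d_euc <- toy_diss(X, toy_euclidean())
d_cos <- toy_diss(X, toy_cosine())
dim(d_euc)
```

Keeping the parameters inside the method object means the top-level function needs only one argument per method, whatever that method's options are.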
When only Xr is provided, the function computes pairwise dissimilarities
among all observations in Xr, returning a symmetric
nrow(Xr) × nrow(Xr) matrix.
When both Xr and Xu are provided, the function computes
dissimilarities between each observation in Xr and each observation
in Xu, returning a nrow(Xr) × nrow(Xu)
matrix where element (i, j) is the dissimilarity between the
i-th observation in Xr and the j-th observation
in Xu.
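The two output shapes can be illustrated with a small base R sketch that uses plain Euclidean distance on simulated data. The helper cross_euclid() is hypothetical and is not part of the package; it only shows how the rows of the result map to Xr and the columns to Xu.

```r
set.seed(42)
Xr <- matrix(rnorm(6 * 4), nrow = 6)  # 6 reference observations, 4 variables
Xu <- matrix(rnorm(3 * 4), nrow = 3)  # 3 additional observations

# Euclidean distance between every row of A and every row of B
cross_euclid <- function(A, B) {
  sq <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  sqrt(pmax(sq, 0))  # clamp tiny negative values caused by rounding
}

d_rr <- cross_euclid(Xr, Xr)  # square matrix: Xr against itself
d_ru <- cross_euclid(Xr, Xu)  # element (i, j): Xr[i, ] versus Xu[j, ]

dim(d_rr)
dim(d_ru)
```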
Note that diss_mahalanobis() computes Mahalanobis distance directly on
the input variables. This requires the covariance matrix to be invertible,
which fails when the number of variables exceeds the number of observations
or when variables are highly correlated (common in spectral data). For such
cases, use diss_pca() or diss_pls() instead.
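The invertibility problem can be reproduced with base R alone. The sketch below uses simulated data with more variables than observations, then shows one possible workaround: computing the distance on a few principal component scores. The choice of k = 3 components is arbitrary, and this is not necessarily how diss_pca() proceeds internally.

```r
set.seed(123)
n <- 10
p <- 25                               # more variables than observations
X <- matrix(rnorm(n * p), nrow = n)

S <- cov(X)
rk <- qr(S)$rank                      # at most n - 1, so S is rank deficient
inv <- try(solve(S), silent = TRUE)   # inversion is expected to fail here
inherits(inv, "try-error")

# Workaround sketch: project onto a few principal components first
pcs <- prcomp(X, center = TRUE, scale. = FALSE)
k <- 3                                # arbitrary choice for illustration
scores <- pcs$x[, 1:k]

# mahalanobis() returns squared distances to `center`
d2 <- mahalanobis(scores, center = colMeans(scores), cov = cov(scores))
```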
A list of class "dissimilarity" containing:
The computed dissimilarity matrix. Dimensions are
nrow(Xr) × nrow(Xr) when Xu = NULL,
or nrow(Xr) × nrow(Xu) otherwise.
The diss_* constructor object used for computation.
Vector used to center the data.
Vector used to scale the data.
Number of components used (for projection methods).
If return_projection = TRUE in the method
constructor, the ortho_projection object.
Ramirez-Lopez, L., Behrens, T., Schmidt, K., Stevens, A., Dematte, J.A.M., Scholten, T. 2013a. The spectrum-based learner: A new local approach for modeling soil vis-NIR spectra of complex data sets. Geoderma 195-196, 268-279.
Ramirez-Lopez, L., Behrens, T., Schmidt, K., Viscarra Rossel, R., Dematte, J.A.M., Scholten, T. 2013b. Distance and similarity-search metrics for use with soil vis-NIR spectra. Geoderma 199, 43-53.
diss_pca, diss_pls,
diss_correlation, diss_euclidean,
diss_mahalanobis, diss_cosine
library(resemble)
library(prospectr)
data(NIRsoil)
# Preprocess
sg <- savitzkyGolay(NIRsoil$spc, m = 1, p = 4, w = 15)
Xr <- sg[as.logical(NIRsoil$train), ]
Xu <- sg[!as.logical(NIRsoil$train), ]
Yr <- NIRsoil$CEC[as.logical(NIRsoil$train)]
Yu <- NIRsoil$CEC[!as.logical(NIRsoil$train)]
Xu <- Xu[!is.na(Yu), ]
Xr <- Xr[!is.na(Yr), ]
Yr <- Yr[!is.na(Yr)]
# PCA-based dissimilarity with variance-based selection
d1 <- dissimilarity(Xr, Xu, diss_method = diss_pca())
# PCA with OPC selection (requires Yr)
d2 <- dissimilarity(
  Xr, Xu,
  Yr = Yr,
  diss_method = diss_pca(
    ncomp = ncomp_by_opc(30),
    return_projection = TRUE
  )
)
# PLS-based dissimilarity
d3 <- dissimilarity(
  Xr, Xu,
  Yr = Yr,
  diss_method = diss_pls(
    ncomp = ncomp_by_opc(30)
  )
)
# Euclidean distance
d4 <- dissimilarity(Xr, Xu, diss_method = diss_euclidean())
# Correlation dissimilarity with moving window
d5 <- dissimilarity(Xr, Xu, diss_method = diss_correlation(ws = 41))
# Mahalanobis distance (use only when n > p and low collinearity)
# d6 <- dissimilarity(Xr[, 1:20], Xu[, 1:20],
# diss_method = diss_mahalanobis())