naes | R Documentation |
Perform a k-means sampling on a matrix for multivariate calibration
naes(X, k, pc, iter.max = 10, method = 0, .center = TRUE, .scale = FALSE)
X |
a numeric matrix (optionally a data frame that can be coerced to a numerical matrix). |
k |
either the number of calibration samples to select or a set of cluster centres to initiate the k-means clustering. |
pc |
optional. If not specified, k-means is run directly on the variable
(Euclidean) space.
Alternatively, a PCA is performed before k-means and |
iter.max |
maximum number of iterations allowed for the k-means clustering. Default is 10. |
method |
the method used for selecting calibration samples within each cluster: either samples closest to the cluster centers ( |
.center |
logical value indicating whether the input matrix must be centered before Principal Component Analysis. Default set to TRUE. |
.scale |
logical value indicating whether the input matrix must be scaled before Principal Component Analysis. Default set to FALSE. |
K-means sampling is a simple procedure based on cluster analysis to select calibration samples from large multivariate datasets. The method can be described in three steps (Naes et al., 2002):
1. Perform a PCA and decide how many principal components to keep.
2. Carry out a k-means clustering on the principal component scores, setting the number of clusters equal to the number of desired calibration samples.
3. Select one sample from each cluster.
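The three steps can be sketched in base R. This is a minimal illustration using only prcomp and kmeans from the stats package, not the code that naes itself runs. The simulated matrix, the number of components kept (n_pc) and the rule of picking the sample closest to each cluster centre are assumptions made for the example.

```r
# Base-R sketch of k-means sampling (illustrative, not the naes internals)
set.seed(42)                                   # k-means starts from random centres
X <- matrix(rnorm(200 * 10), nrow = 200)       # simulated data (assumption)
n_pc <- 3                                      # components kept (arbitrary choice)
k <- 5                                         # number of calibration samples wanted

# Step 1: PCA on the centred data, keep the first n_pc score columns
pca <- prcomp(X, center = TRUE, scale. = FALSE)
scores <- pca$x[, seq_len(n_pc), drop = FALSE]

# Step 2: k-means on the scores, one cluster per desired calibration sample
km <- kmeans(scores, centers = k, iter.max = 10)

# Step 3: pick one sample per cluster, here the one closest to its centre
model <- vapply(seq_len(k), function(j) {
  members <- which(km$cluster == j)
  centred <- sweep(scores[members, , drop = FALSE], 2, km$centers[j, ])
  members[which.min(rowSums(centred^2))]
}, integer(1))

# Remaining rows form the test set
test <- setdiff(seq_len(nrow(X)), model)
```

Here model holds one row index per cluster and test holds the remaining rows, which mirrors the 'model' and 'test' components of the value returned by naes.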
a list with components:
'model': numeric vector giving the row indices of the input data selected for calibration
'test': numeric vector giving the row indices of the remaining observations
'pc': if the pc argument is specified, a numeric matrix of the scaled pc scores
'cluster': integer vector indicating the cluster to which each point was assigned
'centers': a matrix of cluster centres
Antoine Stevens & Leonardo Ramirez-Lopez
Naes, T., 1987. The design of calibration in near infra-red reflectance analysis by clustering. Journal of Chemometrics 1, 121-134.
Naes, T., Isaksson, T., Fearn, T., and Davies, T., 2002. A user friendly guide to multivariate calibration and classification. NIR Publications, Chichester, United Kingdom.
kenStone, honigs, duplex, shenkWest
library(prospectr)
data(NIRsoil)
# keep the principal components requested through pc, then select 5 samples
sel <- naes(NIRsoil$spc, k = 5, pc = .99, method = 0)
# clusters
plot(sel$pc[, 1:2], col = sel$cluster + 2)
# points selected for calibration with method = 0
points(sel$pc[sel$model, 1:2],
  col = 2,
  pch = 19,
  cex = 1
)
# pre-defined centers can also be provided
sel2 <- naes(NIRsoil$spc,
  k = sel$centers,
  pc = .99, method = 1
)
# points selected for calibration with method = 1
points(sel$pc[sel2$model, 1:2],
  col = 1,
  pch = 15,
  cex = 1
)