| knnr | R Documentation |
Functions knnr and knnda build KNN (optionally locally weighted) regression and discrimination models, respectively, for a univariate response y.
Both functions rely on the functions getknn and locw. See the code for details.
For each new observation to predict, the principle of KNN models (regression and discrimination) is to select the k nearest neighbors and to compute the prediction as the average of the response y (for regression) or as the most frequent class in y (for discrimination) over this neighborhood. The KNN selection step is referred to as weighting "1" in locw. In standard KNN models, the statistical weight of each of the k neighbors is 1/k. In locally weighted KNN models, the statistical weights of the neighbors depend on the previously calculated dissimilarities between the observation to predict and the k neighbors. This step is referred to as weighting "2" in locw.
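The two weighting steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the code of knnr, knnda, getknn or locw: the helper name knn_sketch is hypothetical, the dissimilarity is fixed to the Euclidean distance, and the exponential weight function is an assumed stand-in for the package's own weight function.

```r
# Hypothetical helper, NOT the package implementation.
# Xr: n x p reference matrix; yr: numeric or factor response of length n;
# xu: numeric vector of length p (one observation to predict).
knn_sketch <- function(Xr, yr, xu, k, h = Inf) {
  # Euclidean dissimilarities between xu and every reference row
  d <- sqrt(rowSums(sweep(Xr, 2, xu)^2))
  # Weighting "1": keep the k nearest neighbors
  nn <- order(d)[seq_len(k)]
  dk <- d[nn]
  # Weighting "2": ASSUMED exponential decay; h = Inf gives equal weights
  s <- mean(dk)
  w <- if (is.finite(h) && s > 0) exp(-dk / (h * s)) else rep(1, k)
  w <- w / sum(w)  # with equal weights, each neighbor gets 1/k
  if (is.factor(yr)) {
    # Discrimination: class with the largest total weight
    votes <- tapply(w, droplevels(yr[nn]), sum)
    names(votes)[which.max(votes)]
  } else {
    # Regression: weighted average of the neighbors' responses
    sum(w * yr[nn])
  }
}

# Toy data: three points near the origin, two far away
Xr <- matrix(c(0, 0,  1, 0,  0, 1,  5, 5,  6, 5), ncol = 2, byrow = TRUE)
yr <- c(1, 2, 3, 10, 12)
cl <- factor(c("a", "a", "b", "b", "b"))

p_uniform  <- knn_sketch(Xr, yr, xu = c(0, 0), k = 3)         # plain average
p_weighted <- knn_sketch(Xr, yr, xu = c(0, 0), k = 3, h = 1)  # pulled toward the closest neighbor
p_class    <- knn_sketch(Xr, cl, xu = c(0, 0), k = 3)         # weighted vote
```

With h = Inf the sketch reduces to standard KNN. A smaller h concentrates the weight on the closest neighbors, so the weighted prediction moves toward their responses.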
In knnr and knnda, the dissimilarities can be calculated from the original (i.e. not compressed) data or from previously computed global PLS scores.
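As a rough illustration of dissimilarities computed in a compressed score space, the sketch below uses base R only. It makes two assumptions that differ from the functions documented here: the data are simulated, and PCA scores (prcomp) are used as a stand-in for the global PLS scores, since base R has no PLS routine. Only the distance step is meant to carry over.

```r
set.seed(1)
Xr <- matrix(rnorm(50 * 8), nrow = 50)  # simulated reference data
Xu <- matrix(rnorm(3 * 8), nrow = 3)    # simulated observations to predict
ncomp <- 4                              # plays the role of ncompdis

# Stand-in compression: PCA scores (the documented functions use PLS scores)
pca <- prcomp(Xr, center = TRUE, scale. = FALSE)
Tr <- pca$x[, seq_len(ncomp), drop = FALSE]
Tu <- predict(pca, Xu)[, seq_len(ncomp), drop = FALSE]

# Mahalanobis distances from the first new observation to all reference scores
# (mahalanobis() returns squared distances, hence the sqrt)
d_mahal <- sqrt(mahalanobis(Tr, center = Tu[1, ], cov = cov(Tr)))

# Same quantity: Euclidean distance after dividing each score by its
# standard deviation (valid here because PCA scores are uncorrelated)
S <- apply(Tr, 2, sd)
d_check <- sqrt(rowSums(sweep(sweep(Tr, 2, Tu[1, ]), 2, S, "/")^2))

# Indices of the 5 nearest reference observations in the score space
nn <- order(d_mahal)[1:5]
```

Setting the number of components to NULL in the documented functions corresponds to skipping the compression step and computing the distances on Xr and Xu directly.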
knnr(
Xr, Yr,
Xu, Yu = NULL,
ncompdis = NULL, diss = c("euclidean", "mahalanobis", "correlation"),
h = Inf, k,
stor = TRUE,
print = TRUE,
...
)
knnda(
Xr, Yr,
Xu, Yu = NULL,
ncompdis = NULL, diss = c("euclidean", "mahalanobis", "correlation"),
h = Inf, k,
stor = TRUE,
print = TRUE,
...
)
Xr |
A |
Yr |
A vector of length |
Xu |
A |
Yu |
A vector of length |
diss |
The type of dissimilarity used for defining the neighbors. Possible values are "euclidean" (default; Euclidean distance), "mahalanobis" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |
ncompdis |
A vector (possibly of length 1) defining the number(s) of components of the preliminary global PLS calculated on |
h |
A vector (possibly of length 1) defining the scaling shape factor(s) of the function of the weights applied to the neighbors in the weighted PLSR. Lower is |
k |
A vector (possibly of length 1) defining the number(s) of nearest neighbors to select in the reference data set for each observation to predict. Each component of |
stor |
Logical (default to |
print |
Logical (default = |
... |
Optional arguments to pass in function |
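The Euclidean and correlation dissimilarities listed under diss can be sketched as follows. The helper name diss_sketch is hypothetical and the code is not taken from the package. It only applies the formula sqrt(.5 * (1 - rho)) given above, assuming rho is the Pearson correlation between the observation to predict and each reference row. The Mahalanobis case is left out.

```r
# Hypothetical helper: dissimilarities between one observation xu
# and each row of the reference matrix Xr
diss_sketch <- function(Xr, xu, diss = c("euclidean", "correlation")) {
  diss <- match.arg(diss)
  if (diss == "euclidean") {
    sqrt(rowSums(sweep(Xr, 2, xu)^2))
  } else {
    # Pearson correlation of each reference row with xu (assumed meaning of rho)
    rho <- as.vector(cor(t(Xr), xu))
    # pmax() guards against tiny negative values caused by rounding
    sqrt(pmax(0, .5 * (1 - rho)))
  }
}

Xr <- rbind(c(1, 2, 3, 4),   # perfectly correlated with xu
            c(4, 3, 2, 1),   # perfectly anti-correlated with xu
            c(1, 3, 2, 4))   # rho = 0.8 with xu
xu <- c(2, 4, 6, 8)

d_euc <- diss_sketch(Xr, xu, "euclidean")
d_cor <- diss_sketch(Xr, xu, "correlation")
```

The correlation dissimilarity ranges from 0 (rho = 1) to 1 (rho = -1) and ignores differences in offset and scale between spectra, whereas the Euclidean distance does not.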
A list of outputs (see examples), such as:
y |
Responses for the test data. |
fit |
Predictions for the test data. |
r |
Residuals for the test data. |
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
data(datcass)
data(datforages)
######################## knnr
Xr <- datcass$Xr
yr <- datcass$yr
Xu <- datcass$Xu
yu <- datcass$yu
Xr <- detrend(Xr)
Xu <- detrend(Xu)
headm(Xr)
headm(Xu)
## A KNN-WR model where:
## The dissimilarities between the observations are defined
## by the Mahalanobis distances calculated from a global PLS score space
## of ncompdis = 10 components.
## - Weighting "1" = knn selection of k = {5, 10, 15, 20} neighbors
## - Weighting "2" = within each neighborhood, weights are calculated by "wdist"
ncompdis <- 10
h <- c(1, 2)
k <- seq(5, 20, by = 5)
fm <- knnr(
Xr, yr,
Xu, yu,
ncompdis = ncompdis, diss = "mahalanobis",
h = h, k = k,
print = TRUE
)
names(fm)
head(fm$y)
head(fm$fit)
head(fm$r)
z <- mse(fm, ~ ncompdis + h + k)
z
z[z$rmsep == min(z$rmsep), ]
group <- paste("ncompdis=", z$ncompdis, ", h=", z$h, sep = "")
plotxy(z[, c("k", "rmsep")], asp = 0, group = group, pch = 16)
## Same model, but where
## the dissimilarities between the observations are defined
## by Euclidean distances calculated from the original (i.e. not compressed) X data
ncompdis <- NULL
h <- c(1, 2)
k <- seq(5, 20, by = 5)
fm <- knnr(
Xr, yr,
Xu, yu,
ncompdis = ncompdis, diss = "euclidean",
h = h, k = k,
print = TRUE
)
z <- mse(fm, ~ ncompdis + h + k)
z
z[z$rmsep == min(z$rmsep), ]
group <- paste("ncompdis=", z$ncompdis, ", h=", z$h, sep = "")
plotxy(z[, c("k", "rmsep")], asp = 0, group = group, pch = 16)
######################## knnda
Xr <- datforages$Xr
yr <- datforages$yr
Xu <- datforages$Xu
yu <- datforages$yu
Xr <- savgol(snv(Xr), n = 21, p = 2, m = 2)
Xu <- savgol(snv(Xu), n = 21, p = 2, m = 2)
headm(Xr)
headm(Xu)
table(yr)
table(yu)
## A knnDA model where:
## The dissimilarities between the observations are defined
## by the Mahalanobis distances calculated from a global PLS score space
## of ncompdis = 10 components.
## - Weighting "1" = knn selection of k = {5, 10, 15} neighbors
## - Weighting "2" = within each neighborhood, weights are calculated by "wdist"
ncompdis <- 10
h <- c(1, 2)
k <- seq(5, 15, by = 5)
fm <- knnda(
Xr, yr,
Xu, yu,
ncompdis = ncompdis, diss = "mahalanobis",
h = h, k = k,
print = TRUE
)
names(fm)
headm(fm$y)
headm(fm$fit)
headm(fm$r)
z <- err(fm, ~ ncompdis + h + k)
z
z[z$errp == min(z$errp), ]
group <- paste("ncompdis=", z$ncompdis, ", h=", z$h, sep = "")
plotxy(z[, c("k", "errp")], asp = 0, group = group, pch = 16)