knnr | R Documentation |

Functions `knnr` and `knnda` build KNN (optionally locally weighted) regression and discrimination models, respectively, for a univariate response `y`.

Both functions rely on the functions `getknn` and `locw`; see their code for details.

For each new observation to predict, the principle of KNN models (regression and discrimination) is to select `k` nearest neighbors and to compute the prediction as the average of the response `y` (for regression) or as the most frequent class in `y` (for discrimination) over this neighborhood. The KNN selection step is referred to as `weighting "1"` in `locw`. In standard KNN models, the statistical weight of each of the `k` neighbors is `1/k`. In locally weighted KNN models, the statistical weights of the neighbors depend on the dissimilarities (calculated beforehand) between the observation to predict and the `k` neighbors. This step is referred to as `weighting "2"` in `locw`.
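
The two weighting steps described above can be sketched in base R for a single new observation. This is only an illustration: the helper `knn_predict()` and its weight formula `1 / (1 + d)` are assumptions made for the example, not the implementation used by `getknn` or `locw`.

```r
# Toy KNN prediction for one new observation (base R only).
# knn_predict() and its weight formula are illustrative assumptions,
# not the package's getknn()/locw() code.
knn_predict <- function(Xr, yr, xu, k, weighted = FALSE) {
  # Euclidean distances between the new observation and each reference row
  d <- sqrt(rowSums(sweep(Xr, 2, xu)^2))
  # Weighting "1": keep the k nearest neighbors
  nn <- order(d)[seq_len(k)]
  # Weighting "2": uniform weights (1/k), or weights decreasing with distance
  w <- if (weighted) 1 / (1 + d[nn]) else rep(1, k)
  w <- w / sum(w)
  if (is.numeric(yr)) {
    # Regression: weighted average of the neighbors' responses
    sum(w * yr[nn])
  } else {
    # Discrimination: class with the largest total weight
    votes <- tapply(w, as.character(yr[nn]), sum)
    names(votes)[which.max(votes)]
  }
}

Xr <- matrix(c(0, 1, 2, 10, 11, 12), ncol = 1)
yr_num <- c(1, 2, 3, 10, 11, 12)
yr_cls <- factor(c("a", "a", "a", "b", "b", "b"))

knn_predict(Xr, yr_num, xu = 1.2, k = 3)                   # plain mean of the 3 neighbors
knn_predict(Xr, yr_num, xu = 1.2, k = 3, weighted = TRUE)  # closer neighbors count more
knn_predict(Xr, yr_cls, xu = 10.5, k = 3)                  # most frequent class
```

With `weighted = FALSE` every neighbor receives the weight `1/k`, which corresponds to the standard KNN case; with `weighted = TRUE` the prediction is pulled toward the closest neighbors.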

In `knnr` and `knnda`, the dissimilarities can be calculated from the original (i.e. not compressed) data or from previously computed global PLS scores.
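
The difference between the two options can be illustrated with base R. In this rough sketch, PCA scores (`prcomp`) stand in for the global PLS scores, which is an assumption made only to keep the example free of extra packages; the helper `eucl()` and the simulated data are also hypothetical.

```r
set.seed(1)
Xr <- matrix(rnorm(40 * 6), nrow = 40)  # toy reference data
Xu <- matrix(rnorm(3 * 6), nrow = 3)    # toy new observations

# Option 1: Euclidean distances on the original (not compressed) data.
# Row i of the result holds the distances from Xu[i, ] to every row of Xr.
eucl <- function(A, B) {
  t(apply(B, 1, function(b) sqrt(rowSums(sweep(A, 2, b)^2))))
}
D_orig <- eucl(Xr, Xu)

# Option 2: Mahalanobis distances in a compressed score space.
# Assumption: PCA replaces the global PLS model here.
ncomp <- 2
pca <- prcomp(Xr, center = TRUE, scale. = FALSE)
Tr <- pca$x[, 1:ncomp, drop = FALSE]             # reference scores
Tu <- predict(pca, Xu)[, 1:ncomp, drop = FALSE]  # scores of the new observations
S <- cov(Tr)                                     # covariance of the reference scores
D_maha <- t(apply(Tu, 1, function(tu) {
  sqrt(mahalanobis(Tr, center = tu, cov = S))    # mahalanobis() returns squared distances
}))

dim(D_orig)  # 3 rows (new observations) by 40 columns (reference observations)
dim(D_maha)
```

Either distance matrix could then be used to rank the reference observations and pick the `k` nearest neighbors of each new observation.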

```
knnr(
Xr, Yr,
Xu, Yu = NULL,
ncompdis = NULL, diss = c("euclidean", "mahalanobis", "correlation"),
h = Inf, k,
stor = TRUE,
print = TRUE,
...
)
knnda(
Xr, Yr,
Xu, Yu = NULL,
ncompdis = NULL, diss = c("euclidean", "mahalanobis", "correlation"),
h = Inf, k,
stor = TRUE,
print = TRUE,
...
)
```

`Xr` |
A |

`Yr` |
A vector of length |

`Xu` |
A |

`Yu` |
A vector of length |

`diss` |
The type of dissimilarity used for defining the neighbors. Possible values are "euclidean" (default; Euclidean distance), "mahalanobis" (Mahalanobis distance), or "correlation". Correlation dissimilarities are calculated by sqrt(.5 * (1 - rho)). |

`ncompdis` |
A vector (possibly of length 1) defining the number(s) of components of the preliminary global PLS calculated on |

`h` |
A vector (possibly of length 1) defining the scaling shape factor(s) of the function of the weights applied to the neighbors in the weighted PLSR. Lower is |

`k` |
A vector (possibly of length 1) defining the number(s) of nearest neighbors to select in the reference data set for each observation to predict. Each component of |

`stor` |
Logical (default to |

`print` |
Logical (default = |

`...` |
Optional arguments to pass to function |
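
As a quick illustration of the correlation dissimilarity given for `diss`, sqrt(.5 * (1 - rho)) can be computed directly in base R. The helper name `cor_diss()` is hypothetical, and the `max(0, ...)` guard is an added precaution against rounding, not something stated in the documentation.

```r
# Correlation dissimilarity between two observations, following
# sqrt(.5 * (1 - rho)) where rho is their Pearson correlation.
cor_diss <- function(x, y) {
  rho <- cor(x, y)
  sqrt(max(0, 0.5 * (1 - rho)))  # guard against tiny negative values from rounding
}

x <- c(1, 2, 3, 4)
cor_diss(x, 2 * x + 1)  # perfectly correlated: 0, up to rounding
cor_diss(x, -x)         # perfectly anti-correlated: 1, up to rounding
```

The dissimilarity therefore ranges from 0 (rho = 1) to 1 (rho = -1).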

A list of outputs (see examples), such as:

`y` |
Responses for the test data. |

`fit` |
Predictions for the test data. |

`r` |
Residuals for the test data. |

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

```
data(datcass)
data(datforages)
######################## knnr
Xr <- datcass$Xr
yr <- datcass$yr
Xu <- datcass$Xu
yu <- datcass$yu
Xr <- detrend(Xr)
Xu <- detrend(Xu)
headm(Xr)
headm(Xu)
## A KNN-WR model where:
## The dissimilarities between the observations are defined
## by the Mahalanobis distances calculated from a global PLS score space
## of ncompdis = 10 components.
## - Weighting "1" = knn selection of k = {5, 10, 15, 20} neighbors
## - Weighting "2" = within each neighborhood, weights are calculated by "wdist"
ncompdis <- 10
h <- c(1, 2)
k <- seq(5, 20, by = 5)
fm <- knnr(
Xr, yr,
Xu, yu,
ncompdis = ncompdis, diss = "mahalanobis",
h = h, k = k,
print = TRUE
)
names(fm)
head(fm$y)
head(fm$fit)
head(fm$r)
z <- mse(fm, ~ ncompdis + h + k)
z
z[z$rmsep == min(z$rmsep), ]
group <- paste("ncompdis=", z$ncompdis, ", h=", z$h, sep = "")
plotxy(z[, c("k", "rmsep")], asp = 0, group = group, pch = 16)
## Same model, but where:
## The dissimilarities between the observations are defined
## by Euclidean distances calculated from the original (i.e. not compressed) X data
ncompdis <- NULL
h <- c(1, 2)
k <- seq(5, 20, by = 5)
fm <- knnr(
Xr, yr,
Xu, yu,
ncompdis = ncompdis, diss = "euclidean",
h = h, k = k,
print = TRUE
)
z <- mse(fm, ~ ncompdis + h + k)
z
z[z$rmsep == min(z$rmsep), ]
group <- paste("ncompdis=", z$ncompdis, ", h=", z$h, sep = "")
plotxy(z[, c("k", "rmsep")], asp = 0, group = group, pch = 16)
######################## knnda
Xr <- datforages$Xr
yr <- datforages$yr
Xu <- datforages$Xu
yu <- datforages$yu
Xr <- savgol(snv(Xr), n = 21, p = 2, m = 2)
Xu <- savgol(snv(Xu), n = 21, p = 2, m = 2)
headm(Xr)
headm(Xu)
table(yr)
table(yu)
## A knnDA model where:
## The dissimilarities between the observations are defined
## by the Mahalanobis distances calculated from a global PLS score space
## of ncompdis = 10 components.
## - Weighting "1" = knn selection of k = {5, 10, 15} neighbors
## - Weighting "2" = within each neighborhood, weights are calculated by "wdist"
ncompdis <- 10
h <- c(1, 2)
k <- seq(5, 15, by = 5)
fm <- knnda(
Xr, yr,
Xu, yu,
ncompdis = ncompdis, diss = "mahalanobis",
h = h, k = k,
print = TRUE
)
names(fm)
headm(fm$y)
headm(fm$fit)
headm(fm$r)
z <- err(fm, ~ ncompdis + h + k)
z
z[z$errp == min(z$errp), ]
group <- paste("ncompdis=", z$ncompdis, ", h=", z$h, sep = "")
plotxy(z[, c("k", "errp")], asp = 0, group = group, pch = 16)
```
