## Imputation of missing values in compositional data using knn methods

### Description

This function offers several k-nearest neighbor methods for the imputation of missing values in compositional data.

### Usage

``````
impKNNa(
  x,
  method = "knn",
  k = 3,
  metric = "Aitchison",
  agg = "median",
  primitive = FALSE,
  normknn = TRUE,
  das = FALSE,
  adj = "median"
)
``````

### Arguments

| Argument | Description |
|---|---|
| `x` | data frame or matrix |
| `method` | method (at the moment, only "knn" can be used) |
| `k` | number of nearest neighbors chosen for imputation |
| `metric` | "Aitchison" or "Euclidean" |
| `agg` | "median" or "mean", for the aggregation of the nearest neighbors |
| `primitive` | if TRUE, a more enhanced search for the `k` nearest neighbors is obtained (see Details) |
| `normknn` | if TRUE, an adjustment of the imputed values is performed |
| `das` | deprecated. If TRUE, the definition of the Aitchison distance based on simple logratios of the compositional parts (Aitchison et al., 2000) is used to calculate distances between observations. If FALSE, a version using the clr transformation is used. |
| `adj` | either "median" (default) or "sum" can be chosen for the adjustment of the nearest neighbors, see Hron et al., 2010 |

### Details

The Aitchison `metric` should be chosen when dealing with compositional data, the Euclidean `metric` otherwise.
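To illustrate why the Aitchison metric suits compositional data, the following base R sketch computes the distance in two ways: through the clr transformation and through pairwise logratios (the two variants that the `das` argument distinguishes). The helper names `clr`, `aitchison_dist`, and `aitchison_dist_lr` are hypothetical and written for this example only; they are not the functions the package uses internally.

```r
# Centred logratio (clr) transform of one composition with strictly positive parts.
clr <- function(x) log(x) - mean(log(x))

# Aitchison distance as the Euclidean distance between clr-transformed compositions.
aitchison_dist <- function(x, y) sqrt(sum((clr(x) - clr(y))^2))

# Equivalent form based on all pairwise logratios, divided by the number of parts D.
aitchison_dist_lr <- function(x, y) {
  u <- log(x) - log(y)                 # u_i - u_j = log(x_i/x_j) - log(y_i/y_j)
  D <- length(x)
  diffs <- outer(u, u, "-")            # all pairwise differences u_i - u_j
  sqrt(sum(diffs[upper.tri(diffs)]^2) / D)
}

a <- c(10, 30, 60)
b <- 5 * a                             # same ratios, different total
d <- c(20, 30, 50)

aitchison_dist(a, b)                   # essentially zero: rescaling is ignored
sqrt(sum((a - b)^2))                   # the Euclidean distance is not zero
aitchison_dist(a, d)
aitchison_dist_lr(a, d)                # agrees with the clr-based version
```

Because only the ratios between parts enter the distance, two observations that differ merely in their total are treated as identical, which is the behaviour wanted for compositions.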

If `primitive == FALSE`, a sequential search for the `k` nearest neighbors is applied for every missing value, using observations for which all information corresponding to the non-missing cells, plus the information in the variable to be imputed, plus some additional information, is available. If `primitive == TRUE`, the `k` nearest neighbors are searched among observations in which, in addition to the variable to be imputed, any further cells are non-missing.
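The general idea of a nearest-neighbor search for a single missing cell can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's algorithm: donors are taken to be rows in which the variable to be imputed and all parts observed in the target row are present, distances are Aitchison distances on those observed parts, and no adjustment is applied. The data values and the helper `impute_cell` are hypothetical.

```r
# Toy compositional data (hypothetical values); row 1 has a missing third part.
x <- rbind(
  c(10, 30, NA, 20),
  c(12, 28, 42, 18),
  c(50, 10, 30, 10),
  c(11, 31, 39, 19),
  c( 9, 33, 38, 20)
)

clr <- function(v) log(v) - mean(log(v))
aitchison_dist <- function(a, b) sqrt(sum((clr(a) - clr(b))^2))

impute_cell <- function(x, i, j, k = 3, agg = median) {
  obs <- which(!is.na(x[i, ]))         # parts observed in the target row
  # Assumed donor rule: part j and all of the target's observed parts are present.
  usable <- stats::complete.cases(x[, c(obs, j), drop = FALSE])
  donors <- setdiff(which(usable), i)
  # Distance between target and each donor on the commonly observed parts.
  d <- vapply(donors, function(l) aitchison_dist(x[i, obs], x[l, obs]), numeric(1))
  nn <- donors[order(d)][seq_len(min(k, length(donors)))]
  list(neighbors = nn, value = agg(x[nn, j]))   # aggregate the donors' values
}

res <- impute_cell(x, i = 1, j = 3, k = 3)
res$neighbors                          # rows chosen as nearest neighbors
res$value                              # median of their third part
```

Row 3 has very different ratios from row 1 and is therefore not selected; the imputed value is the median of the third part over the three remaining rows.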

If `normknn` is TRUE (the preferred option), the imputed cells from a nearest neighbor method are adjusted with special adjustment factors; for details, see the References.
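The motivation for such an adjustment is that a neighbor can have similar ratios but a different overall size, so its raw value may not fit the scale of the row being imputed. The sketch below shows one plausible form of the factors, assuming they are the ratio of the medians (or sums) of the commonly observed parts, loosely following the "median" and "sum" options of `adj`. The exact definition is given in Hron et al. (2010); the values and variable names here are hypothetical.

```r
target   <- c(10, 30, NA, 20)          # hypothetical row with a missing third part
neighbor <- c(24, 56, 84, 36)          # hypothetical donor on a larger scale

obs <- !is.na(target)                  # parts observed in the target row

# Assumed factor for the "median" option: ratio of medians of the observed parts.
f_median <- median(target[obs]) / median(neighbor[obs])
# Assumed factor for the "sum" option: ratio of sums of the observed parts.
f_sum <- sum(target[obs]) / sum(neighbor[obs])

raw_value       <- neighbor[3]         # value the donor would contribute unadjusted
adjusted_median <- f_median * raw_value
adjusted_sum    <- f_sum * raw_value

c(raw = raw_value, median_adj = adjusted_median, sum_adj = adjusted_sum)
```

With several neighbors, each donated value would be rescaled in this way before the values are aggregated with `agg`.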

### Value

| Component | Description |
|---|---|
| `xOrig` | Original data frame or matrix |
| `xImp` | Imputed data |
| `w` | Number of imputed values |
| `wind` | Index of the missing values in the data |
| `metric` | Metric used |

### Author(s)

Matthias Templ

### References

Aitchison, J., Barcelo-Vidal, C., Martin-Fernandez, J.A., Pawlowsky-Glahn, V. (2000) Logratio analysis and compositional distance. Mathematical Geology, 32(3), 271-275.

Hron, K., Templ, M., Filzmoser, P. (2010) Imputation of missing values for compositional data using classical and robust methods. Computational Statistics and Data Analysis, 54(12), 3095-3107.

### See Also

`impCoda`

### Examples

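``````
library(robCompositions)  # provides impKNNa() and the expenditures data

data(expenditures)
x <- expenditures
x[1, 3]                   # original value, kept for comparison
x[1, 3] <- NA             # set one cell to missing
xi <- impKNNa(x)$xImp     # impute with the default settings
xi[1, 3]                  # imputed value
``````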

