impKNNa | R Documentation |
This function offers several k-nearest neighbor methods for the imputation of missing values in compositional data.
impKNNa(
x,
method = "knn",
k = 3,
metric = "Aitchison",
agg = "median",
primitive = FALSE,
normknn = TRUE,
das = FALSE,
adj = "median"
)
x |
data frame or matrix |
method |
method (at the moment, only “knn” can be used) |
k |
number of nearest neighbors chosen for imputation |
metric |
“Aitchison” or “Euclidean” |
agg |
“median” or “mean”, for the aggregation of the nearest neighbors |
primitive |
if TRUE, a more enhanced search for the k-nearest neighbors is obtained (see details) |
normknn |
An adjustment of the imputed values is performed if TRUE |
das |
Deprecated. If TRUE, the definition of the Aitchison distance based on simple logratios of the compositional parts (Aitchison et al., 2000) is used to calculate distances between observations. If FALSE, a version using the clr transformation is used. |
adj |
either ‘median’ (default) or ‘sum’ can be chosen for the adjustment of the nearest neighbors, see Hron et al., 2010. |
The Aitchison metric should be chosen when dealing with compositional data, the Euclidean metric otherwise.
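The difference between the two metrics can be illustrated with a small base R sketch. It uses the clr-based form of the Aitchison distance mentioned under das; the helper names clr, aitchison_dist and euclid_dist are assumptions made for this illustration and are not functions exported by the package.

```r
# Hypothetical helpers for illustration only (not part of the package API).

# Centered logratio (clr) transform of a strictly positive composition.
clr <- function(x) log(x) - mean(log(x))

# Aitchison distance: Euclidean distance between clr-transformed vectors.
aitchison_dist <- function(x, y) sqrt(sum((clr(x) - clr(y))^2))

# Plain Euclidean distance on the raw values, for comparison.
euclid_dist <- function(x, y) sqrt(sum((x - y)^2))

a <- c(10, 20, 70)
b <- c(30, 30, 40)

# Rescaling a composition (e.g. percentages to proportions) leaves the
# Aitchison distance unchanged but changes the Euclidean distance.
aitchison_dist(a, b)
aitchison_dist(a / 100, b)
euclid_dist(a, b)
euclid_dist(a / 100, b)
```

Because only the ratios between parts carry information in compositional data, a distance that is invariant to rescaling is usually the more sensible choice there.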
If primitive == FALSE, a sequential search for the k-nearest neighbors is applied for every missing value where all information corresponding to the non-missing cells plus the information in the variable to be imputed plus some additional information is available. If primitive == TRUE, a search of the k-nearest neighbors among observations is applied where in addition to the variable to be imputed any further cells are non-missing.
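The general idea of a donor search can be sketched in base R as follows. This is a deliberately simplified illustration and not the package's implementation of either search strategy: it assumes donors must have the target variable and all cells observed in the incomplete row, measures the Aitchison distance on those shared cells, and aggregates with the median without any normknn adjustment. The function name knn_impute_cell and the toy data are hypothetical.

```r
# Hypothetical helper: impute x[i, j] from the k nearest donor rows.
# Simplified sketch; no adjustment factors are applied.
knn_impute_cell <- function(x, i, j, k = 3) {
  x <- as.matrix(x)
  obs <- which(!is.na(x[i, ]))  # cells observed in the incomplete row
  # Donors: rows where the target cell and all of row i's observed cells exist.
  donors <- which(stats::complete.cases(x[, c(obs, j), drop = FALSE]))
  donors <- setdiff(donors, i)
  clr <- function(v) log(v) - mean(log(v))
  # Aitchison distance on the shared observed cells.
  d <- vapply(donors, function(r) {
    sqrt(sum((clr(x[i, obs]) - clr(x[r, obs]))^2))
  }, numeric(1))
  nn <- donors[order(d)][seq_len(min(k, length(donors)))]
  median(x[nn, j])  # aggregate the neighbors' values (agg = "median")
}

# Toy data: row 1 has a missing third part.
toy <- rbind(
  c(10, 20, NA),
  c(11, 21, 68),
  c(12, 19, 69),
  c( 9, 22, 71),
  c(60, 30, 10)
)
knn_impute_cell(toy, i = 1, j = 3, k = 3)
```

In the toy data the last row has a very different ratio between the first two parts, so it is not selected as a neighbor and does not influence the imputed value.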
If normknn is TRUE (the preferred option), the imputed cells from a nearest neighbor method are adjusted with special adjustment factors. More details can be found online (see the references).
xOrig |
Original data frame or matrix |
xImp |
Imputed data |
w |
Number of imputed values |
wind |
Index of the missing values in the data |
metric |
Metric used |
Matthias Templ
Aitchison, J., Barcelo-Vidal, C., Martin-Fernandez, J.A., Pawlowsky-Glahn, V. (2000) Logratio analysis and compositional distance, Mathematical Geology, 32(3), 271-275.
Hron, K., Templ, M., Filzmoser, P. (2010) Imputation of missing values for compositional data using classical and robust methods, Computational Statistics and Data Analysis, 54(12), 3095-3107.
impCoda
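library(robCompositions)
data(expenditures)
x <- expenditures
x[1, 3]                # original value, kept for comparison
x[1, 3] <- NA          # set one cell to missing
xi <- impKNNa(x)$xImp  # impute with the default settings
xi[1, 3]               # imputed value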