knn.impute: Perform imputation of a data frame using k-NN.

View source: R/imputation.R

Description

Impute missing data in a data frame using the k-Nearest Neighbour algorithm. A missing value of a discrete variable is replaced by the mode of that variable among the nearest neighbours; a missing value of a continuous variable is replaced by the median instead.
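The aggregation rule described above can be sketched in base R. This is only an illustration of "mode for discrete, median for continuous", not the package's own code: the helper names `stat_mode` and `aggregate_neighbours` are hypothetical, and the tie-breaking rule (first value encountered wins) is an assumption.

```r
# Hypothetical helper: most frequent non-missing value.
# Ties are resolved in favour of the value seen first (an assumption).
stat_mode <- function(v) {
  v <- v[!is.na(v)]
  u <- unique(v)
  u[which.max(tabulate(match(v, u)))]
}

# Hypothetical helper: combine the values observed among the neighbours
# into a single imputed value.
aggregate_neighbours <- function(values, categorical) {
  if (categorical) stat_mode(values) else median(values, na.rm = TRUE)
}

aggregate_neighbours(c(2, 3, 3, 1, 3), categorical = TRUE)    # 3
aggregate_neighbours(c(1.5, 2.0, 10.0), categorical = FALSE)  # 2
```

The median is less sensitive than the mean to a single outlying neighbour, which is visible in the second call: the value 10.0 does not pull the result away from 2.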

Usage

knn.impute(
  data,
  k = 10,
  cat.var = 1:ncol(data),
  to.impute = 1:nrow(data),
  using = 1:nrow(data)
)

Arguments

data

a numerical matrix.

k

number of neighbours to use. For categorical variables the mode of the neighbours' values is used; for continuous variables the median is used instead. Default: 10.

cat.var

vector containing the indices of the variables to be considered as categorical. Default: all variables.

to.impute

vector indicating which rows of the dataset are to be imputed. Default: impute all rows.

using

vector indicating which rows of the dataset are to be used to search for neighbours. Default: use all rows.

Value

imputed matrix.
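A minimal usage sketch follows. It assumes that `knn.impute` is available with the signature shown under Usage from an installed package named `bnstruct`; the package name is an assumption, since this page documents a fork. The toy matrix and its column names are made up for illustration, and the categorical column is coded with positive integers as a precaution.

```r
# Toy numeric matrix: column 1 is a categorical code, column 2 is continuous.
# All values are invented for illustration only.
x <- cbind(
  group  = c(1, 1, 2, 2, 1, 2, NA, 1),
  height = c(1.62, 1.70, NA, 1.81, 1.68, 1.79, 1.75, NA)
)

# Assumption: the function is exported by a package called "bnstruct".
# The call is skipped when that package is not installed.
if (requireNamespace("bnstruct", quietly = TRUE)) {
  imputed <- bnstruct::knn.impute(
    x,
    k = 3,        # use the 3 nearest neighbours instead of the default 10
    cat.var = 1   # only column 1 is categorical; column 2 is continuous
  )
  print(imputed)
}
```

Passing `cat.var` explicitly matters here: with the default, every column is treated as categorical, so the continuous column would be filled with a mode rather than a median.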


tavazzie/bnstructScore documentation built on Dec. 23, 2021, 7:47 a.m.