knnImp: Fill in NA values with the values of the nearest neighbours
In ltorgo/performanceEstimation: An Infra-Structure for Performance Estimation of Predictive Models

Description Usage Arguments Details Value Author(s) References See Also Examples

Fills in all NA values in a data set using the k nearest neighbours of each case that contains NAs. Each NA is replaced by the median (numeric variables) or the most frequent value (factors) among those neighbours.

knnImp(data, k = 10, scale = TRUE, distData = NULL)

`data`	A data frame containing the data set
`k`	The number of nearest neighbours to use (defaults to 10)
`scale`	Boolean indicating whether the data should be scaled before finding the nearest neighbours (defaults to `TRUE`)
`distData`	Optionally, a data frame containing the data set in which the neighbours should be searched. This is useful when filling in NA values on a test set, where only information from the training set should be used. Defaults to `NULL`, which means that the neighbours are searched in `data`
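The role of `distData` is easiest to see with a train/test split. The following is a minimal sketch on hypothetical toy data (the names `toy`, `train`, `test` and `cleanTest` are illustrative, not part of the package). It only calls `knnImp` if the performanceEstimation package happens to be installed.

```r
# Hypothetical toy data: 40 complete training rows and 10 test rows
set.seed(1)
toy <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
train <- toy[1:40, ]
test  <- toy[41:50, ]
test$x2[c(2, 7)] <- NA  # introduce unknown values in the test set only

if (requireNamespace("performanceEstimation", quietly = TRUE)) {
  # Neighbours are searched in `train`; only the cases of `test` are filled in
  cleanTest <- performanceEstimation::knnImp(test, k = 5, distData = train)
  print(anyNA(cleanTest))
}
```

Passing the training set as `distData` keeps information from the other test cases out of the imputation, which is the situation the argument description refers to.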

This function uses the k-nearest neighbours to fill in the unknown (NA) values in a data set. For each case with any NA value it will search for its k most similar cases and use the values of these cases to fill in the unknowns.

To fill in each NA, the function uses the median of the neighbours' values for numeric variables, or their most frequent value for factors.
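The procedure described above can be sketched in base R. This is not the package's implementation: the text does not say which distance is used or how incomplete neighbours are treated, so the sketch assumes Euclidean distance over the numeric columns that are known in the target case, and assumes neighbours are drawn only from complete cases. The names `central_value` and `knn_fill_sketch` are hypothetical.

```r
# Median for numeric vectors, most frequent value otherwise
central_value <- function(v) {
  if (is.numeric(v)) return(median(v))
  names(which.max(table(v)))
}

# Illustrative sketch only; assumptions are flagged in the comments
knn_fill_sketch <- function(data, k = 10, scale = TRUE, distData = NULL) {
  # Candidate neighbours come from distData when given, otherwise from data
  pool <- if (is.null(distData)) data else distData
  # Assumption: only complete cases can act as neighbours
  pool <- pool[complete.cases(pool), , drop = FALSE]
  stopifnot(nrow(pool) >= k)

  # Assumption: distances use the numeric columns only
  num <- names(data)[vapply(data, is.numeric, logical(1))]
  X <- as.matrix(data[num])
  P <- as.matrix(pool[num])
  if (scale) {
    # Standardise both matrices with the mean and sd of the neighbour pool
    mu <- colMeans(P)
    s <- apply(P, 2, sd)
    X <- sweep(sweep(X, 2, mu), 2, s, "/")
    P <- sweep(sweep(P, 2, mu), 2, s, "/")
  }

  for (i in which(!complete.cases(data))) {
    # Euclidean distance over the numeric columns known for case i
    known <- num[!is.na(X[i, ])]
    d <- sqrt(rowSums(sweep(P[, known, drop = FALSE], 2, X[i, known])^2))
    nn <- order(d)[seq_len(k)]
    # Replace each NA by the central value of the k nearest neighbours
    for (j in names(data)[is.na(data[i, ])]) {
      data[i, j] <- central_value(pool[nn, j])
    }
  }
  data
}

# Toy usage: the NA in row 3 is filled from its 2 nearest complete cases
df <- data.frame(a = c(1, 2, 3, 4, 100), b = c(10, 20, NA, 40, 1000))
filled <- knn_fill_sketch(df, k = 2, scale = FALSE)
```

Using the median rather than the mean keeps a single extreme neighbour from dominating the filled-in value.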

A data frame without NA values

Luis Torgo ltorgo@dcc.fc.up.pt

Torgo, L. (2014) An Infra-Structure for Performance Estimation and Experimental Comparison of Predictive Models in R. arXiv:1412.0436 [cs.MS] http://arxiv.org/abs/1412.0436

na.omit

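## Not run: 
library(performanceEstimation)
# The algae data set from the DMwR package contains NA values
data(algae, package = "DMwR")
# Fill in every NA using the default of k = 10 nearest neighbours
cleanAlgae <- knnImp(algae)
summary(cleanAlgae)

## End(Not run)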

ltorgo/performanceEstimation documentation built on May 21, 2019, 8:41 a.m.
