impute_missing_values: Impute missing values in a dataframe and add missingness...

View source: R/impute_missing_values.R

Description

Impute missing values in a dataframe. By default (type = "standard") numerics are median-imputed and factors are mode-imputed; alternatively, k-nearest neighbors (type = "knn") can be used. Optionally add missingness indicators.
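As a rough illustration of the median/mode strategy described above, the following base-R sketch fills numeric columns with their median and factor columns with their most frequent level. The helper name `impute_standard_sketch` and the toy data are hypothetical and are not part of ck37r; the package's actual implementation may differ.

```r
# Hypothetical helper (not part of ck37r): a minimal sketch of
# median imputation for numerics and mode imputation for factors.
impute_standard_sketch <- function(data, skip_vars = c()) {
  for (var in setdiff(names(data), skip_vars)) {
    missing <- is.na(data[[var]])
    if (!any(missing)) next
    if (is.numeric(data[[var]])) {
      # Numeric column: fill with the median of the observed values.
      data[[var]][missing] <- median(data[[var]], na.rm = TRUE)
    } else if (is.factor(data[[var]])) {
      # Factor column: fill with the most frequent observed level (the mode).
      counts <- table(data[[var]])
      data[[var]][missing] <- names(counts)[which.max(counts)]
    }
  }
  data
}

# Toy data, invented for illustration only.
toy <- data.frame(
  age = c(21, NA, 35, 50),
  group = factor(c("a", "b", NA, "b")),
  outcome = c(1, NA, 0, 1)
)

# Skip the outcome column, mirroring the skip_vars argument.
imputed <- impute_standard_sketch(toy, skip_vars = "outcome")
```

Columns named in `skip_vars` are left untouched, so any missing values in them remain missing.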

Usage

impute_missing_values(data, type = "standard", add_indicators = T,
  prefix = "miss_", skip_vars = c(), verbose = F)

Arguments

data

Dataframe or matrix.

type

Imputation method: "knn" (k-nearest neighbors) or "standard" (median for numerics, mode for factors). Note: "knn" returns the data centered and scaled.
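To make the centering-and-scaling note concrete, here is a small base-R sketch of what the transformation does to a single column and how values could be mapped back to the original units. It does not perform k-nearest neighbors imputation: the fill value on the scaled axis is a placeholder, and all variable names are invented for illustration.

```r
# Toy column with one missing value (invented for illustration).
x <- c(10, 20, NA, 40)

# Centering and scaling uses the mean and SD of the observed values.
center <- mean(x, na.rm = TRUE)
spread <- sd(x, na.rm = TRUE)
z <- (x - center) / spread

# Placeholder for an imputation step on the scaled data: here the
# missing entry is simply set to 0 (the scaled mean), NOT a knn estimate.
z_filled <- z
z_filled[is.na(z_filled)] <- 0

# Map back to the original units using the saved mean and SD.
x_back <- z_filled * spread + center
```

If results are needed in the original units, the column means and standard deviations therefore have to be kept from before the transformation.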

add_indicators

If TRUE, add a series of missingness indicators.

prefix

String to add at the beginning of the name of each missingness indicator.
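The sketch below shows one way indicator columns named with such a prefix could be built in base R. The helper `add_miss_indicators_sketch` is hypothetical, and creating indicators only for columns that contain missing values is an assumption of this sketch rather than documented behavior of the package.

```r
# Hypothetical helper (not part of ck37r): add 0/1 missingness
# indicators named <prefix><variable>.
add_miss_indicators_sketch <- function(data, prefix = "miss_", skip_vars = c()) {
  for (var in setdiff(names(data), skip_vars)) {
    missing <- is.na(data[[var]])
    # Assumption: only columns with at least one NA get an indicator.
    if (any(missing)) {
      data[[paste0(prefix, var)]] <- as.integer(missing)
    }
  }
  data
}

# Toy data, invented for illustration only.
toy2 <- data.frame(
  glucose = c(148, NA, 183),
  mass = c(33.6, 26.6, NA),
  age = c(50, 31, 32)
)

with_ind <- add_miss_indicators_sketch(toy2)
names(with_ind)
```

Indicators of this kind have to be computed before the values are imputed, since the missingness pattern is no longer visible afterwards.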

skip_vars

Vector of variable names to exclude from the imputation.

verbose

If TRUE, display extra information during execution.

Value

List with the following elements:

See Also

missingness_indicators, preProcess

Examples

# Load a test dataset.
data(PimaIndiansDiabetes2, package = "mlbench")

# Check for missing values.
colSums(is.na(PimaIndiansDiabetes2))

# Impute missing data and add missingness indicators.
# Don't impute the outcome though.
result = impute_missing_values(PimaIndiansDiabetes2, skip_vars = "diabetes")

# Confirm we have no missing data.
colSums(is.na(result$data))


#############
# K-nearest neighbors imputation

result2 = impute_missing_values(PimaIndiansDiabetes2, type = "knn", skip_vars = "diabetes")

# Confirm we have no missing data.
colSums(is.na(result2$data))

ck37r documentation built on June 4, 2017, 1:02 a.m.