replace_missing_data: Replace missing data with median ± random noise

View source: R/replace_missing_data.R

Description

Replace missing values in each numeric column of a data frame with the column median plus a small amount of random noise. This makes the data usable for training classifiers that cannot readily handle missing values (e.g. random forests or support vector machines).

Usage

replace_missing_data(dat, noise_pct = 0.05)

Arguments

dat

the data frame to replace missing data in

noise_pct

the standard deviation of the normal distribution from which the added noise is drawn, expressed as a proportion of the standard deviation of the non-missing values in each column (the default of 0.05 corresponds to 5%)

Value

a data frame with missing values in each numeric column replaced by the column median, plus or minus some random noise
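
The behaviour described above can be illustrated with a minimal base R sketch. This is not the PrInCE source: the function name impute_median_noise, the toy data frame, and the handling of edge cases (columns with nothing missing, or with too few observed values to estimate a standard deviation) are assumptions made for illustration, as is the reading of noise_pct as a proportion rather than a number of percentage points.

```r
# Illustrative sketch of the documented behaviour. The name
# `impute_median_noise` is hypothetical and is not part of PrInCE.
impute_median_noise <- function(dat, noise_pct = 0.05) {
  for (col in names(dat)) {
    x <- dat[[col]]
    if (!is.numeric(x)) next            # non-numeric columns are left untouched
    miss <- is.na(x)
    if (!any(miss) || all(miss)) next   # nothing to fill, or nothing to estimate from
    # Assumption: noise_pct is a proportion (0.05 = 5%) of the observed SD
    noise_sd <- noise_pct * sd(x, na.rm = TRUE)
    if (is.na(noise_sd)) noise_sd <- 0  # only one observed value: add no noise
    x[miss] <- median(x, na.rm = TRUE) +
      rnorm(sum(miss), mean = 0, sd = noise_sd)
    dat[[col]] <- x
  }
  dat
}

set.seed(1)  # make the random noise reproducible
toy <- data.frame(a = c(1, 2, NA, 4, 5),
                  b = c("u", "v", NA, "x", "y"),
                  stringsAsFactors = FALSE)
filled <- impute_median_noise(toy)
filled
```

In this toy example the missing entry in column a is replaced by a value close to the median of the observed values (3), while the observed values and the character column b are returned unchanged.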


PrInCE documentation built on Nov. 8, 2020, 6:34 p.m.