This function takes an x and a y vector and attempts to replace the missing values in x. Replacement can use the mean or median of the non-missing values, or a regression-based technique. The regression technique first finds the predicted response value of y for the non-missing elements of x. It then takes the average value of y over the observations where x is missing and finds the x value whose predicted response most closely matches that average.
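The three replacement rules described above can be sketched in base R. This is a minimal illustration for numeric x only, not the package's actual code: the helper name `sketch_replacement_value` is hypothetical, and the use of `lm` for the regression step is an assumption based on the `"linear"` option shown in the examples.

```r
# Hypothetical helper, NOT the NAReplace implementation.
# Returns the single value that would be used to fill the NAs in a numeric x.
sketch_replacement_value <- function(x, y, replace = c("mean", "median", "linear")) {
  replace <- match.arg(replace)
  miss <- is.na(x)

  if (replace == "mean") {
    return(mean(x[!miss]))
  }
  if (replace == "median") {
    return(median(x[!miss]))
  }

  # Regression route (assumed here to be an ordinary linear model):
  # 1. fit y on the observed x values and keep the fitted responses
  obs <- data.frame(x = x[!miss], y = y[!miss])
  fit <- lm(y ~ x, data = obs)
  pred <- fitted(fit)

  # 2. average response among the rows where x is missing
  target <- mean(y[miss])

  # 3. observed x whose fitted response is closest to that average
  obs$x[which.min(abs(pred - target))]
}

x <- c(1, 2, NA, 4, 5)
y <- c(1, 2, 3.8, 4, 5)
sketch_replacement_value(x, y, "mean")
sketch_replacement_value(x, y, "linear")
```

Note that the regression route in this sketch always returns one of the observed x values, whereas the mean and median can return a value that never occurs in x.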
x
independent variable, must be a vector. If x is a factor, it will be converted to a character vector
y
dependent variable, must be a vector and can't contain missing values
replace
The type of
A list containing
x_orig
The original x vector
x_replace
The new vector with missing values filled in
na_indicator
A dummy variable (0 or 1) that marks the indices of the missing values
replacement_val
The value that the missing values were replaced with
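To make the relationship between the four list elements concrete, the following base R sketch assembles a list of the same shape by hand. It uses mean replacement on a small made-up vector and does not call NAReplace; the element names are taken from the description above, and everything else is illustrative.

```r
# Illustrative only: build a list shaped like the documented return value.
x <- c(3, NA, 7, NA, 10)

miss <- is.na(x)
replacement_val <- mean(x, na.rm = TRUE)  # mean of the observed values

x_replace <- x
x_replace[miss] <- replacement_val        # fill every NA with the same value

result <- list(
  x_orig = x,
  x_replace = x_replace,
  na_indicator = as.integer(miss),        # 1 where x was missing, 0 otherwise
  replacement_val = replacement_val
)

# Positions that were filled in
which(result$na_indicator == 1)
```

Keeping na_indicator alongside x_replace lets a later model distinguish observed values from filled-in ones.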
y <- c(sample(c(0, 1), 25, replace = TRUE, prob = c(.25, .75)),
       sample(c(0, 1), 25, replace = TRUE, prob = c(.4, .6)),
       sample(c(0, 1), 25, replace = TRUE, prob = c(.6, .4)),
       sample(c(0, 1), 25, replace = TRUE, prob = c(.75, .25)))
set.seed(924516)
x_real <- (seq(1, 10, length.out = 100))^2 + runif(100, 0, 5)
x <- x_real
x[1:50][runif(50) < .33] <- NA
nafix <- NAReplace(x, y, replace = "mean")
nafix$replacement_val
# [1] 45.56847
nafix <- NAReplace(x, y, replace = "linear")
nafix$replacement_val
# [1] 11.80703
x_char <- c(sample(letters[1:10], 50, replace = TRUE),
            sample(letters[1:5], 50, replace = TRUE))
x_char[50:100][runif(50) < .25] <- NA
nafix <- NAReplace(x_char, y, replace = "median")
nafix$replacement_val
# [1] "c"
nafix <- NAReplace(x_char, y, replace = "linear")
nafix$replacement_val
# [1] "d"