missingFix | R Documentation |
This function imputes missing values in a data frame using separate methods for numerical and categorical variables, and can add flag columns that indicate which values were missing. For numerical variables, missing values can be imputed using the mean or median. For categorical variables, missing values can be imputed using the mode or a new level. The function also removes constant columns, meaning columns that are entirely NA or that are fully observed but contain a single repeated value.
missingFix(data, missingMethod = c("medianFlag", "newLevel"))
data |
A data frame containing the data to be processed. Missing values
( |
missingMethod |
A character vector of length 2 specifying the methods
for imputing missing values. The first element specifies the method for
numerical variables ( |
A list with two elements:
data |
The original data frame with missing values imputed, and flag columns added if applicable. |
ref |
A reference row containing the imputed values and flag levels, which can be used for future predictions or reference. |
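As a rough illustration of how a reference row could be reused on new data, the base-R sketch below fills missing values in a new data frame from a one-row data frame. The helper `fill_from_ref()`, the stand-in `ref_row`, and the assumption that the reference row shares column names with the new data are all hypothetical; the actual layout of `ref` returned by `missingFix()` may differ.

```r
# Hypothetical helper (not part of the package): fill NAs in `newdata`
# using the values stored in a one-row reference data frame `ref`.
fill_from_ref <- function(newdata, ref) {
  shared <- intersect(names(newdata), names(ref))
  for (nm in shared) {
    miss <- is.na(newdata[[nm]])
    if (any(miss)) {
      newdata[[nm]][miss] <- ref[[nm]][1]
    }
  }
  newdata
}

# Stand-in reference row; the real `ref` layout is an assumption here.
ref_row <- data.frame(X5 = 3, X6 = "B", stringsAsFactors = FALSE)

# New observations with missing entries.
new_obs <- data.frame(X5 = c(NA, 7), X6 = c("A", NA), stringsAsFactors = FALSE)

filled <- fill_from_ref(new_obs, ref_row)
filled
```

Columns that appear in only one of the two data frames are left untouched in this sketch.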
dat <- data.frame(
X1 = rep(NA, 5),
X2 = factor(rep(NA, 5), levels = LETTERS[1:3]),
X3 = 1:5,
X4 = LETTERS[1:5],
X5 = c(NA, 2, 3, 10, NA),
X6 = factor(c("A", NA, NA, "B", "B"), levels = LETTERS[1:3])
)
missingFix(dat)
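To make the described behavior concrete, here is a minimal base-R sketch of the same ideas: dropping constant columns, median imputation with a flag column for numerical variables, and a new level for categorical variables. It is not the package's implementation. The helper names, the `_FLAG` suffix, and the `"Missing"` level label are assumptions chosen for illustration.

```r
# Illustrative sketch only; names and conventions below are assumptions.

# A column is treated as constant if it is all NA, or fully observed
# with a single distinct value.
is_constant <- function(x) {
  all(is.na(x)) || (!anyNA(x) && length(unique(x)) == 1L)
}

impute_sketch <- function(df, flag_suffix = "_FLAG", new_level = "Missing") {
  keep <- !vapply(df, is_constant, logical(1))
  df <- df[, keep, drop = FALSE]
  for (nm in names(df)) {
    x <- df[[nm]]
    miss <- is.na(x)
    if (!any(miss)) next
    if (is.numeric(x)) {
      # Median imputation plus a 0/1 flag column marking imputed rows.
      x[miss] <- median(x, na.rm = TRUE)
      df[[nm]] <- x
      df[[paste0(nm, flag_suffix)]] <- as.integer(miss)
    } else {
      # New-level imputation; note that unused factor levels are dropped.
      x <- as.character(x)
      x[miss] <- new_level
      df[[nm]] <- factor(x)
    }
  }
  df
}

dat <- data.frame(
  X1 = rep(NA, 5),
  X2 = factor(rep(NA, 5), levels = LETTERS[1:3]),
  X3 = 1:5,
  X4 = LETTERS[1:5],
  X5 = c(NA, 2, 3, 10, NA),
  X6 = factor(c("A", NA, NA, "B", "B"), levels = LETTERS[1:3])
)

out <- impute_sketch(dat)
out
```

In this sketch, `X1` and `X2` are dropped because they are entirely NA, the missing entries of `X5` are replaced by 3 (the median of 2, 3, and 10), and the missing entries of `X6` become the level `"Missing"`.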