na_preprocessing: NA preprocessing

View source: R/na_preprocessing.R

na_preprocessing    R Documentation

NA preprocessing

Description

Sometimes na.omit cannot be used on a data set because it would leave too few rows to do anything sensible. This requirement can be relaxed: in practice only NA_integer_ and NA_real_ have to be omitted, while NA_character_ can be retained by adding NA as a factor level. To achieve this, loop through the variables in your data frame:

  • if a variable x is already a factor and anyNA(x) is TRUE, do x <- addNA(x). The "and" is important: if x has no NA, addNA(x) will add an unused <NA> level.

  • if a variable x is a character, do x <- factor(x, exclude = NULL) to coerce it to a factor. exclude = NULL will retain <NA> as a level.

  • if x is "logical", "numeric", "raw" or "complex", nothing should be changed. NA is just NA.
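
The three rules above can be sketched in a few lines of base R. This is only an illustration of the described loop, not the package's actual source; the name na_as_level_sketch and the toy data frame are hypothetical.

```r
# Hypothetical helper illustrating the rules above (not the lamisc source).
na_as_level_sketch <- function(dat) {
  for (j in seq_along(dat)) {
    x <- dat[[j]]
    if (is.factor(x)) {
      # Only add the level when an NA is present,
      # otherwise addNA() would create an unused <NA> level.
      if (anyNA(x)) dat[[j]] <- addNA(x)
    } else if (is.character(x)) {
      # exclude = NULL keeps NA as a level during coercion.
      dat[[j]] <- factor(x, exclude = NULL)
    }
    # logical, numeric, raw and complex columns are left unchanged.
  }
  dat
}

# Toy data: a numeric column, a factor with NA, a character with NA,
# and a factor without NA.
toy <- data.frame(y = c(1, NA, 3),
                  f = factor(c("a", NA, "b")),
                  g = c("u", "v", NA),
                  h = factor(c("p", "q", "p")),
                  stringsAsFactors = FALSE)
out <- na_as_level_sketch(toy)
levels(out$f)  # now includes NA as a level
levels(out$g)  # character column coerced to factor, NA kept as a level
levels(out$h)  # untouched, since h had no NA
```

Checking anyNA(x) before calling addNA(x) is what keeps h free of an unused <NA> level.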

The <NA> factor level will not be dropped by droplevels or na.omit, and it is valid for building a model matrix. Check the following examples.
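
As a quick check of this behaviour, the base R sketch below builds a small factor with an explicit NA level and passes it through droplevels, na.omit and model.matrix. The object names are made up for illustration.

```r
# A factor whose missing value has been turned into a real level.
x <- addNA(factor(c("a", "b", NA, "a")))

# The NA level is in use, so droplevels() keeps it.
kept <- levels(droplevels(x))

# na.omit() looks at the underlying integer codes, which are not missing,
# so no row is removed.
d <- data.frame(x = x)
n_after <- nrow(na.omit(d))

# The factor can be expanded into a model matrix:
# one intercept column plus one dummy column per non-reference level.
mm <- model.matrix(~ x, data = d)
dim(mm)
```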

Once you add NA as a level in a factor / character, your dataset might suddenly have more complete cases. Then you can run your model. If you still get a "contrasts error", use debug_contr_error2 to see what has happened.
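
The gain in complete cases can be seen with complete.cases from base R. The small data frame here is invented for illustration.

```r
# A character column in which half of the values are missing.
dat_cc <- data.frame(y = 1:4,
                     g = c("a", NA, "b", NA),
                     stringsAsFactors = FALSE)

# Rows with an NA in g are incomplete.
before <- sum(complete.cases(dat_cc))

# After coercion with exclude = NULL, NA is an ordinary level,
# so those rows count as complete.
dat_cc$g <- factor(dat_cc$g, exclude = NULL)
after <- sum(complete.cases(dat_cc))

c(before = before, after = after)
```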

Credit for this function goes to this amazing Stack Overflow post.

Usage

na_preprocessing(dat)

Arguments

dat

The full data set

Value

A data frame with NA added as a level for factor and character variables.

Examples

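# y is an integer; x (factor) and z (character) each contain one NA
dat <- data.frame(y = 1:5,
                  x = factor(c(letters[1:4], NA)),
                  z = c(letters[1:4], NA))
dat
na_preprocessing(dat)

# na.omit() drops the row that has NA in x and z
na.omit(dat)
# once NA is a factor level, that row is no longer treated as missing
na.omit(na_preprocessing(dat))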

emilelatour/lamisc documentation built on April 9, 2024, 10:33 a.m.