fill_NA: 'fill_NA' function for imputation purposes.

View source: R/fill_NA.R

fill_NA    R Documentation

fill_NA function for imputation purposes.

Description

Regular imputations to fill in missing data. Non-missing independent variables are used to approximate the missing observations of a dependent variable. The quantitative models are built with the Rcpp packages and the C++ library Armadillo.

Usage

fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)

## S3 method for class 'data.frame'
fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)

## S3 method for class 'data.table'
fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)

## S3 method for class 'matrix'
fill_NA(x, model, posit_y, posit_x, w = NULL, logreg = FALSE, ridge = 1e-06)

Arguments

x

a numeric matrix or a data.frame/data.table (factor/character/numeric/logical columns) - the variables

model

a character - possible options ("lda","lm_pred","lm_bayes","lm_noise")

posit_y

an integer/character - a position/name of dependent variable

posit_x

an integer/character vector - positions/names of independent variables

w

a numeric vector - a weighting variable - only positive values, Default: NULL

logreg

a boolean - set to TRUE if the dependent variable (numeric) has a log-normal distribution. If TRUE, a log-regression is evaluated and the exponential of the results is returned. Default: FALSE

ridge

a numeric - a value added to diagonal elements of the x'x matrix, Default: 1e-6
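
The ridge and logreg arguments can be illustrated with a short base R sketch. This is not the miceFast implementation (which is written in C++ with Armadillo). The helper name ridge_impute and the simulated data are hypothetical. The sketch assumes that ridge is added to the diagonal of x'x before solving the normal equations, and that logreg fits the model on log(y) and exponentiates the predictions, as the argument descriptions suggest.

```r
# Hypothetical helper for illustration only; not part of miceFast.
ridge_impute <- function(X, y, ridge = 1e-6, logreg = FALSE) {
  obs <- !is.na(y)
  # With logreg = TRUE the model is fitted on the log scale.
  y_fit <- if (logreg) log(y[obs]) else y[obs]
  X_obs <- X[obs, , drop = FALSE]
  # Add `ridge` to the diagonal elements of x'x to stabilise the solve.
  xtx <- crossprod(X_obs) + diag(ridge, ncol(X))
  beta <- solve(xtx, crossprod(X_obs, y_fit))
  # Predict only the rows where the dependent variable is missing.
  pred <- drop(X[!obs, , drop = FALSE] %*% beta)
  y[!obs] <- if (logreg) exp(pred) else pred
  y
}

set.seed(123)
n <- 100
# The intercept is added by hand as a column of ones.
X <- cbind(Intercept = 1, x1 = rnorm(n), x2 = rnorm(n))
# Simulated log-normal dependent variable with 10 missing values.
y <- exp(drop(X %*% c(0.5, 0.3, -0.2)) + rnorm(n, sd = 0.1))
y[sample(n, 10)] <- NA

y_imp <- ridge_impute(X, y, ridge = 1e-6, logreg = TRUE)
```

Because the predictions are exponentiated, the imputed values in this sketch are always positive, which is the reason to prefer a log-regression for a log-normal variable.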

Value

a vector with the imputations in a numeric/logical/character/factor format (matching the input type)

Methods (by class)

  • fill_NA(data.frame): S3 method for data.frame

  • fill_NA(data.table): S3 method for data.table

  • fill_NA(matrix): S3 method for matrix

Note

It is assumed that users add the intercept themselves. The miceFast module provides the most efficient environment; the second recommended option is to use data.table with the numeric matrix data type. The lda model is evaluated only if there are more than 15 complete observations, and the lm models only if the number of independent variables is smaller than the number of observations.
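
The points in this note can be sketched for the matrix method. The simulated data and object names below are illustrative assumptions. The call follows the Usage signature, with the intercept added manually as a column of ones. The pre-check is written by hand to mirror the condition in the note and is not taken from the package internals.

```r
library(miceFast)

set.seed(1)
n <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.1)
y[c(3, 10, 25)] <- NA

# fill_NA does not add an intercept, so bind a column of ones explicitly.
mat <- cbind(y = y, Intercept = 1, x1 = x1, x2 = x2)
posit_x <- c(2, 3, 4)

# Hand-written check mirroring the note: the linear models need fewer
# independent variables than complete observations.
n_complete <- sum(stats::complete.cases(mat))
stopifnot(length(posit_x) < n_complete)

# Positions refer to columns of the matrix: 1 is y, 2 to 4 are predictors.
y_imp <- fill_NA(x = mat, model = "lm_pred", posit_y = 1, posit_x = posit_x)
```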

See Also

fill_NA_N, VIF, vignette("miceFast-intro", package = "miceFast")

Examples

library(miceFast)
library(dplyr)
library(data.table)

data(air_miss)

# dplyr: continuous variable with Bayesian linear model
air_miss %>%
  mutate(Ozone_imp = fill_NA(
    x = ., model = "lm_bayes",
    posit_y = "Ozone", posit_x = c("Solar.R", "Wind", "Temp")
  ))

# dplyr: categorical variable with LDA
air_miss %>%
  mutate(x_char_imp = fill_NA(
    x = ., model = "lda",
    posit_y = "x_character", posit_x = c("Wind", "Temp")
  ))

# dplyr: grouped imputation with weights
air_miss %>%
  group_by(groups) %>%
  do(mutate(., Solar_R_imp = fill_NA(
    x = ., model = "lm_pred",
    posit_y = "Solar.R",
    posit_x = c("Wind", "Temp", "Intercept"),
    w = .[["weights"]]
  ))) %>%
  ungroup()

# data.table
data(air_miss)
setDT(air_miss)
air_miss[, Ozone_imp := fill_NA(
  x = .SD, model = "lm_bayes",
  posit_y = "Ozone", posit_x = c("Solar.R", "Wind", "Temp")
)]

# data.table: grouped
air_miss[, Solar_R_imp := fill_NA(
  x = .SD, model = "lm_pred",
  posit_y = "Solar.R",
  posit_x = c("Wind", "Temp", "Intercept"),
  w = .SD[["weights"]]
), by = .(groups)]

# See the vignette for full examples:
# vignette("miceFast-intro", package = "miceFast")


miceFast documentation built on Feb. 26, 2026, 5:06 p.m.