| fill_NA_N | R Documentation |
fill_NA_N function for multiple imputation
Multiple imputation to fill in missing data. Non-missing independent variables are used to approximate missing observations of a dependent variable. The quantitative models are built with Rcpp and the C++ library Armadillo.
fill_NA_N(
x,
model,
posit_y,
posit_x,
w = NULL,
logreg = FALSE,
k = 10,
ridge = 1e-06
)
## S3 method for class 'data.frame'
fill_NA_N(
x,
model,
posit_y,
posit_x,
w = NULL,
logreg = FALSE,
k = 10,
ridge = 1e-06
)
## S3 method for class 'data.table'
fill_NA_N(
x,
model,
posit_y,
posit_x,
w = NULL,
logreg = FALSE,
k = 10,
ridge = 1e-06
)
## S3 method for class 'matrix'
fill_NA_N(
x,
model,
posit_y,
posit_x,
w = NULL,
logreg = FALSE,
k = 10,
ridge = 1e-06
)
| x | a numeric matrix or data.frame/data.table (factor/character/numeric/logical) - variables |
| model | a character - possible options ("lm_bayes","lm_noise","pmm") |
| posit_y | an integer/character - the position/name of the dependent variable |
| posit_x | an integer/character vector - positions/names of the independent variables |
| w | a numeric vector - a weighting variable - only positive values, Default: NULL |
| logreg | a boolean - set to TRUE if the dependent variable (numeric) has a log-normal distribution. If TRUE, a log-regression is evaluated and the exponential of the results is returned, Default: FALSE |
| k | an integer - the number of multiple imputations or, for pmm, the number of closest points from which one random value is taken, Default: 10 |
| ridge | a numeric - a value added to the diagonal elements of the x'x matrix, Default: 1e-6 |
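To make the ridge argument more concrete, the following base R sketch solves a least-squares problem with a small constant added to the diagonal of x'x. It only illustrates the idea and is not the package's C++ implementation; the helper name ridge_coef and the simulated data are hypothetical.

```r
# Hypothetical helper: least-squares coefficients with a ridge term
# added to the diagonal of x'x (illustration only, not miceFast internals).
ridge_coef <- function(x, y, ridge = 1e-6) {
  xtx <- crossprod(x)                  # x'x
  diag(xtx) <- diag(xtx) + ridge       # add the ridge value to the diagonal
  drop(solve(xtx, crossprod(x, y)))    # solve (x'x + ridge * I) b = x'y
}

set.seed(1)
n <- 50
x <- cbind(Intercept = 1, a = rnorm(n), b = rnorm(n))
y <- drop(x %*% c(2, 1, -1)) + rnorm(n, sd = 0.1)

b_ridge <- ridge_coef(x, y)
b_ols <- unname(coef(lm(y ~ x - 1)))   # ordinary least squares for comparison
```

With a ridge value as small as the default, the coefficients are practically identical to ordinary least squares; the term mainly helps when x'x is close to singular.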
Returns the imputations as a numeric/character/factor vector (matching the input type).
fill_NA_N(data.frame): S3 method for data.frame
fill_NA_N(data.table): S3 method for data.table
fill_NA_N(matrix): S3 method for matrix
It is assumed that users add the intercept column themselves.
The miceFast module provides the most efficient environment; the second recommended option is data.table with a numeric matrix.
Only "lm_bayes", "lm_noise", and "pmm" models are supported.
The model is fitted only when the number of complete observations exceeds the number of independent variables.
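The notes on the intercept and on the number of complete observations can be checked before calling fill_NA_N. The sketch below uses only base R and the built-in airquality data instead of air_miss. It assumes that a "complete observation" is a row in which the dependent variable and all independent variables are observed; the variable names n_complete and can_fit are hypothetical.

```r
# Base R sketch using the built-in airquality data (not air_miss).
dat <- airquality
dat$Intercept <- 1                     # add the intercept column by hand

posit_y <- "Ozone"
posit_x <- c("Solar.R", "Wind", "Temp", "Intercept")

# Rows where the dependent and all independent variables are observed
# (assumed meaning of "complete observations")
n_complete <- sum(complete.cases(dat[, c(posit_y, posit_x)]))

# Condition from the notes: more complete observations than predictors
can_fit <- n_complete > length(posit_x)
```

If can_fit is FALSE for a given data set or group, the condition in the notes is not met and no model would be fitted for it.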
See also: fill_NA, VIF, vignette("miceFast-intro", package = "miceFast")
library(miceFast)
library(dplyr)
library(data.table)
data(air_miss)
# dplyr: PMM, one value drawn from the 20 closest points
air_miss %>%
  mutate(Ozone_pmm = fill_NA_N(
    x = ., model = "pmm",
    posit_y = "Ozone", posit_x = c("Solar.R", "Wind", "Temp"),
    k = 20
  ))
# dplyr: lm_noise with weights, log-regression and 30 imputations
air_miss %>%
  mutate(Ozone_imp = fill_NA_N(
    x = ., model = "lm_noise",
    posit_y = "Ozone",
    posit_x = c("Solar.R", "Wind", "Temp"),
    w = .[["weights"]],
    logreg = TRUE, k = 30
  ))
# data.table: PMM grouped
data(air_miss)
setDT(air_miss)
air_miss[, Ozone_pmm := fill_NA_N(
  x = .SD, model = "pmm",
  posit_y = "Ozone",
  posit_x = c("Wind", "Temp", "Intercept"),
  k = 20
), by = .(groups)]
# See the vignette for full examples:
# vignette("miceFast-intro", package = "miceFast")
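For readers who want to see what "a number of closest points from which one random value is taken" means for the pmm model, here is a simplified predictive mean matching sketch in base R. It runs on the built-in airquality data and does not call miceFast; the package's C++ routine may differ in detail, and all variable names here are hypothetical.

```r
# Simplified predictive mean matching on the built-in airquality data.
# Illustration only; miceFast's own pmm routine may differ in detail.
set.seed(123)
dat <- airquality[complete.cases(airquality[, c("Solar.R", "Wind", "Temp")]), ]
k <- 20

fit <- lm(Ozone ~ Solar.R + Wind + Temp, data = dat)  # fitted on rows with observed Ozone
pred <- predict(fit, newdata = dat)                   # predictions for every row

obs <- which(!is.na(dat$Ozone))
mis <- which(is.na(dat$Ozone))

imputed <- dat$Ozone
for (i in mis) {
  # the k observed rows whose predictions are closest to the prediction for row i
  donors <- obs[order(abs(pred[obs] - pred[i]))[seq_len(k)]]
  # take the observed value of one randomly chosen donor
  imputed[i] <- dat$Ozone[donors[sample.int(k, 1)]]
}
```

Because every imputed value is copied from an observed donor, the imputations always stay within the range of the observed data.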