mice.impute.rfnode    R Documentation

Univariate sampler function for mixed types of variables for node-based imputation, using predicting nodes of random forests

Description

Please note that functions with names starting with "mice.impute" are exported so that they are visible to the mice sampler functions. Please do not call these functions directly unless you know exactly what you are doing.

RfNode imputation methods, adapter for mice samplers. These functions can be called by the mice sampler functions.

mice.impute.rfnode.cond is for imputation using the conditional distribution formed by the predicting nodes of random forests. To use this function, set method = "rfnode.cond" in the call to mice.

mice.impute.rfnode.prox is for imputation based on proximity measures from random forests, and provides functionality similar to mice.impute.rf. To use this function, set method = "rfnode.prox" in the call to mice.

mice.impute.rfnode is the main function for performing imputation, and both mice.impute.rfnode.cond and mice.impute.rfnode.prox call this function. By default, mice.impute.rfnode works like mice.impute.rfnode.cond.

Usage

mice.impute.rfnode(
  y,
  ry,
  x,
  wy = NULL,
  num.trees.node = 10,
  pre.boot = TRUE,
  use.node.cond.dist = TRUE,
  obs.eq.prob = FALSE,
  do.sample = TRUE,
  num.threads = NULL,
  ...
)

mice.impute.rfnode.cond(
  y,
  ry,
  x,
  wy = NULL,
  num.trees = 10,
  pre.boot = TRUE,
  obs.eq.prob = FALSE,
  ...
)

mice.impute.rfnode.prox(
  y,
  ry,
  x,
  wy = NULL,
  num.trees = 10,
  pre.boot = TRUE,
  obs.eq.prob = FALSE,
  ...
)

Arguments

y

Vector to be imputed.

ry

Logical vector of length length(y) indicating the subset y[ry] of elements in y to which the imputation model is fitted. ry generally distinguishes the observed (TRUE) and missing (FALSE) values in y.

x

Numeric design matrix with length(y) rows, containing the predictors for y. Matrix x must not contain missing values.

wy

Logical vector of length length(y). A TRUE value indicates locations in y for which imputations are created.

num.trees.node

Number of trees to build; defaults to 10. For function mice.impute.rfnode only.

pre.boot

If TRUE, perform a bootstrap prior to imputation to obtain 'proper' imputations, i.e. imputations that accommodate sampling variation in estimating population regression parameters (see Shah et al. 2014).

use.node.cond.dist

If TRUE, use the conditional distribution formed by the predicting nodes of the random forest (out-of-bag observations are excluded); if FALSE, use proximity-based imputation.

obs.eq.prob

If TRUE, the candidate observations will be sampled with equal probability.

do.sample

If TRUE, draw samples for missing observations. If FALSE, the corresponding observation numbers are returned instead; this is for testing purposes only and WILL CAUSE ERRORS in the mice sampler function.

num.threads

Number of threads for parallel computing. The default, num.threads = NULL, allows all available processors to be used.

...

Other arguments to pass down.

num.trees

Number of trees to build; defaults to 10. For functions mice.impute.rfnode.cond and mice.impute.rfnode.prox.
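
To make the roles of y, ry, wy and x concrete, the following base R sketch builds these arguments by hand for a small invented data set and mimics the idea behind pre.boot with a plain bootstrap of the observed rows. The data and the names toy, obs.idx, boot.idx, y.fit and x.fit are made up for illustration, and the sampler's actual internals may differ; in normal use mice constructs these arguments itself.

```r
set.seed(1)

# Toy data (invented): y has missing values, the predictors are complete
toy <- data.frame(
  y  = c(2.1, NA, 3.4, 1.8, NA, 2.9),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(0, 1, 0, 1, 0, 1)
)

y  <- toy$y
ry <- !is.na(y)                        # TRUE where y is observed
wy <- !ry                              # TRUE where an imputation is wanted
x  <- as.matrix(toy[, c("x1", "x2")])  # numeric design matrix, no missing values

# Conceptual analogue of pre.boot = TRUE (an assumption, not the package code):
# resample the observed rows with replacement before fitting the model
obs.idx  <- which(ry)
boot.idx <- sample(obs.idx, size = length(obs.idx), replace = TRUE)
y.fit <- y[boot.idx]
x.fit <- x[boot.idx, , drop = FALSE]

# A sampler is expected to return one value per TRUE entry in wy
n.imputations <- sum(wy)
```

Here the imputation model would be fitted on y.fit and x.fit, and the returned vector would have length n.imputations.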

Details

Advanced users can get more flexibility from the mice.impute.rfnode function, as it provides more options than mice.impute.rfnode.cond or mice.impute.rfnode.prox.
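
As a sketch of that extra flexibility, the call below selects the main function through method = "rfnode" and supplies its options in the call to mice. It assumes that mice forwards unmatched arguments to the univariate sampler through ..., and that the argument names are those listed under Usage; the values chosen here are arbitrary.

```r
library(mice)      # mice() and the nhanes data
library(RfEmpImp)  # conv.factor() and the rfnode samplers

# Convert categorical variables to factors, as in the package examples
nhanes.fix <- conv.factor(nhanes, c("age", "hyp"))

# method = "rfnode" dispatches to mice.impute.rfnode; the extra arguments
# are assumed to be passed down to it via `...`
impRfNode <- mice(
  nhanes.fix,
  method = "rfnode",
  m = 5, maxit = 5,
  num.trees.node = 20,         # more trees than the default of 10
  use.node.cond.dist = FALSE,  # proximity-based imputation
  obs.eq.prob = TRUE,          # equal sampling probability for candidates
  printFlag = FALSE
)
```

With use.node.cond.dist = FALSE this configuration is expected to behave much like method = "rfnode.prox", while still exposing the remaining options of the main function.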

Value

Vector with imputed data, same type as y, and of length sum(wy).

Author(s)

Shangzhi Hong

References

Hong, Shangzhi, et al. "Multiple imputation using chained random forests." Preprint, submitted April 30, 2020. https://arxiv.org/abs/2004.14823.

Doove, Lisa L., Stef Van Buuren, and Elise Dusseldorp. "Recursive partitioning for missing data imputation in the presence of interaction effects." Computational Statistics & Data Analysis 72 (2014): 92-104.

Examples

library(mice)      # provides mice() and the nhanes data
library(RfEmpImp)  # provides conv.factor() and the rfnode samplers

# Prepare data: convert categorical variables to factors
nhanes.fix <- conv.factor(nhanes, c("age", "hyp"))

# Using "rfnode.cond" or "rfnode"
impRfNodeCond <- mice(nhanes.fix, method = "rfnode.cond", m = 5,
                      maxit = 5, maxcor = 1.0, eps = 0, printFlag = FALSE)

# Using "rfnode.prox"
impRfNodeProx <- mice(nhanes.fix, method = "rfnode.prox", m = 5,
                      maxit = 5, maxcor = 1.0, eps = 0,
                      remove.collinear = FALSE, remove.constant = FALSE,
                      printFlag = FALSE)
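
Once imputations have been generated, they can be inspected and analysed with the usual mice tools. The following is a minimal, self-contained sketch using complete(), with() and pool() from mice; the analysis model chl ~ bmi + age is an arbitrary choice for illustration and is not part of the package documentation.

```r
library(mice)
library(RfEmpImp)

# Same data preparation as in the examples
nhanes.fix <- conv.factor(nhanes, c("age", "hyp"))
imp <- mice(nhanes.fix, method = "rfnode.cond", m = 5, maxit = 5,
            printFlag = FALSE)

# Extract the first completed data set
completed1 <- complete(imp, 1)

# Fit the same linear model in each imputed data set,
# then combine the estimates with Rubin's rules
fit <- with(imp, lm(chl ~ bmi + age))
pooled <- pool(fit)
summary(pooled)
```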


RfEmpImp documentation built on Oct. 20, 2022, 9:06 a.m.