gbmImpute: GBM Imputation


View source: R/gbmImpute.R

Description

Imputation using boosted trees. Each column is filled in by treating it as a regression problem: for each column i, boosted regression trees predict i from all other columns. If the predictor variables also contain missing data, the gbm function itself uses surrogate variables as substitutes for those predictors. This imputation function handles both categorical and numeric data.
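The column-by-column loop described above can be sketched in base R. This is an illustration only, not the package's source: `impute_columnwise` is a hypothetical name, and `lm()` is swapped in for gbm so that the sketch runs without extra packages. Because `lm()` cannot use rows with missing predictors, the sketch first fills missing cells with column means, a step gbm would not need. It also handles numeric columns only.

```r
# Sketch only: lm() stands in for gbm, and impute_columnwise is a
# hypothetical helper, not part of the imputation package.
impute_columnwise <- function(x, max.iters = 2) {
  x <- as.data.frame(x)
  missing <- is.na(x)                 # remember which cells were NA

  # lm() drops rows with NA predictors, so start from column means
  for (j in seq_along(x)) {
    col <- x[[j]]
    col[missing[, j]] <- mean(col, na.rm = TRUE)
    x[[j]] <- col
  }

  for (iter in seq_len(max.iters)) {
    for (j in seq_along(x)) {
      rows <- missing[, j]
      if (!any(rows)) next            # nothing to impute in this column
      # regress column j on all other columns, using rows where j was observed
      f <- as.formula(paste(names(x)[j], "~ ."))
      fit <- lm(f, data = x[!rows, , drop = FALSE])
      # overwrite the originally missing cells with fitted values
      x[[j]][rows] <- predict(fit, newdata = x[rows, , drop = FALSE])
    }
  }
  x
}

set.seed(1)
x <- matrix(rnorm(1000), 100, 10)
x[x > 2] <- NA
filled <- impute_columnwise(x)
```

Observed cells are never modified; only cells that were NA in the input are rewritten on each pass, so later passes refine earlier guesses using the updated values of the other columns.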

Usage

  gbmImpute(x, max.iters = 2, cv.fold = 2, n.trees = 100,
    verbose = T, ...)

Arguments

x

a data frame or matrix where each row is a different record

max.iters

number of times to iterate through the columns and impute each column with fitted values from a regression tree

cv.fold

number of folds that gbm should use internally for cross validation

n.trees

the number of trees used in gradient boosting machines

verbose

if TRUE print status updates

...

additional parameters passed to gbm
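To make the roles of n.trees and cv.fold concrete, the following sketch fits a single column directly with the gbm package. It assumes gbm is installed, and it assumes that the cross-validated best iteration is what gets used for prediction; neither the variable names nor that choice are confirmed by this page, so treat it as one plausible reading rather than the function's actual internals.

```r
# Sketch only; assumes the gbm package is installed.
# Not taken from gbmImpute's source.
library(gbm)

set.seed(1)
m <- matrix(rnorm(5000), 500, 10)
m[m > 2] <- NA
dat <- as.data.frame(m)              # columns V1 ... V10

target <- "V1"
observed <- !is.na(dat[[target]])    # gbm needs a non-missing response

fit <- gbm(V1 ~ ., data = dat[observed, ],
           distribution = "gaussian",
           n.trees = 100,            # corresponds to n.trees above
           cv.folds = 2,             # corresponds to cv.fold above
           verbose = FALSE)

# number of trees with the lowest cross-validated error (assumed choice)
best <- gbm.perf(fit, method = "cv", plot.it = FALSE)

# predict the missing entries of V1 from the other columns
pred <- predict(fit, newdata = dat[!observed, ], n.trees = best)
dat[[target]][!observed] <- pred
```

Under this reading, n.trees is an upper bound on model size and cv.fold controls how reliably the stopping point within that bound is chosen; a larger cv.fold costs proportionally more fitting time per column.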

Examples

x <- matrix(rnorm(10000), 1000, 10)  # 1000 records, 10 numeric columns
x.missing <- x > 2                   # flag every value above 2
x[x.missing] <- NA                   # and replace it with NA
gbmImpute(x)

jeffwong/imputation documentation built on May 19, 2019, 4:02 a.m.