impute_xgboost: Fast imputation of missing values by extreme gradient boosting
In yatzy/xgbimpute: Imputation Using XGBoost


Uses the "xgboost" package to do fast missing value imputation by extreme gradient boosting. Between the iterative model fits, it offers the option of predictive mean matching. First, this avoids imputing values that are not present in the original `mat` (such as the value 0.3334 in a 0-1 coded variable). Second, predictive mean matching tries to raise the variance of the resulting conditional distributions to a realistic level and, as such, allows multiple imputation by repeating the call to impute_xgboost(). The iterative chaining stops as soon as max_iterations is reached or the average out-of-bag estimate of performance stops improving. In the latter case, except in the first iteration, the second-to-last (i.e. best) imputed matrix is returned.
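The predictive mean matching step can be sketched in base R as follows. This is a minimal illustration of the general technique, not the package's own code: `pmm_sketch` and all of its argument names are hypothetical, and the predictions are made-up numbers standing in for model output. For each missing cell, the k observed rows with the closest predictions are found and the observed value of one randomly chosen donor is used, so only values already present in the data can be imputed.

```r
# Predictive mean matching for a single variable (illustrative, base R only).
# pred_obs: model predictions for rows where the value is observed
# y_obs:    the observed values for those rows
# pred_mis: model predictions for rows where the value is missing
pmm_sketch <- function(pred_obs, y_obs, pred_mis, k = 3) {
  k <- min(k, length(y_obs))
  vapply(pred_mis, function(p) {
    # Indices of the k observed rows whose predictions are closest to p
    nearest <- order(abs(pred_obs - p))[seq_len(k)]
    # Draw one donor at random and take its observed value
    y_obs[nearest[sample.int(k, 1)]]
  }, FUN.VALUE = numeric(1))
}

set.seed(42)
y_obs    <- c(0, 0, 1, 1, 0, 1)                    # a 0-1 coded variable
pred_obs <- c(0.10, 0.20, 0.80, 0.90, 0.35, 0.60)  # hypothetical predictions
pred_mis <- c(0.3334, 0.75)                        # raw predictions for missing cells
imputed  <- pmm_sketch(pred_obs, y_obs, pred_mis, k = 3)
```

Whatever donors are drawn, every element of `imputed` is either 0 or 1, never a raw prediction such as 0.3334.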


impute_xgboost(mat, max_iterations = 10L, seed = NULL, verbose = 1,
  pmm_k = 0, nrounds = 40, eta = 0.4, max_depth = 6,
  objective = "reg:linear", eval_metric = "rmse", ...)

`mat`	A `matrix` with missing values to impute.
`max_iterations`	Maximum number of chaining iterations.
`seed`	Integer seed to initialize the random generator.
`verbose`	Controls how much information is printed to the screen: 0 prints nothing, 1 (default) prints a "." per iteration and the standardized prediction error, 2 prints model convergences.
`pmm_k`	Number of candidate non-missing values to sample from in the predictive mean matching step. 0 to avoid this step.
`nrounds`	Maximum number of boosting iterations.
`eta`	Controls the learning rate: scales the contribution of each tree by a factor 0 < eta < 1 when it is added to the current approximation. Used to prevent overfitting by making the boosting process more conservative. A lower eta implies a larger nrounds: a low eta makes the model more robust to overfitting but slower to compute. Default: 0.4
`max_depth`	Maximum depth of a tree. Default: 6
`objective`	Specifies the learning task and the corresponding learning objective. Default: 'reg:linear'
`eval_metric`	Evaluation metric for validation data. Default: 'rmse'
`...`	Arguments passed to `xgboost`.
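To illustrate the trade-off between `eta` and `nrounds` described above, here is a minimal base-R sketch of boosting with shrinkage. It does not call xgboost: it fits one-split regression stumps to the residuals of a single simulated feature, and `fit_stump`, `predict_stump`, and `boost_rmse` are hypothetical helpers written only for this example.

```r
# Fit a one-split regression stump to residuals r on a single feature x.
fit_stump <- function(x, r) {
  xs <- sort(unique(x))
  cuts <- (head(xs, -1) + tail(xs, -1)) / 2   # midpoints between neighbours
  sse <- vapply(cuts, function(cut) {
    left <- r[x <= cut]
    right <- r[x > cut]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  }, FUN.VALUE = numeric(1))
  best <- cuts[which.min(sse)]
  list(cut = best, left = mean(r[x <= best]), right = mean(r[x > best]))
}

predict_stump <- function(stump, x) {
  ifelse(x <= stump$cut, stump$left, stump$right)
}

# Boost stumps for nrounds rounds, shrinking each contribution by eta,
# and return the training RMSE.
boost_rmse <- function(x, y, nrounds, eta) {
  pred <- rep(mean(y), length(y))
  for (i in seq_len(nrounds)) {
    stump <- fit_stump(x, y - pred)
    pred <- pred + eta * predict_stump(stump, x)
  }
  sqrt(mean((y - pred)^2))
}

set.seed(1)
x <- runif(200)
y <- sin(2 * pi * x) + rnorm(200, sd = 0.1)

rmse_eta_high <- boost_rmse(x, y, nrounds = 40, eta = 0.4)
rmse_eta_low  <- boost_rmse(x, y, nrounds = 40, eta = 0.05)
```

With the same number of rounds, the smaller `eta` has moved less far toward the training data, which is why a lower `eta` usually has to be paired with a larger `nrounds`.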

An imputed matrix.
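Which matrix is returned follows the stopping rule given in the description. The sketch below shows that rule in isolation, under stated assumptions: `chain_sketch` is a hypothetical function, `errors` stands in for the average performance estimate after each iteration, and `states` stands in for the imputed matrix after each iteration. It is not the package's implementation.

```r
# Walk through the iterations and stop once the error no longer improves.
# errors: hypothetical average error estimate per iteration
# states: hypothetical imputed result after each iteration
chain_sketch <- function(errors, states, max_iterations = 10L) {
  best_error <- Inf
  previous <- NULL
  n <- min(length(errors), max_iterations)
  for (i in seq_len(n)) {
    # From the second iteration on, stop when the error fails to improve
    # and hand back the result of the previous (best) iteration.
    if (i > 1 && errors[i] >= best_error) {
      return(list(result = previous, stopped_at = i))
    }
    best_error <- errors[i]
    previous <- states[[i]]
  }
  # max_iterations reached without a worsening step: return the last result
  list(result = previous, stopped_at = n)
}

errors <- c(0.90, 0.60, 0.50, 0.55, 0.40)
states <- list("iter1", "iter2", "iter3", "iter4", "iter5")

early  <- chain_sketch(errors, states)                       # stops at iteration 4
capped <- chain_sketch(errors, states, max_iterations = 2L)  # hits the cap
```

Here `early$result` is the state from iteration 3, the last one before the error got worse, while `capped$result` is simply the state at the iteration cap.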


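library(xgbimpute)
# Use the four numeric iris columns, introduce missing values, then impute them
mat <- as.matrix(iris[, 1:4])
mis_mat <- generate_na(mat, 0.3)
imp_mat <- impute_xgboost(mis_mat)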

yatzy/xgbimpute documentation built on June 7, 2019, 8:16 p.m.
