Uses the "xgboost" package to do fast missing value imputation by extreme gradient boosting.
Between the iterative model fits, it offers the option of predictive mean matching. First, this avoids imputing
values that are not present in the original mat (such as 0.3334 in a 0-1 coded variable). Second, predictive mean
matching tries to raise the variance in the resulting conditional distributions to a realistic level, which
makes multiple imputation possible by repeating the call to impute_xgboost(). The iterative chaining stops as soon as max_iterations
is reached or the average out-of-bag estimate of performance stops improving. In the latter case, except for the first iteration,
the second-to-last (i.e. best) imputed matrix is returned.
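The predictive mean matching step described above can be illustrated with a small base R sketch. This is not the package's internal code: the function name pmm_sketch and its arguments are hypothetical, and it assumes that donors are chosen as the k observed rows whose predictions lie closest to the prediction for the missing row.

```r
# Predictive mean matching, illustrative sketch only (hypothetical helper).
# pred_mis: model predictions for rows where the value is missing
# pred_obs: model predictions for rows where the value is observed
# y_obs:    the observed values themselves
# k:        number of nearest candidates to sample from
pmm_sketch <- function(pred_mis, pred_obs, y_obs, k = 3L) {
  stopifnot(length(pred_obs) == length(y_obs), k >= 1L)
  k <- min(k, length(y_obs))
  vapply(pred_mis, function(p) {
    # indices of the k observed rows whose predictions are closest to p
    nearest <- order(abs(pred_obs - p))[seq_len(k)]
    # draw one donor at random and return its observed value
    y_obs[nearest[sample.int(k, 1L)]]
  }, FUN.VALUE = y_obs[1L])
}

set.seed(1)
y_obs    <- c(0, 0, 1, 1, 1)                 # a 0-1 coded variable
pred_obs <- c(0.10, 0.25, 0.60, 0.80, 0.90)  # fitted values on observed rows
pred_mis <- c(0.3334, 0.75)                  # raw predictions for missing rows
imputed  <- pmm_sketch(pred_mis, pred_obs, y_obs, k = 3L)
```

Because every imputed value is drawn from y_obs, a raw prediction such as 0.3334 is never returned for the 0-1 coded variable, and the random draw among the k donors is what adds variability between repeated runs.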
impute_xgboost(mat, max_iterations = 10L, seed = NULL, verbose = 1,
  pmm_k = 0, nrounds = 40, eta = 0.4, max_depth = 6,
  objective = "reg:linear", eval_metric = "rmse", ...)
mat |
A |
max_iterations |
Maximum number of chaining iterations. |
seed |
Integer seed to initialize the random generator. |
verbose |
Controls how much information is printed to the screen: 0 prints nothing, 1 (default) prints a "." per iteration and the standardized prediction error, 2 prints model convergences. |
pmm_k |
Number of candidate non-missing values to sample from in the predictive mean matching step. 0 to avoid this step. |
nrounds |
Maximum number of boosting iterations. |
eta |
Controls the learning rate: the contribution of each tree is scaled by a factor 0 < eta < 1 when it is added to the current approximation. Used to prevent overfitting by making the boosting process more conservative. A lower value of eta calls for a larger value of nrounds: a low eta makes the model more robust to overfitting but slower to compute. Default: 0.4 (as in the usage above). |
max_depth |
Maximum depth of a tree. Default: 6 |
objective |
Specifies the learning task and the corresponding learning objective. Default: 'reg:linear' |
eval_metric |
Evaluation metric for validation data. Default: 'rmse' |
... |
Arguments passed to |
An imputed matrix.
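Which matrix is returned depends on the stopping rule given in the description. The following base R sketch shows one way such a rule could work. It is an assumption about the control flow, not the package's implementation: chain_sketch, step_fun and toy_step are hypothetical names, and step_fun stands in for one full round of model fitting that returns an updated matrix together with an error estimate (lower is better).

```r
# Chaining loop with early stopping, illustrative sketch only.
# step_fun(m) must return list(mat = <updated matrix>, error = <numeric>).
chain_sketch <- function(mat, step_fun, max_iterations = 10L) {
  best_mat <- mat
  best_err <- Inf
  for (i in seq_len(max_iterations)) {
    res <- step_fun(mat)
    if (i > 1L && res$error >= best_err) {
      # no improvement: return the previous (best) matrix
      return(best_mat)
    }
    best_mat <- res$mat
    best_err <- res$error
    mat <- res$mat
  }
  # max_iterations reached without the error getting worse
  best_mat
}

# Toy step: adds 1 to every cell and reports a scripted error sequence.
errors  <- c(0.5, 0.3, 0.35, 0.2)
counter <- 0L
toy_step <- function(m) {
  counter <<- counter + 1L
  list(mat = m + 1, error = errors[counter])
}
result <- chain_sketch(matrix(0, 2, 2), toy_step, max_iterations = 4L)
```

In this toy run the error rises in the third iteration, so the loop stops there and hands back the matrix from the second iteration rather than the latest one.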
mat = as.matrix(iris[, 1:4])
mis_mat = generate_na(mat, 0.3)
imp_mat = impute_xgboost(mis_mat)