Mixgb.train R Documentation

Multiple imputation through xgboost R6 class imputer object for training set

Description

Set up an xgboost imputer object with specified hyperparameters and then obtain an imputed object including multiple imputed datasets, saved models and parameters.

Methods

Public methods


Method new()

Create a new Mixgb.train object, which sets up the xgboost-based imputer for multiple imputation.

Usage
Mixgb.train$new(
  data,
  nrounds = 50,
  max_depth = 6,
  gamma = 0.1,
  eta = 0.3,
  nthread = 4,
  early_stopping_rounds = 10,
  colsample_bytree = 1,
  min_child_weight = 1,
  subsample = 1,
  pmm.k = 5,
  pmm.type = "auto",
  pmm.link = "logit",
  scale_pos_weight = 1,
  initial.imp = "random",
  tree_method = "auto",
  gpu_id = 0,
  predictor = "auto",
  print_every_n = 10L,
  verbose = 0
)
Arguments
data

A data frame with missing values

nrounds

Maximum number of boosting iterations. Default: 50

max_depth

Maximum depth of a tree. Default: 6

gamma

Default: 0.1

eta

Default: 0.3

nthread

Default: 4

early_stopping_rounds

Default: 10

colsample_bytree

Default: 1

min_child_weight

Default: 1

subsample

Default: 1

pmm.k

Default: 5

pmm.type

Default: "auto" (used to be NULL)

pmm.link

Default: "logit"

scale_pos_weight

Default: 1

initial.imp

Default: "random"

tree_method

Default: "auto" (can be set to "gpu_hist" on Linux)

gpu_id

Device ordinal. Default: 0

predictor

The type of predictor algorithm to use. Default: "auto" (other options: "cpu_predictor", "gpu_predictor")

print_every_n

Default: 10L

verbose

Default: 0

Examples
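# Create an imputer with the default hyperparameters
MIXGB <- Mixgb.train$new(withNA.df)
# Create an imputer with explicit values for nrounds and max_depth
MIXGB <- Mixgb.train$new(withNA.df, nrounds = 50, max_depth = 6)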
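
The examples assume that a data frame named withNA.df already exists. The following is a minimal sketch of one way to build such a data frame from synthetic data in base R. The column names x1, x2 and g and the missingness pattern are illustrative assumptions, not part of the package. The call to Mixgb.train$new() is guarded and runs only if the misle package is installed and exports the class as documented here.

```r
# Build a small synthetic data frame and introduce missing values at random.
# Column names (x1, x2, g) are hypothetical and chosen only for illustration.
set.seed(2023)
n <- 200
complete.df <- data.frame(
  x1 = rnorm(n),
  x2 = runif(n),
  g  = factor(sample(c("a", "b"), n, replace = TRUE))
)

withNA.df <- complete.df
withNA.df$x1[sample(n, 20)] <- NA  # 20 distinct rows set to NA in x1
withNA.df$g[sample(n, 15)]  <- NA  # 15 distinct rows set to NA in g

# Count the missing values in each column
na.counts <- colSums(is.na(withNA.df))
print(na.counts)

# Assumption: misle is installed and exports Mixgb.train as documented.
if (requireNamespace("misle", quietly = TRUE)) {
  MIXGB <- misle::Mixgb.train$new(withNA.df, nrounds = 50, max_depth = 6)
}
```

Checking the per-column counts before constructing the imputer is a quick way to confirm that the input really contains missing values in the expected variables.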

Method impute()

Use the imputer to impute missing values and obtain multiple imputed datasets, saved training models and some parameters needed for future use.

Usage
Mixgb.train$impute(m = 5, save.vars = NULL)
Arguments
m

The number of imputed datasets. Default: 5

save.vars

The names or indices of the variables for which imputation models should be saved. Default: NULL, in which case imputation models for all variables are saved for imputing future data. If future data are known to have missing values only in certain variables, users can choose to save models for those variables alone.

Examples
# Set up the imputer, then generate 5 imputed datasets
MIXGB <- Mixgb.train$new(withNA.df)
mixgb.obj <- MIXGB$impute(m = 5)
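
The save.vars argument can be illustrated with a short sketch. It assumes that future data will have missing values in the same variables as the training data, so models are saved only for the columns that currently contain NA. The data frame and its column names (age, bmi, chl) are hypothetical. The calls into misle are guarded and run only if the package is installed.

```r
# Synthetic training data; column names are hypothetical.
set.seed(1)
n <- 100
withNA.df <- data.frame(
  age = round(runif(n, 20, 70)),
  bmi = rnorm(n, mean = 26, sd = 4),
  chl = rnorm(n, mean = 190, sd = 30)
)
withNA.df$bmi[sample(n, 10)] <- NA
withNA.df$chl[sample(n, 12)] <- NA

# Names of the variables that contain at least one missing value.
# Assumption: future data will be incomplete in these variables only.
save.vars <- names(withNA.df)[colSums(is.na(withNA.df)) > 0]
print(save.vars)

# Assumption: misle is installed and exports Mixgb.train as documented.
if (requireNamespace("misle", quietly = TRUE)) {
  MIXGB <- misle::Mixgb.train$new(withNA.df)
  mixgb.obj <- MIXGB$impute(m = 5, save.vars = save.vars)
}
```

Leaving save.vars = NULL remains the safer choice when it is not known which variables will be incomplete in future data, since models are then saved for all variables.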

Examples


## ------------------------------------------------
## Method `Mixgb.train$new`
## ------------------------------------------------

MIXGB <- Mixgb.train$new(withNA.df)
MIXGB <- Mixgb.train$new(withNA.df, nrounds = 50, max_depth = 6)

## ------------------------------------------------
## Method `Mixgb.train$impute`
## ------------------------------------------------

MIXGB <- Mixgb.train$new(withNA.df)
mixgb.obj <- MIXGB$impute(m = 5)

agnesdeng/misle documentation built on Sept. 22, 2023, 8:48 p.m.