View source: R/pmml.xgb.Booster.R
pmml.xgb.Booster | R Documentation |
Generate PMML for an xgb.Booster object from the package xgboost.
## S3 method for class 'xgb.Booster'
pmml(
  model,
  model_name = "xboost_Model",
  app_name = "SoftwareAG PMML Generator",
  description = "Extreme Gradient Boosting Model",
  copyright = NULL,
  model_version = NULL,
  transforms = NULL,
  missing_value_replacement = NULL,
  input_feature_names = NULL,
  output_label_name = NULL,
  output_categories = NULL,
  xgb_dump_file = NULL,
  parent_invalid_value_treatment = "returnInvalid",
  child_invalid_value_treatment = "asIs",
  ...
)
model | An object created by the 'xgboost' function. |
model_name | A name to be given to the PMML model. |
app_name | The name of the application that generated the PMML. |
description | A descriptive text for the Header element of the PMML. |
copyright | The copyright notice for the model. |
model_version | A string specifying the model version. |
transforms | Data transformations. |
missing_value_replacement | Value to be used as the 'missingValueReplacement' attribute for all MiningFields. |
input_feature_names | Input variable names used in training the model. |
output_label_name | Name of the predicted field. |
output_categories | Possible values of the predicted field, for classification models. |
xgb_dump_file | Name of file saved using 'xgb.dump' function. |
parent_invalid_value_treatment | Invalid value treatment at the top MiningField level. |
child_invalid_value_treatment | Invalid value treatment at the model segment MiningField level. |
... | Further arguments passed to or from other methods. |
The xgboost function takes as its input either an xgb.DMatrix object or a numeric matrix. The input field information is not stored in the R model object, so it must be passed in as arguments; this lets the PMML specify field names in its model representation. The R model object does not store information about the fitted tree structure either. However, this information can be extracted using the xgb.model.dt.tree function and the file saved by the xgb.dump function. The xgboost library must therefore be available in the environment, and the saved file must be supplied as an input as well.
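Since the call depends on two things the model object does not carry (the feature names and the dump file), it can help to check them before calling pmml(). The following is a minimal sketch using base R only. `check_pmml_inputs` is a hypothetical helper, not part of the pmml package, and the rule that the number of names should equal the number of training columns is an assumption about what a consistent call looks like.

```r
# Hypothetical pre-flight check; not part of the pmml package.
check_pmml_inputs <- function(train_matrix, input_feature_names, xgb_dump_file) {
  # The exporter is documented to work with numeric matrices.
  stopifnot(is.matrix(train_matrix), is.numeric(train_matrix))
  if (is.null(input_feature_names)) {
    stop("input_feature_names must be supplied")
  }
  # Assumption: one name per training column.
  if (length(input_feature_names) != ncol(train_matrix)) {
    stop("expected ", ncol(train_matrix), " feature names, got ",
         length(input_feature_names))
  }
  if (is.null(xgb_dump_file) || !file.exists(xgb_dump_file)) {
    stop("xgb_dump_file not found; create it with xgb.dump() first")
  }
  invisible(TRUE)
}

# Usage with a stand-in dump file (a real one would come from xgb.dump()).
x <- as.matrix(iris[, 1:4])
dump_path <- tempfile(fileext = ".trees")
writeLines("booster[0]", dump_path)
ok <- check_pmml_inputs(x, colnames(x), dump_path)
```

Failing early here gives a clearer message than an error raised partway through the conversion.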
The following objectives are currently supported: multi:softprob, multi:softmax, binary:logistic.
The pmml exporter will throw an error if the xgboost model only has one tree.
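One way to catch the single-tree case before exporting is to count the trees in the dump file. The sketch below is an illustration, not the exporter's own check: `count_dumped_trees` is a hypothetical helper, and it assumes the text dump marks the start of each tree with a `booster[<index>]` header line. The dump contents are written by hand here so the snippet runs without xgboost.

```r
# Hypothetical helper; assumes each tree in the text dump starts with a
# "booster[<index>]" header line.
count_dumped_trees <- function(xgb_dump_file) {
  dump_lines <- readLines(xgb_dump_file)
  sum(grepl("^booster\\[[0-9]+\\]", dump_lines))
}

# Stand-in dump with two small trees, written by hand for illustration.
dump_path <- tempfile(fileext = ".trees")
writeLines(c(
  "booster[0]",
  "0:[f0<0.5] yes=1,no=2,missing=1",
  "1:leaf=0.4",
  "2:leaf=-0.4",
  "booster[1]",
  "0:leaf=0.1"
), dump_path)

n_trees <- count_dumped_trees(dump_path)
if (n_trees < 2) {
  stop("the dump contains a single tree; train with more rounds before exporting")
}
```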
The exporter only works with numeric matrices. Sparse matrices must be converted to matrix objects before training an xgboost model for the export to work correctly.
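The conversion described above can be sketched as follows, assuming the sparse data is held in a Matrix-package class. The small matrix and its column names (`feat_a`, `feat_b`) are made up for illustration.

```r
library(Matrix)  # recommended package, included with standard R installations

# A small sparse matrix standing in for sparse training data.
sparse_x <- sparseMatrix(
  i = c(1, 2, 3),
  j = c(1, 2, 1),
  x = c(1, 1, 1),
  dims = c(3, 2),
  dimnames = list(NULL, c("feat_a", "feat_b"))
)

# Convert to a base numeric matrix before using it as training data.
dense_x <- as.matrix(sparse_x)
```

The dense copy stores every zero explicitly, so for large, very sparse data the conversion can need much more memory than the sparse original.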
PMML representation of the xgb.Booster object.
Tridivesh Jena
xgboost: Extreme Gradient Boosting
pmml, PMML schema
## Not run:
# Example using the xgboost package example model.

library(xgboost)
data(agaricus.train, package = "xgboost")
data(agaricus.test, package = "xgboost")

train <- agaricus.train
test <- agaricus.test

model1 <- xgboost(
  data = train$data, label = train$label,
  max_depth = 2, eta = 1, nthread = 2,
  nrounds = 2, objective = "binary:logistic"
)

# Save the tree information in an external file:
xgb.dump(model1, "model1.dumped.trees")

# Convert to PMML:
model1_pmml <- pmml(model1,
  input_feature_names = colnames(train$data),
  output_label_name = "prediction1",
  output_categories = c("0", "1"),
  xgb_dump_file = "model1.dumped.trees"
)

# Multinomial model using iris data:
model2 <- xgboost(
  data = as.matrix(iris[, 1:4]),
  label = as.numeric(iris[, 5]) - 1,
  max_depth = 2, eta = 1, nthread = 2, nrounds = 2,
  objective = "multi:softprob", num_class = 3
)

# Save the tree information in an external file:
xgb.dump(model2, "model2.dumped.trees")

# Convert to PMML:
model2_pmml <- pmml(model2,
  input_feature_names = colnames(as.matrix(iris[, 1:4])),
  output_label_name = "Species",
  output_categories = c(1, 2, 3),
  xgb_dump_file = "model2.dumped.trees"
)

## End(Not run)