# XGBoost models

```r
# Evaluate the chunks below only when xgboost is installed
if (requireNamespace("xgboost", quietly = TRUE)) {
  library(tidypredict)
  library(xgboost)
  library(dplyr)
  eval_code <- TRUE
} else {
  eval_code <- FALSE
}

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = eval_code
)
```

| Function                                            | Works |
|-----------------------------------------------------|-------|
| tidypredict_fit(), tidypredict_sql(), parse_model() | ✔ |
| tidypredict_to_column()                             | ✔ |
| tidypredict_test()                                  | ✔ |
| tidypredict_interval(), tidypredict_sql_interval()  | ✗ |
| parsnip                                             | ✔ |

## tidypredict_ functions

```r
library(xgboost)

# A hand-rolled logistic-regression objective: returns the gradient and
# hessian of the log-loss. xgb.train() can take a function like this via
# its `obj` argument; the model below uses the equivalent built-in
# "binary:logistic" objective instead.
logregobj <- function(preds, dtrain) {
  labels <- xgboost::getinfo(dtrain, "label")
  preds <- 1 / (1 + exp(-preds))
  grad <- preds - labels
  hess <- preds * (1 - preds)
  return(list(grad = grad, hess = hess))
}

# Column 9 of mtcars is `am`, the binary outcome
xgb_bin_data <- xgboost::xgb.DMatrix(
  as.matrix(mtcars[, -9]),
  label = mtcars$am
)

model <- xgboost::xgb.train(
  params = list(max_depth = 2, objective = "binary:logistic", base_score = 0.5),
  data = xgb_bin_data,
  nrounds = 50
)
```
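
Once the model is trained, tidypredict_fit() returns the prediction logic as an R expression, and tidypredict_sql() renders the same logic as a SQL expression. A minimal sketch, assuming dbplyr is installed (simulate_dbi() stands in for a live database connection):

```r
# R expression (a nested case_when-style formula) that reproduces
# the booster's prediction logic
tidypredict_fit(model)

# The same logic rendered as SQL against a simulated connection
tidypredict_sql(model, dbplyr::simulate_dbi())
```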

```r
mtcars %>%
  tidypredict_to_column(model) %>%
  glimpse()
```

Please be aware that xgboost converts input data to 32-bit floats internally, while the R code that tidypredict generates evaluates in 64-bit doubles. That precision difference can put a value on the wrong side of a split threshold, so always verify that tidypredict's predictions match the model's own predictions. See this issue for more information.
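
tidypredict_test() performs exactly this verification: it compares tidypredict's results against xgboost's own predictions and raises an alert if any row differs beyond a threshold. For the model above, the check could look like this (xg_df supplies the xgb.DMatrix that predict() needs for an xgboost model):

```r
# Compare the parsed-model predictions against predict(model, ...);
# differences beyond the default threshold trigger an alert
tidypredict_test(model, mtcars, xg_df = xgb_bin_data)
```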

## parsnip

Fitted parsnip models are also supported by tidypredict:

```r
library(parsnip)

# Fit the same xgboost model through the parsnip interface
p_model <- boost_tree(mode = "regression") %>%
  set_engine("xgboost") %>%
  fit(am ~ ., data = mtcars)

# Confirm that tidypredict's predictions match the fitted model's
tidypredict_test(p_model, mtcars, xg_df = xgb_bin_data)
```
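
The other tidypredict_ functions accept the parsnip fit the same way. A minimal sketch, assuming tidypredict_to_column() dispatches on parsnip model_fit objects as the table above indicates:

```r
# Append the fitted parsnip model's predictions as a new column
mtcars %>%
  tidypredict_to_column(p_model) %>%
  glimpse()
```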

## Parse model spec

Here is an example of the model spec:

```r
pm <- parse_model(model)

# Top two levels of the parsed spec
str(pm, 2)

# The first tree
str(pm$trees[1])
```
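
Because the parsed spec is a plain list, it can be written to disk and reloaded later without the original booster object. A minimal sketch, assuming the yaml package is installed and that as_parsed_model() restores the parsed-model class (the file name model.yml is only an illustration):

```r
library(yaml)

# Persist the parsed model as YAML
write_yaml(pm, "model.yml")

# Later, possibly in a fresh session: reload, restore the class,
# and generate predictions without the original xgboost object
loaded <- as_parsed_model(read_yaml("model.yml"))
tidypredict_fit(loaded)
```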

