R/tune-grid-spark.R

Defines functions tune_grid_spark

Documented in tune_grid_spark

#' Perform Tidymodels grid search tuning inside Spark
#' @details
#' The parsnip model, or the unfitted `workflow()`, the pre-processor and the
#' re-samples are uploaded to the Spark cluster as R objects. The Spark cluster
#' runs each tuning combinations in parallel using R, Tidymodels and any other
#' modeling packages used for the given workflow. The results are then
#' downloaded back to the local R session.
#'
#' @param sc A Spark Connection
#' @param object A parsnip model specification or an unfitted `workflow()`. No
#' tuning parameters are allowed; if arguments have been marked with `tune()`,
#' their values must be finalized.
#' @param ...	Not currently used.
#' @param preprocessor A traditional model formula or a recipe created using
#' `recipes::recipe().`
#' @param resamples	An rset resampling object created from an rsample function,
#' such as `rsample::vfold_cv()`.
#' @param param_info A `dials::parameters()` object or NULL. If none is given,
#' a parameters set is derived from other arguments. Passing this argument can
#'  be useful when parameter ranges need to be customized.
#' @param grid A data frame of tuning combinations or a positive integer. The
#' data frame should have columns for each parameter being tuned and rows for
#'  tuning parameter candidates. An integer denotes the number of candidate
#'   parameter sets to be created automatically.
#' @param metrics A `yardstick::metric_set()`, or NULL to compute a standard
#' set of metrics.
#' @param eval_time A numeric vector of time points where dynamic event time
#'  metrics should be computed (e.g. the time-dependent ROC curve, etc). The
#'  values must be non-negative and should probably be no greater than the
#'  largest event time in the training set (See Details below).
#' @param control	An object used to modify the tuning process, likely created
#' by `control_grid()`.
#' @param no_tasks Number of parallel tasks (jobs) to request Spark
#' @returns tune_results object
#' @export
tune_grid_spark <- function(
  sc,
  object,
  preprocessor,
  resamples,
  ...,
  param_info = NULL,
  grid = 10,
  metrics = NULL,
  eval_time = NULL,
  control = NULL,
  no_tasks = NULL
) {
  UseMethod("tune_grid_spark")
}

Try the sparklyr package in your browser

Any scripts or data that you put into this service are public.

sparklyr documentation built on April 18, 2026, 9:07 a.m.