man/rmd/linear_reg_spark.md

For this engine, there is a single mode: regression.

Tuning Parameters

This model has 2 tuning parameters: penalty and mixture.

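For penalty, the amount of regularization includes both the L1 penalty (i.e., lasso) and the L2 penalty (i.e., ridge or weight decay). As for mixture: a value of 1 specifies a pure lasso model, a value of 0 specifies a ridge regression model, and a value strictly between 0 and 1 specifies an elastic net model that interpolates between lasso and ridge.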
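
To make the roles of the two parameters concrete, the regularized objective can be sketched as below. This assumes the elastic net formulation described in Spark's own documentation, with λ standing for penalty (passed as reg_param) and α standing for mixture (passed as elastic_net_param). The symbols w, x_i, y_i, and n are notation introduced here for illustration, and the intercept is omitted for brevity.

```latex
\min_{w} \; \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^{\top} w \right)^2
  + \lambda \left[ \alpha \, \lVert w \rVert_1
  + \frac{1 - \alpha}{2} \, \lVert w \rVert_2^2 \right],
\qquad \lambda \ge 0, \quad 0 \le \alpha \le 1
```

Setting α = 1 leaves only the L1 term and α = 0 leaves only the L2 term, while λ scales the total amount of regularization.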

Translation from parsnip to the original package

linear_reg(penalty = double(1), mixture = double(1)) %>% 
  set_engine("spark") %>% 
  translate()
## Linear Regression Model Specification (regression)
## 
## Main Arguments:
##   penalty = double(1)
##   mixture = double(1)
## 
## Computational engine: spark 
## 
## Model fit template:
## sparklyr::ml_linear_regression(x = missing_arg(), formula = missing_arg(), 
##     weights = missing_arg(), reg_param = double(1), elastic_net_param = double(1))

Preprocessing requirements

Factor/categorical predictors need to be converted to numeric values (e.g., dummy or indicator variables) for this engine. When using the formula method via fit(), parsnip will convert factor columns to indicators.
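
As a rough illustration of what indicator variables look like, here is a minimal base R sketch. The toy data frame and its column names are hypothetical, and model.matrix() is used only to show the idea; it is not necessarily the encoding that parsnip or Spark produce internally.

```r
# Hypothetical toy data: one numeric and one factor predictor.
toy <- data.frame(
  size  = c(1.2, 3.4, 2.2, 5.1),
  color = factor(c("red", "green", "blue", "green"))
)

# model.matrix() expands the factor into 0/1 indicator columns.
# The first level ("blue") is the reference level and gets no column.
indicators <- model.matrix(~ size + color, data = toy)

# Drop the intercept column so that only predictor columns remain.
predictors <- indicators[, colnames(indicators) != "(Intercept)", drop = FALSE]

colnames(predictors)
```

Each non-reference level of color becomes its own numeric column, which is the kind of input this engine expects.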

Predictors should have the same scale. One way to achieve this is to center and scale each so that each predictor has mean zero and a variance of one.

By default, ml_linear_regression() uses the argument standardization = TRUE to center and scale the data.
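
For readers who want to see what centering and scaling does to a set of predictors, the following is a small base R sketch. The data frame is made up for illustration, and scale() is shown only as one way to do this by hand; it is not a description of how Spark implements standardization = TRUE.

```r
# Hypothetical predictors on very different scales.
x <- data.frame(
  a = c(1, 2, 3, 4),
  b = c(10, 20, 30, 40)
)

# Center each column to mean zero and scale it to unit variance.
x_std <- scale(x, center = TRUE, scale = TRUE)

# After standardizing, column means are (numerically) zero
# and column variances are one.
col_means <- colMeans(x_std)
col_vars  <- apply(x_std, 2, var)
```

With both columns on a common scale, the penalty treats their coefficients comparably, which is why the scaling matters for regularized models.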

Case weights

This model can utilize case weights during model fitting. To use them, see the documentation in case_weights and the examples on tidymodels.org.

The fit() and fit_xy() functions have an argument called case_weights that expects a vector of case weights.

Note that, for spark engines, the case_weights argument value should be a character string that names the column containing the numeric case weights.

Other details

For models created using the "spark" engine, there are several things to consider.

topepo/parsnip documentation built on April 16, 2024, 3:23 a.m.