ml_isoreg: Isotonic Regression Model
In danzafar/tidyspark: A Tidy Interface to Spark


Fits an isotonic regression model against a spark_tbl, similar to R's isoreg(). The fitted model can be summarized with summary(), used to make predictions with predict(), and saved to a given path with write_ml().

ml_isoreg(data, formula, isotonic = TRUE, featureIndex = 0, weightCol = NULL)

## S4 method for signature 'IsotonicRegressionModel'
summary(object)

## S4 method for signature 'IsotonicRegressionModel'
predict(object, newData)

## S4 method for signature 'IsotonicRegressionModel,character'
write_ml(object, path, overwrite = FALSE)

`data`	spark_tbl for training.
`formula`	A symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.
`isotonic`	Whether the output sequence should be isotonic/increasing (TRUE) or antitonic/decreasing (FALSE).
`featureIndex`	The index of the feature to use when `featuresCol` is a vector column (default: 0); it has no effect otherwise.
`weightCol`	The weight column name.
`object`	A fitted IsotonicRegressionModel.
`newData`	spark_tbl for testing.
`path`	The directory where the model is saved.
`overwrite`	Whether to overwrite the output path if it already exists. The default is FALSE, which throws an exception if the output path exists.
`...`	Additional arguments passed to the method.
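
As a local illustration of the `isotonic` argument, the following sketch uses only base R (no Spark session and no tidyspark call). It assumes that stats::isoreg(), which fits non-decreasing sequences only, can stand in for the isotonic = TRUE case, and that negating the response before and after the fit stands in for isotonic = FALSE. The negation is a device of this sketch, not a description of how Spark computes the antitonic fit, and the variable names are illustrative.

```r
# Base R only: same label/feature values as the example on this page.
feature <- c(0, 1, 2, 3, 4)
label   <- c(7, 5, 3, 5, 1)

# Analogue of isotonic = TRUE: best non-decreasing fit.
# The labels mostly decrease, so every point is pooled to the mean (4.2).
inc_fit <- isoreg(feature, label)$yf

# Analogue of isotonic = FALSE: negate, fit non-decreasing, negate back.
# Only the violating pair (3, 5) is pooled, giving 7 5 4 4 1.
dec_fit <- -isoreg(feature, -label)$yf

print(inc_fit)
print(dec_fit)
```

The antitonic fit keeps the data where it already decreases and averages only the adjacent values that break the ordering.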

ml_isoreg returns a fitted isotonic regression model.

summary returns a list of summary information for the fitted model. The list includes the model's boundaries (in increasing order) and predictions (the prediction associated with the boundary at the same index).

predict returns a spark_tbl containing predicted values.
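
The relationship between boundaries, predictions and predicted values can be sketched in base R. Spark's isotonic regression documentation describes the prediction rule as follows: a feature equal to a boundary gets that boundary's prediction, a feature below or above all boundaries gets the first or last prediction, and a feature between two boundaries gets a linear interpolation. The sketch below mimics that rule with stats::approx(). The `boundaries` and `predictions` vectors are hand-computed illustrative values for the example data on this page, not output captured from a Spark model.

```r
# Illustrative (assumed) summary values for an antitonic fit of the example data.
boundaries  <- c(0, 1, 2, 3, 4)
predictions <- c(7, 5, 4, 4, 1)

# New feature values, as in the example on this page.
new_feature <- c(-2, -1, 0.5, 0.75, 1, 2, 9)

# Linear interpolation between boundaries; rule = 2 clamps values outside
# the boundary range to the first or last prediction.
predicted <- approx(boundaries, predictions, xout = new_feature, rule = 2)$y

print(predicted)
```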

ml_isoreg since 2.1.0

summary(IsotonicRegressionModel) since 2.1.0

predict(IsotonicRegressionModel) since 2.1.0

write_ml(IsotonicRegressionModel, character) since 2.1.0

## Not run: 
library(tidyspark)
library(tibble)  # provides tribble() and tibble()
spark_session()
data <- tribble(~label, ~feature,
                7.0, 0.0,
                5.0, 1.0,
                3.0, 2.0,
                5.0, 3.0,
                1.0, 4.0)

df <- spark_tbl(data)
model <- ml_isoreg(df, label ~ feature, isotonic = FALSE)
# return the model boundaries and predictions as a list
result <- summary(model)

# prediction based on fitted model
predict_data <- tibble(feature = c(-2.0, -1.0, 0.5,
                                   0.75, 1.0, 2.0, 9.0))
predict_df <- spark_tbl(predict_data)
# get the prediction column as a local tibble
predict_result <- model %>%
  predict(predict_df) %>%
  select(prediction) %>%
  collect()

# save the fitted model to the given path
path <- "path/to/model"
write_ml(model, path)

# the saved model can also be read back and summarized
savedModel <- read_ml(path)
summary(savedModel)

## End(Not run)