ml_mlp: Multilayer Perceptron Classification Model


View source: R/ml_classification.R

Description

ml_mlp fits a multilayer perceptron neural network model against a spark_tbl. Users can call summary to print a summary of the fitted model, predict to make predictions on new data, and write_ml/read_ml to save/load fitted models. Only categorical data is supported. For more details, see Multilayer Perceptron.

Usage

ml_mlp(
  data,
  formula,
  layers,
  blockSize = 128,
  solver = "l-bfgs",
  maxIter = 100,
  tol = 1e-06,
  stepSize = 0.03,
  seed = NULL,
  initialWeights = NULL,
  handleInvalid = c("error", "keep", "skip")
)

## S4 method for signature 'MultilayerPerceptronClassificationModel'
summary(object)

## S4 method for signature 'MultilayerPerceptronClassificationModel'
predict(object, newData)

## S4 method for signature 'MultilayerPerceptronClassificationModel,character'
write_ml(object, path, overwrite = FALSE)

Arguments

data

a spark_tbl of observations and labels for model fitting.

formula

a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.

layers

integer vector containing the number of nodes for each layer.

blockSize

block size for stacking input data in matrices to speed up computation. Default is 128.

solver

solver parameter, supported options: "gd" (minibatch gradient descent) or "l-bfgs".

maxIter

maximum iteration number.

tol

convergence tolerance of iterations.

stepSize

step size used for each iteration of optimization (applicable only to solver "gd"). Default is 0.03.

seed

seed parameter for weights initialization.

initialWeights

initial weights for the model. It should be a numeric vector.

handleInvalid

How to handle invalid data (unseen labels or NULL values) in features and label column of string type. Supported options: "skip" (filter out rows with invalid data), "error" (throw an error), "keep" (put invalid data in a special additional bucket, at index numLabels). Default is "error".
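The three policies can be illustrated with a small base R emulation. This is only a sketch of the semantics described above, not how Spark implements them; the helper index_labels and its arguments are hypothetical names introduced for illustration.

```r
# Hypothetical helper: map string labels to 0-based indices, emulating
# the three handleInvalid policies in plain R (no Spark required).
index_labels <- function(x, known, handle_invalid = c("error", "keep", "skip")) {
  handle_invalid <- match.arg(handle_invalid)
  idx <- match(x, known) - 1L          # 0-based index; NA for unseen or missing labels
  invalid <- is.na(idx)
  if (any(invalid)) {
    if (handle_invalid == "error") {
      stop("invalid label(s) found: ", paste(unique(x[invalid]), collapse = ", "))
    } else if (handle_invalid == "keep") {
      idx[invalid] <- length(known)    # extra bucket at index numLabels
    } else {
      idx <- idx[!invalid]             # "skip": drop rows with invalid data
    }
  }
  idx
}

known  <- c("setosa", "versicolor", "virginica")
labels <- c("setosa", "unknown", "virginica", NA)

kept    <- index_labels(labels, known, "keep")  # invalid rows go to bucket 3
skipped <- index_labels(labels, known, "skip")  # invalid rows are dropped
```

With the default "error", the same call stops at the first batch containing an unseen or missing label, which is usually the safest choice while a pipeline is still being developed.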

object

a Multilayer Perceptron Classification Model fitted by ml_mlp.

newData

a spark_tbl for testing.

path

the directory where the model is saved.

overwrite

whether to overwrite if the output path already exists. Default is FALSE, which means an exception is thrown if the output path exists.

...

additional arguments passed to the method.

Value

summary returns summary information of the fitted model, which is a list. The list includes numOfInputs (number of inputs), numOfOutputs (number of outputs), layers (array of layer sizes including input and output layers), and weights (the weights of layers). weights is a numeric vector whose length equals the number of weights expected for the architecture (e.g., an 8-10-2 network has 112 connection weights).
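The 112 figure for an 8-10-2 network is consistent with counting one bias term per non-input node. A minimal sketch of that arithmetic in base R follows; mlp_weight_count is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: number of weights in a fully connected network,
# assuming one bias term per non-input node.
mlp_weight_count <- function(layers) {
  stopifnot(length(layers) >= 2)
  n_in  <- layers[-length(layers)]  # sizes of the layers feeding forward
  n_out <- layers[-1]               # sizes of the layers receiving input
  sum((n_in + 1) * n_out)           # +1 accounts for the bias input
}

mlp_weight_count(c(8, 10, 2))  # (8 + 1) * 10 + (10 + 1) * 2 = 112
mlp_weight_count(c(4, 3))      # (4 + 1) * 3 = 15
```

The second call matches the example in this topic, where layers = c(4, 3) is paired with an initialWeights vector of 15 values.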

predict returns a spark_tbl containing predicted labels in a column named "prediction".

Note

summary(MultilayerPerceptronClassificationModel) since 2.1.0

predict(MultilayerPerceptronClassificationModel) since 2.1.0

write_ml(MultilayerPerceptronClassificationModel, character) since 2.1.0

See Also

write_ml

Examples

## Not run: 
df <- spark_read_source("data/mllib/sample_multiclass_classification_data.txt",
                        source = "libsvm")

# fit a Multilayer Perceptron Classification Model
model <- ml_mlp(df, label ~ features, blockSize = 128, layers = c(4, 3),
                solver = "l-bfgs", maxIter = 100, tol = 0.5, stepSize = 1,
                seed = 1, initialWeights = c(0, 0, 0, 0, 0, 5, 5, 5, 5, 5, 9, 9, 9, 9, 9))

# get the summary of the model
summary(model)

## End(Not run)

danzafar/tidyspark documentation built on Sept. 30, 2020, 12:19 p.m.