ml_naive_bayes: Naive Bayes Models

View source: R/ml_classification.R

Description

ml_naive_bayes fits a Bernoulli naive Bayes model against a spark_tbl. Users can call summary to print a summary of the fitted model, predict to make predictions on new data, and write_ml/read_ml to save/load fitted models. Only categorical data is supported.

Usage

ml_naive_bayes(
  data,
  formula,
  smoothing = 1,
  handleInvalid = c("error", "keep", "skip")
)

## S4 method for signature 'NaiveBayesModel'
summary(object)

## S4 method for signature 'NaiveBayesModel'
predict(object, newData)

## S4 method for signature 'NaiveBayesModel,character'
write_ml(object, path, overwrite = FALSE)

Arguments

data

a spark_tbl of observations and labels for model fitting.

formula

a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'.

smoothing

additive (Laplace) smoothing parameter. Must be non-negative. Default is 1.

handleInvalid

How to handle invalid data (unseen labels or NULL values) in feature and label columns of string type. Supported options: "skip" (filter out rows with invalid data), "error" (throw an error), "keep" (put invalid data in a special additional bucket, at index numLabels). Default is "error".

object

a naive Bayes model fitted by ml_naive_bayes.

newData

a spark_tbl for testing.

path

the directory where the model is saved.

overwrite

whether to overwrite the output path if it already exists. Default is FALSE, which throws an exception if the output path exists.

...

additional argument(s) passed to the method. Currently only smoothing.
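
To make the effect of the smoothing argument concrete, here is a minimal base R sketch of additive smoothing on a table of counts. It does not call tidyspark or Spark. The helper smoothed_conditional is hypothetical, and the formula used is the textbook additive-smoothing estimate; the estimate Spark computes internally on encoded features may differ in detail.

```r
# Hypothetical helper (not part of tidyspark): additive smoothing of
# class-conditional probabilities from a label-by-level table of counts.
smoothed_conditional <- function(counts, smoothing = 1) {
  k <- ncol(counts)  # number of levels of the feature
  # rowSums() is recycled down the columns, so row i is divided by its own total
  (counts + smoothing) / (rowSums(counts) + smoothing * k)
}

# Counts of Gender within each Admit class, from the built-in UCBAdmissions data
counts <- margin.table(UCBAdmissions, c(1, 2))

unsmoothed <- smoothed_conditional(counts, smoothing = 0)
smoothed <- smoothed_conditional(counts, smoothing = 1)

print(round(unsmoothed, 4))
print(round(smoothed, 4))
```

With smoothing = 0, a level that never co-occurs with a label gets probability zero, which rules out that label for any row containing the level. A positive value pulls every estimate slightly toward the uniform distribution and avoids this.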

Value

ml_naive_bayes returns a fitted naive Bayes model.

summary returns summary information of the fitted model, which is a list. The list includes apriori (the label distribution) and tables (conditional probabilities given the target label).

predict returns a spark_tbl containing the predicted labels in a column named "prediction".
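
As a rough illustration of what apriori and tables mean, the sketch below computes comparable quantities locally in base R from the UCBAdmissions data. It is an assumption-laden approximation, not the output of summary: no smoothing is applied, the variable names are chosen for illustration, and the exact layout of the list returned by summary may differ.

```r
data <- as.data.frame(UCBAdmissions)  # columns: Admit, Gender, Dept, Freq

# Label distribution, analogous to `apriori`
class_counts <- xtabs(Freq ~ Admit, data = data)
apriori <- class_counts / sum(class_counts)

# Distribution of each feature given the label, analogous to `tables`
features <- c("Gender", "Dept")
tables <- lapply(features, function(f) {
  # Builds the formula Freq ~ Admit + <feature>
  counts <- xtabs(reformulate(c("Admit", f), response = "Freq"), data = data)
  prop.table(counts, margin = 1)  # each row (label) sums to 1
})
names(tables) <- features

print(apriori)
print(tables$Gender)
```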

Note

ml_naive_bayes since 2.0.0

summary(NaiveBayesModel) since 2.0.0

predict(NaiveBayesModel) since 2.0.0

write_ml(NaiveBayesModel, character) since 2.0.0

See Also

e1071: https://cran.r-project.org/package=e1071

write_ml

Examples

## Not run: 
data <- as.data.frame(UCBAdmissions)
df <- spark_tbl(data)

# fit a Bernoulli naive Bayes model
model <- ml_naive_bayes(df, Admit ~ Gender + Dept, smoothing = 0)

# get the summary of the model
summary(model)

# make predictions
predictions <- predict(model, df)

# save and load the model
path <- "path/to/model"
write_ml(model, path)
savedModel <- read_ml(path)
summary(savedModel)

## End(Not run)

danzafar/tidyspark documentation built on Sept. 30, 2020, 12:19 p.m.