Fits a multivariate Gaussian mixture model against a SparkDataFrame, similarly to R's mvnormalmixEM(). Users can call summary to print a summary of the fitted model, predict to make predictions on new data, and write.ml/read.ml to save/load fitted models.
spark.gaussianMixture(data, formula, ...)
## S4 method for signature 'SparkDataFrame,formula'
spark.gaussianMixture(data, formula, k = 2, maxIter = 100, tol = 0.01)
## S4 method for signature 'GaussianMixtureModel'
summary(object)
## S4 method for signature 'GaussianMixtureModel'
predict(object, newData)
## S4 method for signature 'GaussianMixtureModel,character'
write.ml(object, path, overwrite = FALSE)
data | a SparkDataFrame for training.
formula | a symbolic description of the model to be fitted. Currently only a few formula operators are supported, including '~', '.', ':', '+', and '-'. Note that the response variable of formula is empty in spark.gaussianMixture.
... | additional arguments passed to the method.
k | number of independent Gaussians in the mixture model.
maxIter | maximum number of iterations.
tol | the convergence tolerance.
object | a fitted Gaussian mixture model.
newData | a SparkDataFrame for testing.
path | the directory where the model is saved.
overwrite | whether to overwrite the output path if it already exists. Default is FALSE, which means an exception is thrown if the output path exists.
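Because the response side of the formula is empty, the formula passed to spark.gaussianMixture is one-sided. As a small illustration in base R only (no Spark session is needed; the column names V1 and V2 are simply the ones used in the example on this page), a one-sided formula can be inspected like this:

```r
# A one-sided formula: nothing to the left of '~', so there is no response.
f <- ~ V1 + V2

# A one-sided formula has length 2; a formula with a response has length 3.
n_parts <- length(f)

# The variables the formula refers to, i.e. the feature columns.
features <- all.vars(f)

print(n_parts)
print(features)
```

This only shows how R itself represents such a formula; how SparkR translates it into feature columns is not covered here.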
spark.gaussianMixture returns a fitted multivariate Gaussian mixture model.
summary returns a summary of the fitted model, which is a list. The list includes the model's lambda (lambda), mu (mu), sigma (sigma), loglik (loglik), and posterior (posterior).
predict returns a SparkDataFrame containing predicted labels in a column named "prediction".
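The components of the summary list can be picked out by name. The helper below is a minimal sketch, not part of SparkR: describe_gmm and the names of its output fields are hypothetical, and it assumes SparkR is attached so that summary dispatches to the GaussianMixtureModel method. It relies only on the element names documented above.

```r
# Hypothetical helper (not a SparkR function): collects the documented
# elements of summary(model) under more descriptive names.
describe_gmm <- function(model) {
  s <- summary(model)  # assumes the summary method for the model is available
  list(
    weights        = s$lambda,    # documented as "lambda"
    means          = s$mu,        # documented as "mu"
    covariances    = s$sigma,     # documented as "sigma"
    log_likelihood = s$loglik,    # documented as "loglik"
    posterior      = s$posterior  # documented as "posterior"
  )
}
```

With a model fitted as in the example on this page, describe_gmm(model)$weights would then return whatever summary(model)$lambda holds.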
spark.gaussianMixture since 2.1.0
summary(GaussianMixtureModel) since 2.1.0
predict(GaussianMixtureModel) since 2.1.0
write.ml(GaussianMixtureModel, character) since 2.1.0
mixtools: https://cran.r-project.org/package=mixtools
predict, read.ml, write.ml
## Not run:
sparkR.session()
library(mvtnorm)
set.seed(100)
a <- rmvnorm(4, c(0, 0))
b <- rmvnorm(6, c(3, 4))
data <- rbind(a, b)
df <- createDataFrame(as.data.frame(data))
model <- spark.gaussianMixture(df, ~ V1 + V2, k = 2)
summary(model)
# fitted values on training data
fitted <- predict(model, df)
head(select(fitted, "V1", "prediction"))
# save fitted model to input path
path <- "path/to/model"
write.ml(model, path)
# can also read back the saved model and print
savedModel <- read.ml(path)
summary(savedModel)
## End(Not run)