| predict.mbcfit | R Documentation |
This function predicts cluster assignments for new data based on an existing model of class mbcfit. The prediction leverages information from the fitted model to categorize new observations into clusters.
## S3 method for class 'mbcfit'
predict(object, newdata, ...)
object |
An object of class mbcfit. |
newdata |
A numeric vector, matrix, or data frame of observations. Rows correspond to observations and columns correspond to variables/features. Categorical variables and |
... |
Further arguments passed to or from other methods. |
The predict.mbcfit function utilizes the parameters of a previously fitted mbcfit model to allocate new data points to estimated clusters. The function performs necessary checks to ensure the mbcfit model returns valid estimates and the dimensionality of the new data aligns with the model.
The mbcfit object must contain a component named params, which is itself a list containing the following necessary elements, for a mixture model with K components:
proportions: A numeric vector of length K, with elements summing to 1, representing cluster proportions.
mean: A numeric matrix of dimensions c(P, K), representing cluster centers.
cov: A numeric array of dimensions c(P, P, K), representing cluster covariance matrices.
The data dimensionality is P, and newdata must have the same dimensionality (ncol(newdata) must equal P); otherwise the function terminates with an error message.
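The layout of params described above can be illustrated with a small base R sketch. The helper check_params below is hypothetical: it is not part of the package and does not claim to reproduce the checks that predict.mbcfit performs internally. It only mirrors the documented shapes for a toy model with K = 2 and P = 2.

```r
# Hypothetical helper (an assumption for illustration, not a package function):
# verifies that `params` follows the documented layout and that `newdata`
# has as many columns as the model has dimensions.
check_params <- function(params, newdata) {
  newdata <- as.matrix(newdata)
  K <- length(params$proportions)   # number of mixture components
  P <- nrow(params$mean)            # data dimensionality
  stopifnot(
    is.numeric(params$proportions),
    abs(sum(params$proportions) - 1) < 1e-8,   # proportions sum to 1
    identical(dim(params$mean), c(P, K)),      # P x K matrix of centers
    identical(dim(params$cov), c(P, P, K)),    # P x P x K covariance array
    ncol(newdata) == P                         # dimensionality must match
  )
  invisible(TRUE)
}

# Toy parameters: two bivariate components with identity covariances.
params <- list(
  proportions = c(0.4, 0.6),
  mean = cbind(c(0, 0), c(3, 3)),
  cov = array(c(diag(2), diag(2)), dim = c(2, 2, 2))
)
newdata <- matrix(c(0.1, -0.2, 2.9, 3.1), ncol = 2, byrow = TRUE)

check_params(params, newdata)
```

Passing data with a different number of columns, for example newdata[, 1, drop = FALSE], makes this sketch stop with an error, which is the behavior the text describes for a dimensionality mismatch.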
The predicted clustering is obtained as the maximum a posteriori (MAP) estimate, using the posterior weights of a Gaussian mixture model parameterized by params.
Denoting with z(x) the predicted cluster label for point x, and with \phi the (multivariate) Gaussian density:
z(x) = \underset{k \in \{1,\ldots,K\}}{\arg\,\max} \frac{\pi_k\phi(x, \mu_k, \Sigma_k)}{\sum_{j=1}^K \pi_j\phi(x, \mu_j, \Sigma_j)}
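The MAP rule can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the function names log_dmvnorm and map_labels and the toy params list are invented for the example. Because the denominator of the posterior weight does not depend on k, the sketch maximizes \log \pi_k + \log \phi(x, \mu_k, \Sigma_k) directly.

```r
# Log-density of a multivariate Gaussian evaluated at each row of x.
# (Illustrative helper, assumed for this sketch.)
log_dmvnorm <- function(x, mu, sigma) {
  R <- chol(sigma)                                  # sigma = t(R) %*% R
  z <- backsolve(R, t(x) - mu, transpose = TRUE)    # solves t(R) z = x - mu
  -0.5 * colSums(z^2) - sum(log(diag(R))) - 0.5 * length(mu) * log(2 * pi)
}

# MAP labels: for each row of x, the component k with the largest
# log(pi_k) + log phi(x; mu_k, Sigma_k).
map_labels <- function(x, params) {
  x <- as.matrix(x)
  K <- length(params$proportions)
  logw <- sapply(seq_len(K), function(k) {
    log(params$proportions[k]) +
      log_dmvnorm(x, params$mean[, k], params$cov[, , k])
  })
  logw <- matrix(logw, nrow = nrow(x))   # keep an n x K shape even when n = 1
  max.col(logw, ties.method = "first")
}

# Toy parameters: two bivariate components with identity covariances.
params <- list(
  proportions = c(0.4, 0.6),
  mean = cbind(c(0, 0), c(3, 3)),
  cov = array(c(diag(2), diag(2)), dim = c(2, 2, 2))
)
x <- rbind(c(0.1, -0.2), c(2.9, 3.1))

labels <- map_labels(x, params)
print(labels)
```

In this toy setting the first point lies next to the first center and the second point next to the second, so they are assigned to components 1 and 2 respectively. Working on the log scale avoids numerical underflow when densities are very small.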
A vector of length nrow(newdata) containing the estimated cluster label for each observation in newdata.
Coraggio, Luca and Pietro Coretto (2023). Selecting the number of clusters, clustering models, and algorithms. A unifying approach based on the quadratic discriminant score. Journal of Multivariate Analysis, Vol. 196(105181), 1-20. doi: 10.1016/j.jmva.2023.105181
gmix
# load data
data(banknote)
dat <- banknote[,-1]
# Estimate a 3-component Gaussian mixture model
set.seed(123)
res <- gmix(dat, K = 3)
# Cluster in output from gmix
print(res$cluster)
# Predict cluster on a single point
# (keep table dimension)
predict(res, dat[1, , drop=FALSE])
# Predict cluster on a subset
predict(res, dat[1:10, ])
# Predicted clusters on the original dataset are equal to the clustering from the gmix model
all(predict(res, dat) == res$cluster)