View source: R/compute_mallows_mixtures.R
compute_mallows_mixtures | R Documentation |
Convenience function for computing Mallows models with varying numbers of mixtures. This is useful for deciding the number of mixtures to use in the final model.
compute_mallows_mixtures(n_clusters, ..., cl = NULL)
n_clusters | Integer vector specifying the number of clusters to use. |
... | Other named arguments, passed to |
cl | Optional computing cluster used for parallelization, returned from |
A list of Mallows models of class BayesMallowsMixtures, with one element for each number of mixtures that was computed. This object can be studied with plot_elbow.
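The interplay of the three arguments can be illustrated with a minimal sketch in base R. It is not the package's implementation: `fit_one`, `fit_over_clusters`, the class name `ToyMixtures`, and `toy_rankings` are hypothetical names invented for this illustration, and `fit_one` merely records its inputs instead of fitting a model. The sketch only shows the pattern described above: one fit per entry of `n_clusters`, extra named arguments forwarded through `...`, and an optional cluster `cl` that switches from sequential to parallel evaluation.

```r
# Hypothetical stand-in for a single model fit; NOT part of BayesMallows.
# It only records its inputs, so the dispatch pattern can be run anywhere.
fit_one <- function(k, rankings) {
  list(n_clusters = k, n_assessors = nrow(rankings))
}

# Hypothetical wrapper mirroring the documented arguments (n_clusters, ..., cl).
fit_over_clusters <- function(n_clusters, ..., cl = NULL) {
  stopifnot(is.null(cl) || inherits(cl, "cluster"))
  if (is.null(cl)) {
    # Sequential: one fit per requested number of clusters
    fits <- lapply(n_clusters, fit_one, ...)
  } else {
    # Parallel: the same fits, distributed over the workers in cl
    fits <- parallel::parLapply(cl, n_clusters, fit_one, ...)
  }
  class(fits) <- "ToyMixtures"  # illustrative class name only
  fits
}

# Toy data made up for illustration: 4 assessors ranking 3 items
toy_rankings <- rbind(c(1, 2, 3), c(1, 3, 2), c(3, 2, 1), c(2, 1, 3))

fits <- fit_over_clusters(n_clusters = 1:3, rankings = toy_rankings)
length(fits)  # one element per entry of n_clusters
```

Because the result is an ordinary list with a class attribute, individual fits can be extracted with `fits[[i]]`, in the same order as `n_clusters`.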
Other modeling: compute_mallows(), smc_mallows_new_item_rank(), smc_mallows_new_users()
# DETERMINING THE NUMBER OF CLUSTERS IN THE SUSHI EXAMPLE DATA
## Not run:
# Let us look at every number of clusters from 1 to 10
# We use the convenience function compute_mallows_mixtures
n_clusters <- seq(from = 1, to = 10)
models <- compute_mallows_mixtures(n_clusters = n_clusters,
                                   rankings = sushi_rankings,
                                   include_wcd = TRUE)
# models is a list in which each element is an object of class BayesMallows,
# returned from compute_mallows
# We can create an elbow plot
plot_elbow(models, burnin = 1000)
# We then select the number of clusters at a point where this plot has
# an "elbow", e.g., n_clusters = 5.
# Having chosen the number of clusters, we can now study the final model
# Rerun with 5 clusters
mixture_model <- compute_mallows(rankings = sushi_rankings, n_clusters = 5,
                                 include_wcd = TRUE)
# Delete the models object to free some memory
rm(models)
# Set the burnin
mixture_model$burnin <- 1000
# Plot the posterior distributions of alpha per cluster
plot(mixture_model)
# Compute the posterior interval of alpha per cluster
compute_posterior_intervals(mixture_model, parameter = "alpha")
# Plot the posterior distributions of cluster probabilities
plot(mixture_model, parameter = "cluster_probs")
# Plot the posterior probability of cluster assignment
plot(mixture_model, parameter = "cluster_assignment")
# Plot the posterior distribution of "tuna roll" in each cluster
plot(mixture_model, parameter = "rho", items = "tuna roll")
# Compute the cluster-wise CP consensus, and show one column per cluster
cp <- compute_consensus(mixture_model, type = "CP")
cp$cumprob <- NULL
stats::reshape(cp, direction = "wide", idvar = "ranking",
               timevar = "cluster", varying = list(as.character(unique(cp$cluster))))
# Compute the MAP consensus, and show one column per cluster
map <- compute_consensus(mixture_model, type = "MAP")
map$probability <- NULL
stats::reshape(map, direction = "wide", idvar = "map_ranking",
               timevar = "cluster", varying = list(as.character(unique(map$cluster))))
# RUNNING IN PARALLEL
# Computing Mallows models with different numbers of mixtures in parallel
# leads to a considerable speedup
library(parallel)
cl <- makeCluster(detectCores() - 1)
n_clusters <- seq(from = 1, to = 10)
models <- compute_mallows_mixtures(n_clusters = n_clusters,
                                   rankings = sushi_rankings,
                                   include_wcd = TRUE, cl = cl)
stopCluster(cl)
## End(Not run)
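The parallel example above calls `makeCluster(detectCores() - 1)`. On a single-core machine, or where `detectCores()` returns `NA`, that requests zero or `NA` workers. The following sketch shows one possible way to guard the worker count and to make sure the cluster is stopped even if the computation fails. It uses only the `parallel` package shipped with R. The squared-number task is a placeholder for the real model fits, and the cap of two workers is an arbitrary choice to keep the sketch light.

```r
library(parallel)

# detectCores() may return NA or 1, so never ask for fewer than one worker.
# The cap of 2 workers is an assumption made only to keep this sketch light.
n_workers <- max(1L, min(2L, detectCores() - 1L), na.rm = TRUE)
cl <- makeCluster(n_workers)

# tryCatch(..., finally = ) stops the cluster whether or not the work succeeds
squares <- tryCatch(
  parLapply(cl, 1:4, function(k) k^2),  # placeholder for the real model fits
  finally = stopCluster(cl)
)
```

In real use, the `parLapply` call would be replaced by the call to `compute_mallows_mixtures(..., cl = cl)` shown in the example.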