| MSmix-package | R Documentation |
The MSmix package provides functions to fit and analyze finite
mixtures of Mallows models with Spearman distance (a.k.a. θ-model)
for full and partial rankings with arbitrary missing positions.
Inference is conducted within the maximum likelihood (ML) framework via EM algorithms.
Estimation uncertainty is quantified via several types of bootstrap and asymptotic confidence intervals.
The Mallows model is one of the most popular and frequently applied parametric distributions for analyzing rankings of a finite set of items. However, inference for this model is challenging because its normalizing constant, also referred to as the partition function, is intractable. The present package performs ML estimation (MLE) of the Mallows model with Spearman distance from full and partial rankings with arbitrary censoring patterns. Thanks to the novel approximation of the model normalizing constant introduced by Crispino, Mollica, Astuti and Tardella (2023), as well as the existence of a closed-form expression of the MLE of the consensus ranking, MSmix can address inference even for a large number of items. The package also accounts for unobserved sample heterogeneity through MLE of finite mixtures of Mallows models with Spearman distance via EM algorithms, which yields a model-based clustering of partial rankings into groups with similar preferences.
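To make the role of the normalizing constant concrete, the following base R sketch evaluates the Mallows model with Spearman distance by brute-force enumeration for a small number of items. It does not call any MSmix function. The helper names (all_perms, spearman, dmallows_spear, borda_consensus) are hypothetical, the Spearman distance is assumed to be the sum of squared rank differences, and the closed-form consensus estimate is assumed to be the ranking of the items by mean rank (a standard result for full rankings and a single component, stated here as an assumption rather than taken from the package).

```r
# All permutations of 1..n by filtering the full grid (feasible for small n only).
all_perms <- function(n) {
  grid <- as.matrix(expand.grid(rep(list(seq_len(n)), n)))
  keep <- apply(grid, 1, function(r) !anyDuplicated(r))
  unname(grid[keep, , drop = FALSE])
}

# Spearman distance, assumed here to be the sum of squared rank differences.
spearman <- function(r, rho) sum((r - rho)^2)

# Probability of ranking r under consensus rho and concentration theta.
# The partition function Z sums over all n! rankings, which is why
# exact evaluation becomes intractable as n grows.
dmallows_spear <- function(r, rho, theta) {
  perms <- all_perms(length(rho))
  Z <- sum(exp(-theta * apply(perms, 1, spearman, rho = rho)))
  exp(-theta * spearman(r, rho)) / Z
}

# Assumed closed-form consensus estimate for full rankings:
# rank the items by their sample mean rank (ties are not handled here).
borda_consensus <- function(rankings) rank(colMeans(rankings))

rho   <- c(1, 2, 3, 4)
theta <- 0.5
perms <- all_perms(4)
probs <- apply(perms, 1, dmallows_spear, rho = rho, theta = theta)
sum(probs)                   # probabilities over all 4! rankings
perms[which.max(probs), ]    # the mode of the distribution
```

The enumeration grows as n!, so this approach is only illustrative. For realistic numbers of items the package relies on the approximation cited above.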
Computational efficiency is achieved through a hybrid implementation combining R and C++ code,
and through optional parallel computation.
In addition to inferential techniques, the package provides various functions for data manipulation, simulation, descriptive summary and model selection.
Specific S3 classes and methods are also supplied to enhance usability and ease interoperability with other packages.
The suite of functions available in the MSmix package is composed of:
Ranking data manipulation
data_conversion: From rankings to orderings and vice versa.
data_censoring: Censoring of full rankings.
data_completion: Deterministic completion of partial rankings with full reference rankings.
data_augmentation: Generate all full rankings compatible with partial rankings.
Ranking data simulation
rMSmix: Random samples from finite mixtures of Mallows models with Spearman distance.
Ranking data description
data_description: Descriptive summaries for partial rankings.
Model estimation
fitMSmix: MLE of mixtures of Mallows models with Spearman distance via EM algorithms.
likMSmix: Likelihood evaluation for mixtures of Mallows models with Spearman distance.
Model selection
bicMSmix: BIC value for the fitted mixture of Mallows models with Spearman distance.
aicMSmix: AIC value for the fitted mixture of Mallows models with Spearman distance.
Estimation uncertainty
bootstrapMSmix: Bootstrap confidence intervals for mixtures of Mallows models with Spearman distance.
confintMSmix: Asymptotic confidence intervals for mixtures of Mallows models with Spearman distance.
Spearman distance utilities
spear_dist: Spearman distance computation for full rankings.
spear_dist_distr: Spearman distance distribution under the uniform (null) model.
partition_fun_spear: Partition function of the Mallows model with Spearman distance.
expected_spear_dist: Expected Spearman distance under the Mallows model with Spearman distance.
var_spear_dist: Variance of the Spearman distance under the Mallows model with Spearman distance.
S3 class methods
print.bootMSmix: Print the bootstrap confidence intervals of mixtures of Mallows models with Spearman distance.
print.data_descr: Print the descriptive statistics for partial rankings.
print.emMSmix: Print the MLEs of mixtures of Mallows models with Spearman distance.
print.summary.emMSmix: Print the summary of the MLEs of mixtures of Mallows models with Spearman distance.
plot.bootMSmix: Plot the bootstrap confidence intervals of mixtures of Mallows models with Spearman distance.
plot.data_descr: Plot the descriptive statistics for partial rankings.
plot.dist: Plot the Spearman distance matrix for full rankings.
plot.emMSmix: Plot the MLEs of mixtures of Mallows models with Spearman distance.
summary.emMSmix: Summary of the MLEs of mixtures of Mallows models with Spearman distance.
Datasets
ranks_antifragility: Antifragility features of innovative startups (full rankings with covariates).
ranks_horror: Arkham Horror data (full rankings).
ranks_beers: Beers data (partial rankings with different censoring patterns and a covariate).
ranks_read_genres: Reading preference data (partial top-5 rankings with covariates).
ranks_sports: Sport preferences and habits (full rankings with covariates).
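As a rough illustration of two operations listed above, converting between rankings and orderings and computing Spearman distances between full rankings, the following sketch uses base R only. It is not the package's implementation. The helper name to_ordering and the toy data are hypothetical, and the Spearman distance is again assumed to be the sum of squared rank differences.

```r
# Toy full rankings in rows: entry [j, i] is the rank unit j gives to item i.
rankings <- rbind(c(2, 3, 1, 4),
                  c(4, 3, 2, 1),
                  c(1, 2, 4, 3))

# Ranking -> ordering: position k holds the item ranked k-th.
# For a full ranking this is the inverse permutation, which order() returns.
to_ordering <- function(x) t(apply(x, 1, order))

orderings <- to_ordering(rankings)
orderings[1, ]   # items of the first unit listed from most to least liked

# The same map converts orderings back to rankings.
back <- to_ordering(orderings)

# Pairwise Spearman distances, computed as squared Euclidean distances
# between rank vectors (base R dist() returns the unsquared version).
d_spear <- dist(rankings)^2
d_spear
```

Because the conversion is an inverse permutation, applying it twice returns the original matrix, which gives a quick sanity check on either representation.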
Some quantities frequently recalled in the manual are the following:
N: Sample size.
n: Number of possible items.
G: Number of mixture components.
Data must be supplied as an integer N × n matrix with partial rankings in each row and missing positions denoted as NA (rank = 1 indicates the
most-liked item). Partial sequences with a single missing entry are
automatically filled in, as they correspond to full rankings. In the present setting, ties are not allowed.
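The input format described above can be sketched in base R as follows. The data are made up, and fill_single_na is a hypothetical helper that only mimics the described treatment of rows with a single missing entry. It is not the package's internal code.

```r
# Hypothetical data: N = 4 partial rankings of n = 5 items (NA = missing position).
rankings <- rbind(
  c(1L, 2L, 3L, 4L, 5L),   # full ranking
  c(2L, NA, 1L, 4L, 3L),   # one missing entry: implicitly a full ranking
  c(1L, NA, NA, 2L, NA),   # top-2 partial ranking
  c(NA, 3L, 1L, NA, 2L)    # top-3 partial ranking
)

# Fill rows that have exactly one NA with the only rank not yet assigned.
fill_single_na <- function(x) {
  n <- ncol(x)
  for (j in seq_len(nrow(x))) {
    miss <- which(is.na(x[j, ]))
    if (length(miss) == 1L) {
      x[j, miss] <- setdiff(seq_len(n), x[j, ])
    }
  }
  x
}

completed <- fill_single_na(rankings)
completed[2, ]          # the second row is now a full ranking
is.integer(completed)   # the matrix stays integer-valued
```

Rows with two or more missing positions are left untouched, since several full rankings remain compatible with them.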
Cristina Mollica, Marta Crispino, Lucia Modugno and Luca Tardella
Maintainer: Cristina Mollica <cristina.mollica@uniroma1.it>
Crispino M, Mollica C, Astuti V and Tardella L (2023). Efficient and accurate inference for mixtures of Mallows models with Spearman distance. Statistics and Computing, 33(98), DOI: 10.1007/s11222-023-10266-8.
Crispino M, Mollica C, Modugno L, Casadio Tarabusi E and Tardella L (2024+). MSmix: An R Package for clustering partial rankings via mixtures of Mallows models with Spearman distance. (submitted).