bicMSmix: BIC and AIC for mixtures of Mallows models with Spearman...

bicMSmixR Documentation

BIC and AIC for mixtures of Mallows models with Spearman distance

Description

bicMSmix and aicMSmix compute, respectively, the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC) for a mixture of Mallows models with Spearman distance fitted on partial rankings.

Usage

bicMSmix(rho, theta, weights, rankings)

aicMSmix(rho, theta, weights, rankings)

Arguments

rho

Integer G\timesn matrix with the component-specific consensus rankings in each row.

theta

Numeric vector of G non-negative component-specific precision parameters.

weights

Numeric vector of G positive mixture weights (normalization is not necessary).

rankings

Integer N\timesn matrix or data frame with partial rankings in each row. Missing positions must be coded as NA.

Details

The (log-)likelihood evaluation is performed by augmenting the partial rankings with the set of all compatible full rankings (see data_augmentation), and then the marginal likelihood is computed.

When n\leq 20, the (log-)likelihood is exactly computed, otherwise it is approximated with the method introduced by Crispino et al. (2023). If n>170, the approximation is also restricted over a fixed grid of values for the Spearman distance to limit computational burden.

Value

The BIC or AIC value.

References

Crispino M, Mollica C and Modugno L (2025+). MSmix: An R Package for clustering partial rankings via mixtures of Mallows Models with Spearman distance. (submitted)

Crispino M, Mollica C, Astuti V and Tardella L (2023). Efficient and accurate inference for mixtures of Mallows models with Spearman distance. Statistics and Computing, 33(98), DOI: 10.1007/s11222-023-10266-8.

Schwarz G (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), pages 461–464, DOI: 10.1002/sim.6224.

Sakamoto Y, Ishiguro M, and Kitagawa G (1986). Akaike Information Criterion Statistics. Dordrecht, The Netherlands: D. Reidel Publishing Company.

See Also

likMSmix, data_augmentation

Examples


## Example 1. Simulate rankings from a 2-component mixture of Mallows models
## with Spearman distance.
set.seed(12345)
rank_sim <- rMSmix(sample_size = 50, n_items = 12, n_clust = 2)
str(rank_sim)
rankings <- rank_sim$samples
# Fit the true model.
set.seed(12345)
fit <- fitMSmix(rankings = rankings, n_clust = 2, n_start = 10)
# Comparing the BIC at the true parameter values and at the MLE.
bicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
bicMSmix(rho = fit$mod$rho, theta = fit$mod$theta, weights = fit$mod$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = fit$mod$rho, theta = fit$mod$theta, weights = fit$mod$weights,
       rankings = rank_sim$samples)


## Example 2. Simulate rankings from a basic Mallows model with Spearman distance.
set.seed(54321)
rank_sim <- rMSmix(sample_size = 50, n_items = 8, n_clust = 1)
str(rank_sim)
# Let us censor the observations to be top-5 rankings.
rank_sim$samples[rank_sim$samples > 5] <- NA
rankings <- rank_sim$samples
# Fit the true model with the two EM algorithms.
set.seed(54321)
fit_em <- fitMSmix(rankings = rankings, n_clust = 1, n_start = 10)
set.seed(54321)
fit_mcem <- fitMSmix(rankings = rankings, n_clust = 1, n_start = 10, mc_em = TRUE)
# Compare the BIC at the true parameter values and at the MLEs.
bicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
bicMSmix(rho = fit_em$mod$rho, theta = fit_em$mod$theta, weights = fit_em$mod$weights,
       rankings = rank_sim$samples)
bicMSmix(rho = fit_mcem$mod$rho, theta = fit_mcem$mod$theta, weights = fit_mcem$mod$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = rank_sim$rho, theta = rank_sim$theta, weights = rank_sim$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = fit_em$mod$rho, theta = fit_em$mod$theta, weights = fit_em$mod$weights,
       rankings = rank_sim$samples)
aicMSmix(rho = fit_mcem$mod$rho, theta = fit_mcem$mod$theta, weights = fit_mcem$mod$weights,
       rankings = rank_sim$samples)


MSmix documentation built on April 3, 2025, 9:29 p.m.