Mallows: Fits a Mallows mixture model to ranking data


Description

Fits a Mallows mixture model to total rankings using the EM algorithm, in order to cluster permutations.
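For orientation, the mixture density for the original (non-kernel) variants can be sketched as below, in the form given by Murphy and Martin (2003). The symbols are chosen to match the Value section: p_g is the proportion of cluster g, R_g its center, and lambda_g its dispersion. The distance d is assumed here to be the Kendall distance, and psi is the normalizing constant; the kernel variants work in the embedded Euclidean space instead, so this sketch may not describe them.

```latex
P(\sigma \mid p, R, \lambda)
  = \sum_{g=1}^{G} p_g \,
    \frac{\exp\bigl(-\lambda_g \, d(\sigma, R_g)\bigr)}{\psi(\lambda_g)},
\qquad
\psi(\lambda_g) = \sum_{\pi \in S_n} \exp\bigl(-\lambda_g \, d(\pi, R_g)\bigr)
```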

Usage

Mallows(datas, G, weights = NULL, iter = 100, iterin = iter,
  tol = 0.001, logsumexp.trick = TRUE, seed = 47631439,
  key = c("copelandMallows", "bruteMallows", "bordaMallows", "kernelMallows",
  "kernelMallows_Exh", "kernelGaussian", "copelandMallows_Eqlam",
  "bruteMallows_Eqlam", "bordaMallows_Eqlam", "kernelMallows_Eqlam",
  "kernelMallows_Exh_Eqlam", "kernelGaussian_Eqlam"), exhkey = "_Exh",
  eqlamkey = "_Eqlam")

Arguments

datas

Matrix of dimension N x n with sequences in rows.

G

Number of modes, 2 or greater.

weights

Numeric vector of length N giving the frequency of each observed permutation. By default, each observation is counted once. It must not contain 0 and must have the same length as nrow(datas).

iter

Maximum number of iterations for the EM algorithm.

iterin

Maximum number of iterations for the alternating optimization between centers and lambda. Effective only when performing kernel Mallows with exhaustive optimization.

tol

Stopping precision.

logsumexp.trick

Logical. Whether or not to use the log-sum-exp trick to compute the log-likelihood.
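As background on why this option exists, here is a minimal sketch of the log-sum-exp trick in base R. The helper name log_sum_exp is hypothetical and is not claimed to be the function used inside Mallows; it only illustrates the general technique of subtracting the maximum before exponentiating.

```r
# Hypothetical helper, for illustration only: computes log(sum(exp(v)))
# without underflow by factoring out the largest term.
log_sum_exp <- function(v) {
  m <- max(v)
  if (!is.finite(m)) return(m)  # all -Inf (or an Inf present): nothing to stabilize
  m + log(sum(exp(v - m)))
}

v <- c(-1000, -1001, -1002)     # log-densities far below the double-precision range

naive  <- log(sum(exp(v)))      # exp() underflows to 0, so this is -Inf
stable <- log_sum_exp(v)        # finite value close to -1000
```

With very small component densities, the naive form loses the log-likelihood entirely, while the stabilized form keeps it finite.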

seed

Seed index for reproducible results when optimization is performed. Set to NULL to disable the action.

key

A character string defining the type of Mallows mixture model to fit:

  • copelandMallows denotes the original Mallows mixture model with cluster centers found by Copeland's method

  • bruteMallows denotes the original Mallows mixture model with cluster centers found by brute-force search for the optimal Kemeny consensus (not applicable for large n)

  • bordaMallows denotes the original Mallows mixture model with cluster centers given by the Borda count

  • kernelMallows denotes the kernel version of the Mallows mixture model with cluster centers given by the barycenter in the Euclidean space induced by the Kendall embedding

  • kernelGaussian denotes a Gaussian mixture model in the Euclidean space induced by the Kendall embedding
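To make the kernel variants more concrete, the following sketch shows one way to compute a Kendall embedding and a weighted barycenter in base R. The helper kendall_embed is hypothetical and not part of the documented API, and the 1/sqrt(choose(n, 2)) scaling is an assumption following the convention in Jiao and Vert (2018); the package's internal implementation may differ.

```r
# Hypothetical helper (not part of the documented API): Kendall embedding of
# one total ranking x into a numeric vector of length choose(n, 2).
kendall_embed <- function(x) {
  n <- length(x)
  pairs <- utils::combn(n, 2)               # 2 x choose(n, 2) matrix of index pairs i < j
  s <- sign(x[pairs[1, ]] - x[pairs[2, ]])  # +1 or -1 per pair (a total ranking has no ties)
  s / sqrt(choose(n, 2))                    # assumed scaling: unit Euclidean norm
}

# Three rankings of n = 4 items, one per row
rankings <- rbind(c(1, 2, 3, 4),
                  c(2, 1, 3, 4),
                  c(4, 3, 2, 1))
w <- c(2, 1, 1)                             # illustrative observation weights

emb <- t(apply(rankings, 1, kendall_embed)) # 3 x 6 matrix, one embedded ranking per row
center <- colSums(emb * w) / sum(w)         # weighted barycenter, length choose(4, 2) = 6
```

Under this scaling, the inner product of two embedded rankings equals their Kendall tau correlation, and the barycenter is in general not itself the embedding of a permutation.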

exhkey

DO NOT CHANGE. A character string. If it greps successfully in "key", an alternating optimization between centers and lambda is performed. Effective only when performing "kernelMallows" with exhaustive optimization.

eqlamkey

DO NOT CHANGE. A character string. If it greps successfully in "key", the dispersion parameters (or lambda) are constrained to be equal for all clusters; otherwise no constraints on lambda.
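The suffix matching described for exhkey and eqlamkey can be pictured with base R's grepl. This is only a sketch of the idea: the variable names are hypothetical, and whether Mallows uses fixed or regular-expression matching internally is an assumption.

```r
key      <- "kernelMallows_Exh_Eqlam"  # one of the documented key values
exhkey   <- "_Exh"                     # documented defaults
eqlamkey <- "_Eqlam"

# TRUE when the suffix occurs in key (fixed = TRUE treats it as a literal string)
use_exhaustive   <- grepl(exhkey, key, fixed = TRUE)
use_equal_lambda <- grepl(eqlamkey, key, fixed = TRUE)

# A key without either suffix switches both options off
plain_exhaustive <- grepl(exhkey, "bordaMallows", fixed = TRUE)
```

This is why the two arguments should be left at their defaults: changing them would stop the suffixes in "key" from being recognized.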

Value

List.

key

Character string indicating the type of Mallows mixture model performed

R

List of length "G" of cluster centers, each entry being a permutation of length "n" if original Mallows mixture model is performed, or a numeric vector of length choose(n,2) if kernel version is performed

p

Numeric vector of length "G" giving the mixing proportion (prior probability) of each cluster

lambda

Numeric vector of length "G" representing the dispersion parameters of each cluster

datas

A copy of "datas" on which the Mallows mixture model is fitted, combined with "weights", the fuzzy assignment membership probabilities "z", and the distances to the centers in "R"

min.like

Numeric vector of length "iter" representing fitted likelihood values at each iteration

Author(s)

Yunlong Jiao

References

Thomas Brendan Murphy, Donal Martin. "Mixtures of distance-based models for ranking data." Computational Statistics & Data Analysis, vol. 41, no. 3, pp. 645-655, 2003. DOI:10.1016/S0167-9473(02)00165-2

Yunlong Jiao, Jean-Philippe Vert. "The Kendall and Mallows Kernels for Permutations." IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), vol. 40, no. 7, pp. 1755-1769, 2018. DOI:10.1109/TPAMI.2017.2719680

See Also

MallowsCV

Examples

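# All 120 permutations of 1:5, one per row
datas <- do.call('rbind', combinat::permn(1:5))
G <- 3  # number of clusters
# Random positive weights, one per row of datas
weights <- runif(nrow(datas))

# Fit Mallows mixture model
model <- Mallows(datas, G, weights, key = 'bordaMallows')
str(model)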

YunlongJiao/kernrank documentation built on May 10, 2019, 1:13 a.m.