In mhahsler/seriation: Infrastructure for Ordering Objects Using Seriation

pkg <- 'seriation'

source("https://raw.githubusercontent.com/mhahsler/pkg_helpers/main/pkg_helpers.R")
pkg_title(pkg, anaconda = "r-seriation", stackoverflow = "seriation+r")

Introduction

Seriation arranges a set of objects into a linear order given available data with the goal of revealing structural information. This package provides the infrastructure for ordering objects with an implementation of many seriation/ordination techniques to reorder data matrices, dissimilarity matrices, correlation matrices, and dendrograms (see below for a complete list). The package provides several visualizations (grid and ggplot2) to reveal structural information, including permuted image plots, reordered heatmaps, Bertin plots, clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT).

Here are some quick guides on applications of seriation:

Implemented seriation methods and criteria:

pkg_usage(pkg)
pkg_citation(pkg, 2L)

Available seriation methods to reorder dissimilarity data

Seriation methods for dissimilarity data reorder the set of objects in the data. The methods fall into several groups based on the criteria they try to optimize, constraints (like dendrograms), and the algorithmic approach.

Dendrogram leaf order

These methods create a dendrogram using hierarchical clustering and then derive the seriation order from the leaf order in the dendrogram. Leaf reordering may be applied.

DendSer - Dendrogram seriation heuristic to optimize various criteria
GW - Hierarchical clustering reordered by the Gruvaeus and Wainer heuristic
HC - Hierarchical clustering (single link, avg. link, complete link)
OLO - Hierarchical clustering with optimal leaf ordering

Dimensionality reduction

Find a seriation order by reducing the dimensionality to 1 dimension. This is typically done by minimizing a stress measure or the reconstruction error.

MDS - classical metric multidimensional scaling
MDS_angle - order by the angular order in the 2D MDS projection space split by the larges gap
isoMDS - 1D Krusakl's non-metric multidimensional scaling
isomap - 1D isometric feature mapping ordination
monoMDS - order along 1D global and local non-metric multidimensional scaling using monotone regression (NMDS)
metaMDS - 1D non-metric multidimensional scaling (NMDS) with stable solution from random starts
Sammon - Order along the 1D Sammon's non-linear mapping
smacof - 1D MDS using majorization (ratio MDS, interval MDS, ordinal MDS)
TSNE - Order along the 1D t-distributed stochastic neighbor embedding (t-SNE)
UMAP - Order along the 1D embedding produced by uniform manifold approximation and projection

Optimization

These methods try to optimize a seriation criterion directly, typically using a heuristic approach.

ARSA - optimize the linear seriation criterion using simulated annealing
Branch-and-bound to minimize the unweighted/weighted column gradient
GA - Genetic algorithm with warm start to optimize any seriation criteria
GSA - General simulated annealing to optimize any seriation criteria
SGD - stochastic gradient descent to find a local optimum given an initial order and a seriation criterion.
QAP - Quadratic assignment problem heuristic (optimizes 2-SUM, linear seriation, inertia, banded anti-Robinson form)
Spectral seriation to optimize the 2-SUM criterion (unnormalized, normalized)
TSP - Traveling salesperson solver to minimize the Hamiltonian path length

Other Methods

Identity permutation
OPTICS - Order of ordering points to identify the clustering structure
R2E - Rank-two ellipse seriation
Random permutation
Reverse order
SPIN - Sorting points into neighborhoods (neighborhood algorithm, side-to-site algorithm)
VAT - Order of the visual assessment of clustering tendency

A detailed comparison of the most popular methods is available in the paper An experimental comparison of seriation methods for one-mode two-way data. (read the preprint).

Available seriation methods to reorder data matrices, count tables, and data.frames

For matrices, rows and columns are reordered.

Seriating rows and columns simultaneously

Row and column order influence each other.

BEA - Bond Energy Algorithm to maximize the measure of effectiveness (ME)
BEA_TSP - TSP to optimize the measure of effectiveness
BK_unconstrained - Algorithm by Brower and Kyle (1988) to arrange binary matrices.
CA - calculates a correspondence analysis of a matrix of frequencies (count table) and reorders according to the scores on a correspondence analysis dimension

Seriating rows and columns separately using dissimilarities

Heatmap - reorders rows and columns independently by calculating row/column distances and then applying a seriation method for dissimilarities (see above)

Seriate rows in a data matrix

These methods need access to the data matrix instead of dissimilarities to reorder objects (rows). The same approach can be applied to columns.

PCA_angle - order by the angular order in the 2D PCA projection space split by the larges gap
LLE reorder along a 1D locally linear embedding
Means - reorders using row means
PCA - orders along the first principal component
TSNE - Order along the 1D t-distributed stochastic neighbor embedding (t-SNE)
UMAP - Order along the 1D embedding produced by uniform manifold approximation and projection

Other methods

AOE - order by the angular order of the first two eigenvectors for correlation matrices.
Identity permutation
Random permutation
Reverse order

pkg_install(pkg)

Usage

The used example dataset contains the joint probability of disagreement between Supreme Court Judges from 1995 to 2002. The goal is to reveal structural information in this data. We load the library, read the data, convert the data to a distance matrix, and then use the default seriation method to reorder the objects.

library(seriation)
data("SupremeCourt")

d <- as.dist(SupremeCourt)
d

order <- seriate(d)
order

Here is the resulting permutation vector.

get_order(order)

Next, we visualize the original and permuted distance matrix.

pimage(d, main = "Judges (original alphabetical order)")
pimage(d, order, main = "Judges (reordered by seriation)")

Darker squares around the main diagonal indicate groups of similar objects. After seriation, two groups are visible.

We can compare the available seriation criteria. Seriation improves all measures. Note that some measures are merit measures while others represent cost. See the manual page for details.

rbind(
 alphabetical = criterion(d),
 seriated = criterion(d, order)
)

Some seriation methods also return a linear configuration where more similar objects are located closer to each other.

get_config(order)

plot_config(order)

We can see a clear divide between the two groups in the configuration.

References

Michael Hahsler, Kurt Hornik and Christian Buchta, Getting Things in Order: An Introduction to the R Package seriation, Journal of Statistical Software, 25(3), 2008. DOI: 10.18637/jss.v025.i03
Michael Hahsler. An experimental comparison of seriation methods for one-mode two-way data. European Journal of Operational Research, 257:133-143, 2017. DOI: 10.1016/j.ejor.2016.08.066 (read the preprint)
Hahsler, M. and Hornik, K. (2011): Dissimilarity plots: A visual exploration tool for partitional clustering. Journal of Computational and Graphical Statistics, 10(2):335–354. doi:10.1198/jcgs.2010.09139 (read the preprint; code examples)
Reference manual for package seriation.

mhahsler/seriation documentation built on June 15, 2025, 9:44 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

mhahsler/seriation
Infrastructure for Ordering Objects Using Seriation

In mhahsler/seriation: Infrastructure for Ordering Objects Using Seriation

Introduction

Available seriation methods to reorder dissimilarity data

Dendrogram leaf order

Dimensionality reduction

Optimization

Other Methods

Available seriation methods to reorder data matrices, count tables, and data.frames

Seriating rows and columns simultaneously

Seriating rows and columns separately using dissimilarities

Seriate rows in a data matrix

Other methods

Usage

References

R Package Documentation

Browse R Packages

We want your feedback!

mhahsler/seriation Infrastructure for Ordering Objects Using Seriation

In mhahsler/seriation: Infrastructure for Ordering Objects Using Seriation

Introduction

Available seriation methods to reorder dissimilarity data

Dendrogram leaf order

Dimensionality reduction

Optimization

Other Methods

Available seriation methods to reorder data matrices, count tables, and data.frames

Seriating rows and columns simultaneously

Seriating rows and columns separately using dissimilarities

Seriate rows in a data matrix

Other methods

Usage

References

R Package Documentation

Browse R Packages

We want your feedback!

mhahsler/seriation
Infrastructure for Ordering Objects Using Seriation