seriate_best: Best Seriation

Description

Often the best seriation method for a particular dataset is not known, and heuristics may produce unstable results. seriate_best() and seriate_rep() automatically try different seriation methods or rerun randomized methods several times to find the best order given a criterion measure. seriate_improve() uses a local improvement strategy to improve an existing solution.

Usage

seriate_best(
  x,
  methods = NULL,
  control = NULL,
  criterion = NULL,
  rep = 10L,
  parallel = TRUE,
  verbose = TRUE,
  ...
)

seriate_rep(
  x,
  method = NULL,
  control = NULL,
  criterion = NULL,
  rep = 10L,
  parallel = TRUE,
  verbose = TRUE,
  ...
)

seriate_improve(
  x,
  order,
  criterion = NULL,
  control = NULL,
  verbose = TRUE,
  ...
)

Arguments

x

the data.

methods

a character vector with the names of the seriation methods to try.

control

a list of control options passed on to seriate(). For seriate_best() control needs to be a named list of control lists with the names matching the seriation methods.

criterion

a character string with the criterion to optimize. If no criterion is specified, seriate_rep() uses the criterion specified for the method in the registry.

rep

number of times to repeat the randomized seriation algorithm.

parallel

logical; perform replications in parallel. Uses foreach::foreach() if a %dopar% backend (e.g., from the doParallel package) is registered.

verbose

logical; show progress and results for the different methods.

...

further arguments are passed on to seriate().

method

a character string with the name of the seriation method (default: varies by data type).

order

a ser_permutation object for x or the name of a seriation method to start with.

Details

seriate_rep() reruns a randomized seriation method several times to find the best solution given the criterion specified for the method in the registry. A different criterion can also be specified. Non-stochastic methods are automatically run only once.
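As a rough illustration of what the repetition amounts to, the sketch below reruns one randomized method by hand and keeps the order with the lowest criterion value. It only uses seriate() and criterion(); the actual implementation of seriate_rep() may differ (for example in how it handles parallel execution and the returned attributes), and the seed and the number of repetitions are arbitrary choices made for this example.

```r
library(seriation)

data(SupremeCourt)
d <- as.dist(SupremeCourt)

set.seed(1234)   # arbitrary seed, only to make the sketch reproducible
n_rep <- 5L      # arbitrary number of repetitions

# rerun the randomized method and collect all resulting orders
orders <- replicate(n_rep, seriate(d, method = "QAP_LS"), simplify = FALSE)

# score every order; "LS" is a loss, so smaller values are better
scores <- vapply(orders, function(o) criterion(d, o, method = "LS"), numeric(1))

# keep the order with the smallest loss
best <- orders[[which.min(scores)]]
scores
get_order(best)
```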

seriate_best() runs a set of methods and returns the best result given a criterion. Stochastic methods are automatically randomly restarted several times.
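The idea of comparing several methods under one criterion can be sketched by hand as follows. The set of methods below is a hypothetical selection made for this example and is not necessarily the default set tried by seriate_best(); the sketch also skips the random restarts for stochastic methods.

```r
library(seriation)

data(SupremeCourt)
d <- as.dist(SupremeCourt)

# hypothetical selection of deterministic methods for dist objects
methods <- c("OLO", "MDS", "Spectral")

# seriate once per method
orders <- lapply(methods, function(m) seriate(d, method = m))
names(orders) <- methods

# evaluate every order with the same criterion ("AR_events" is a loss)
scores <- vapply(orders, function(o) criterion(d, o, method = "AR_events"),
                 numeric(1))
names(scores) <- methods

# pick the method with the smallest loss
best_method <- names(which.min(scores))
scores
best_method
```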

seriate_improve() improves a seriation order with simulated annealing, optimizing a specified criterion measure. It uses seriate() with method "GSA", a reduced probability to accept bad moves, and a lower minimum temperature. Control parameters for this method are accepted.
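A minimal usage sketch, assuming a small dist object and an MDS order as the starting point (both are choices made for this example): the criterion value is computed before and after the improvement step so the two can be compared. Because simulated annealing is randomized, the values vary between runs, so no expected output is shown.

```r
library(seriation)

data(SupremeCourt)
d <- as.dist(SupremeCourt)

# starting order from a fast deterministic method
o_start <- seriate(d, method = "MDS")

# refine the order with respect to the RGAR criterion
o_improved <- seriate_improve(d, o_start, criterion = "RGAR", verbose = FALSE)

# compare the criterion before and after (RGAR is a loss)
crit <- c(
  start    = unname(criterion(d, o_start, method = "RGAR")),
  improved = unname(criterion(d, o_improved, method = "RGAR"))
)
crit
```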

Criterion

If no criterion is specified, then the criterion specified for the method in the registry (see get_seriation_method()) is used. For methods with no criterion in the registry (marked as "other"), a default criterion is used. The defaults are:

  • dist: "AR_deviations" - the study in Hahsler (2017) has shown that this criterion has high similarity with most other criteria.

  • matrix: "Moore_stress"
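The two default criteria named above can also be evaluated directly with criterion(), which may help when comparing orders produced outside of seriate_best(). The sketch below is only an illustration: the datasets and the methods "MDS" and "PCA" are arbitrary choices for this example.

```r
library(seriation)

# dist input: score an order with "AR_deviations"
data(SupremeCourt)
d <- as.dist(SupremeCourt)
o_d <- seriate(d, method = "MDS")
crit_d <- criterion(d, o_d, method = "AR_deviations")

# matrix input: score a row/column order with "Moore_stress"
m <- as.matrix(iris[, -5])
o_m <- seriate(m, method = "PCA")
crit_m <- criterion(m, o_m, method = "Moore_stress")

crit_d
crit_m
```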

Parallel Execution

Support for parallel execution is provided using the foreach package. To use parallel execution, a suitable backend needs to be registered (see the Examples section for using the doParallel backend).
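A minimal sketch of registering and releasing a backend, assuming the doParallel package is installed and two cores are available (the core count and the number of repetitions are arbitrary choices for this example):

```r
library(seriation)
library(doParallel)

# register a %dopar% backend with two workers
registerDoParallel(cores = 2L)
n_workers <- foreach::getDoParWorkers()   # number of registered workers

data(SupremeCourt)
d <- as.dist(SupremeCourt)

# the replications can now be distributed over the registered workers
o <- seriate_rep(d, "QAP_LS", rep = 4L, parallel = TRUE, verbose = FALSE)
o

# release the implicitly created workers when done
stopImplicitCluster()
```

If no backend is registered, foreach typically falls back to sequential execution and issues a warning.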

Value

Returns an object of class ser_permutation.

Author(s)

Michael Hahsler

References

Hahsler, M. (2017): An experimental comparison of seriation methods for one-mode two-way data. European Journal of Operational Research, 257, 133–143. doi:10.1016/j.ejor.2016.08.066

See Also

Other seriation: register_DendSer(), register_GA(), register_optics(), register_smacof(), register_tsne(), register_umap(), registry_for_seriaiton_methods, seriate()

Examples

data(SupremeCourt)
d_supreme <- as.dist(SupremeCourt)

# find the best seriation order (tries several fast methods by default)
o <- seriate_best(d_supreme, criterion = "AR_events")
o
pimage(d_supreme, o)

# run a randomized algorithm several times. It automatically chooses the
# LS criterion. Repetition information is returned as attributes.
o <- seriate_rep(d_supreme, "QAP_LS", rep = 5)

attr(o, "criterion")
hist(attr(o, "criterion_distribution"))
pimage(d_supreme, o)

## Not run: 
# Using parallel execution on a larger dataset
data(iris)
m_iris <- as.matrix(iris[sample(seq(nrow(iris))),-5])
d_iris <- dist(m_iris)

library(doParallel)
registerDoParallel(cores = detectCores() - 1L)

# seriate rows of the iris data set
o <- seriate_best(d_iris, criterion = "LS")
o

pimage(d_iris, o)

# improve the order to minimize RGAR instead of LS
o_improved <- seriate_improve(d_iris, o, criterion = "RGAR")
pimage(d_iris, o_improved)

# available control parameters for seriate_improve()
get_seriation_method(name = "GSA")

## End(Not run)

seriation documentation built on Sept. 11, 2024, 7:33 p.m.