# SFS_sfs: Similarity-First Search multisweep algorithm In SFS: Similarity-First Search Seriation Algorithm

## Description

Return a ranking of the objects such that similar objects are ordered close to each other. If the matrix is Robinsonian, then the ranking returned is a Robinson ordering.

## Usage

 `1` ```sfs(matrix, sfs_epsilon = 0, dissimilarity = FALSE, Robinsonian = FALSE, num_sweeps = 4) ```

## Arguments

 `matrix` a 3-columns `data frame` with no repeated symmetric entries, representing the list of all similarities (or dissimilarities) (i, j, A_{ij}) between the pairs of objects to reorder. `sfs_epsilon` a numerical value which determines that two entries whose difference is below this threshold are considered to be equal. `dissimilarity` a boolean value equal to `TRUE` if the input data is a dissimilarity. `Robinsonian` a boolean value equal to `TRUE` if one wants to recognize a Robinsonian matrix. `num_sweeps` an integer value that determines how many iterations of SFS shall be performed.

## Details

Given a a 3-columns `data frame` (i, j, A_{ij}) listing all the similarities (or dissimilarities) among the objects, this function builds a `spMat` object in Armadillo and computes a finite number of repeated SFS iterations (each called a sweep). The user may decide the threshold for which two entries are considered equal, meaning that if |A_{ij} - A_{ik}| ≤q `sfs_epsilon`, then objects j and k have the same similarity (or dissimilarity) with respect to object i. By default, this threshold is set to 0.
If not specified, the `matrix` represents the similarity information between objects. If `dissimilarity = TRUE`, then the `matrix` represents the dissimilarity information and the SFS algorithm is modified by sorting the neighborhood of a visited vertex for increasing values (instead of for decreasing values).
The parameter k=`num_sweeps` sets the number of sweeps performed by `SFS()`. This number directly affects the complexity of the function since, as each sweep runs in (k(n+m\log n)) time, `SFS()` runs in (k(n+m \log n)) time. By default, `num_sweeps=4`, as it is known that three sweeps suffice for recognizing Robinonian binary matrices and for general matrices experiments show that four sweeps are enough for finding a good ranking for most data. If `Robinsonian = TRUE`, then the number of sweeps is automatically set to the number of objects n to rank minus one. In this case, `sfs()` also checks if the returned permutation is a Robinson ordering (since it is known that if the order returned after n-1 sweeps is not a Robinson ordering then the data is not Robinsonian). Efficient measures are implemented in order to avoid unnecessary time consuming loops between consecutive SFS iterations. Note that checking if a given permutation is a Robinson ordering is implemented at the moment only when dealing with similarities among the objects.
Finally, the object returned by `SFS()` is a vector of integers, where the entry at position i represents the i-th object in the ranking. If the `matrix` is 0-based, then the returned vector is 0-based. If `matrix` is 1-based, then the returned vector is 1-based.

## Value

Return a (row) vector of integers representing the ranking of the objects, which is 0-based or 1-based accordingly with the input `matrix`.

## Author(s)

Matteo Seminaroti (SFS) and Utz-Uwe Haus (R wrapping)

## References

M. Laurent and M. Seminaroti. Similarity-First Search: a new algorithm with application to Robinsonian matrix recognition. SIAM Journal on Discrete Mathematics (to appear). arXiv:1601.03521. 2016.

M. Seminaroti. Combinatorial Algorithms for the Seriation Problem. PhD thesis. School of Economics and Management, Tilburg University, pages 1–209. 2016.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66``` ``` ## install package in R #install.packages("SFS_0.1.tar.gz") #install.packages("seriation") ## load package in R library(SFS) ## invoke SFS on a R Matrix mtxt <- c("11 2 9 0 5 0 5 5 2 0 5 0 5 6 0 0 2 0 5", "2 11 2 0 9 0 8 5 10 0 5 0 5 2 0 0 10 0 8", "9 2 11 0 5 0 5 5 2 0 5 0 5 10 0 0 2 0 5", "0 0 0 11 0 3 0 0 0 3 0 3 0 0 10 3 0 9 0", "5 9 5 0 11 0 8 7 9 0 7 0 7 5 0 0 9 0 10", "0 0 0 3 0 11 0 0 0 10 0 6 0 0 5 8 0 5 0", "5 8 5 0 8 0 11 7 8 0 7 0 7 5 0 0 8 0 9", "5 5 5 0 7 0 7 11 6 0 10 0 8 7 0 0 6 0 7", "2 10 2 0 9 0 8 6 11 0 6 0 5 2 0 0 10 0 8", "0 0 0 3 0 10 0 0 0 11 0 6 0 0 4 9 0 5 0", "5 5 5 0 7 0 7 10 6 0 11 0 9 7 0 0 6 0 7", "0 0 0 3 0 6 0 0 0 6 0 11 0 0 9 6 0 10 0", "5 5 5 0 7 0 7 8 5 0 9 0 11 7 0 0 5 0 7", "6 2 10 0 5 0 5 7 2 0 7 0 7 11 0 0 2 0 5", "0 0 0 10 0 5 0 0 0 4 0 9 0 0 11 4 0 10 0", "0 0 0 3 0 8 0 0 0 9 0 6 0 0 4 11 0 4 0", "2 10 2 0 9 0 8 6 10 0 6 0 5 2 0 0 11 0 8", "0 0 0 9 0 5 0 0 0 5 0 10 0 0 10 4 0 11 0", "5 8 5 0 10 0 9 7 8 0 7 0 7 5 0 0 8 0 11") M <- as.matrix(read.table(textConnection(mtxt))) A <- SFS::read(M) SFS::sfs(A, Robinsonian = TRUE) ## invoke SFS on a data-frame with 3-column data-frames with 0-based vertices, with ## (row, col, value) triples containing symmetric values data <- c("0 0 1.0", "1 0 1.5", "2 0 1.9", "0 1 2.0", "1 1 2.5", "2 1 2.9", "0 2 3.0", "1 2 3.5", "2 2 3.9") M <- read.table(textConnection(data)) A <- SFS::read(M, identical_val = TRUE) SFS::sfs(A) ## invoke SFS on a \code{dist} from seriation package: library(seriation) data("iris"); x <- as.matrix(iris[-5]); x <- x[sample(1:nrow(x)),]; M <- dist(x) D <- SFS::read(M) SFS::sfs(D, sfs_epsilon = 0.01, dissimilarity = TRUE) ## invoke SFS reading from file a Robinsonian matrix M <- read.table(system.file("extdata", "list_130.txt", package = "SFS")) A <- SFS::read(M, identical_val = TRUE) SFS::sfs(A, Robinsonian = TRUE) ## invoke SFS reading from file containing 3-columns (row, col, value) entries ## of an asymmetric non-Robinsonian similarity matrix with 1-based vertices M <- read.table(system.file("extdata", "list_130.txt", package = "SFS")) A <- SFS::read(M, identical_val = TRUE, symmetric = FALSE) SFS::sfs(A, num_sweeps = 7) ```

SFS documentation built on May 7, 2019, 9:01 a.m.