# lsbclust: Least-squares Bilinear Clustering of Three-way Data

## Description

This function clusters along one way of a three-way array (as specified by `margin`) while decomposing along the other two dimensions. Four types of clusterings are allowed, based on the respective two-way slices of the array: on the overall means, the row margins, the column margins and the interactions between rows and columns. The four-element binary vector `delta` determines which of these clusterings can be fit. All orthogonal models are fitted. The nonorthogonal case `delta = (1, 1, 0, 0)` returns an error. See the reference for further details.
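To make the four components concrete, the following base R sketch splits each two-way slice of a simulated array into an overall mean, row margins, column margins and interactions by ordinary double-centring. It is only an illustration of the quantities involved: the helper `decompose_slice` and the simulated array `X` are hypothetical, and the centring that `lsbclust` actually applies depends on `delta`.

```r
# Simulated three-way array: 4 rows x 3 columns x 5 slices (illustrative data)
set.seed(1)
X <- array(rnorm(4 * 3 * 5), dim = c(4, 3, 5))

# Hypothetical helper: double-centre one two-way slice
decompose_slice <- function(S) {
  overall <- mean(S)                    # overall mean of the slice
  rows <- rowMeans(S) - overall         # row margins (sum to zero)
  cols <- colMeans(S) - overall         # column margins (sum to zero)
  # Interactions: what remains after removing the additive part
  inter <- S - overall - outer(rows, cols, "+")
  list(overall = overall, rows = rows, columns = cols, interactions = inter)
}

# One decomposition per slice along the third way
parts <- lapply(seq_len(dim(X)[3]), function(k) decompose_slice(X[, , k]))

# The four parts add back up to the original slice
p <- parts[[1]]
rebuilt <- p$overall + outer(p$rows, p$columns, "+") + p$interactions
all.equal(rebuilt, X[, , 1])
```

Each of the four pieces varies across slices, which is what allows a separate clustering of the slices on each piece.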

## Usage

```r
lsbclust(data, margin = 3L, delta = c(1L, 1L, 1L, 1L), nclust, ndim = 2L,
         fixed = c("none", "rows", "columns"), nstart = 20L, starts = NULL,
         nstart.kmeans = 500L, alpha = 0.5, parallel = FALSE, maxit = 100L,
         verbose = 1, method = "diag", type = NULL, sep.nclust = TRUE, ...)
```

## Arguments

| Argument | Description |
| --- | --- |
| `data` | A three-way array representing the data. |
| `margin` | An integer giving the single subscript of `data` over which the clustering will be applied. |
| `delta` | A four-element binary vector (logical or numeric) indicating which sum-to-zero constraints must be enforced. |
| `nclust` | A vector of length four giving the number of clusters for the overall mean, the row margins, the column margins and the interactions (in that order). Alternatively, a vector of length one, in which case all components will have the same number of clusters. |
| `ndim` | The required rank for the approximation of the interactions (a scalar). |
| `fixed` | One of `"none"`, `"rows"` or `"columns"`, indicating whether to fix neither set of coordinates, or to fix the row or the column coordinates across clusters, respectively. If a vector is supplied, only the first element will be used (passed to `int.lsbclust`). |
| `nstart` | The number of random starts to use for the interaction clustering. |
| `starts` | A list containing starting configurations for the cluster membership vector. If not supplied, random initializations will be generated (passed to `int.lsbclust`). |
| `nstart.kmeans` | The number of random starts to use in `kmeans`. |
| `alpha` | Numeric value in [0, 1] which determines how the singular values are distributed between rows and columns (passed to `int.lsbclust`). |
| `parallel` | Logical indicating whether to parallelize over the different starts (passed to `int.lsbclust`). |
| `maxit` | The maximum number of iterations allowed in the interaction clustering. |
| `verbose` | Integer controlling the amount of information printed: 0 = no information, 1 = information on random starts and progress, and 2 = information is printed after each iteration of the interaction clustering. |
| `method` | The method for calculating cluster agreement across random starts, passed on to `cl_agreement` (passed to `int.lsbclust`). |
| `type` | One of `"rows"`, `"columns"` or `"overall"` (or a unique abbreviation of one of these), indicating whether clustering should be done on the row margins, the column margins or the overall means of the two-way slices, respectively. If more than one option is supplied, the algorithm is run for all (unique) options supplied (passed to `orc.lsbclust`). This argument is optional. |
| `sep.nclust` | Logical indicating how `nclust` should be used across different values of `type`. If `sep.nclust` is `TRUE`, `nclust` is recycled so that each `type` can have a different number of clusters. If `sep.nclust` is `FALSE`, the same vector `nclust` is used for every `type`. |
| `...` | Additional arguments passed to `kmeans`. |
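The role of `alpha` and `ndim` can be illustrated with a plain singular value decomposition in base R. The sketch below assumes the common biplot convention in which row coordinates receive the singular values raised to the power `alpha` and column coordinates receive the power `1 - alpha`; that this is exactly what `int.lsbclust` does internally is an assumption, and the matrix `M` is simulated.

```r
# Simulated 6 x 4 matrix standing in for a matrix of interactions (illustrative)
set.seed(42)
M <- matrix(rnorm(24), nrow = 6, ncol = 4)

ndim <- 2L    # rank of the approximation
alpha <- 0.5  # share of the singular values given to the rows

s <- svd(M, nu = ndim, nv = ndim)
d <- s$d[seq_len(ndim)]

# Assumed convention: rows get d^alpha, columns get d^(1 - alpha)
row_coords <- s$u %*% diag(d^alpha, ndim)
col_coords <- s$v %*% diag(d^(1 - alpha), ndim)

# The rank-ndim approximation does not depend on alpha
approx_M <- row_coords %*% t(col_coords)
all.equal(approx_M, s$u %*% diag(d, ndim) %*% t(s$v))
```

Under this convention, changing `alpha` only rescales the two sets of coordinates relative to each other; their product, and hence the fitted low-rank approximation, stays the same.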

## Value

Returns an object of S3 class `lsbclust` which has the following components:

| Component | Description |
| --- | --- |
| `overall` | Object of class `ovl.kmeans` for the overall means clustering |
| `rows` | Object of class `row.kmeans` for the row means clustering |
| `columns` | Object of class `col.kmeans` for the column means clustering |
| `interactions` | Object of class `int.lsbclust` for the interaction clustering |
| `call` | The function call used to create the object |
| `delta` | The value of `delta` in the fit |
| `df` | Breakdown of the degrees of freedom across the different subproblems |
| `loss` | Breakdown of the loss across subproblems |
| `time` | Time taken in seconds to calculate the solution |
| `cluster` | Matrix of cluster membership per observation for all cluster types |
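As a rough sketch of how a fit might be run and inspected, the example below calls `lsbclust` on a simulated array and prints some of the components listed above. The data, the cluster numbers and the reduced `nstart` and `nstart.kmeans` values are illustrative assumptions chosen to keep the run short, not recommended settings, and the call may need adjusting for real data. The block is skipped when the package is not installed.

```r
# Simulated data: 8 rows x 5 columns x 40 slices; the third way is clustered
set.seed(123)
X <- array(rnorm(8 * 5 * 40), dim = c(8, 5, 40))

if (requireNamespace("lsbclust", quietly = TRUE)) {
  library(lsbclust)

  # Cluster numbers in the order: overall, rows, columns, interactions
  fit <- lsbclust(data = X, margin = 3L, delta = c(1L, 1L, 1L, 1L),
                  nclust = c(2L, 3L, 3L, 2L), ndim = 2L,
                  nstart = 2L, nstart.kmeans = 10L, verbose = 0)

  print(head(fit$cluster))  # cluster memberships per observation
  print(fit$loss)           # loss broken down by subproblem
  print(fit$df)             # degrees of freedom by subproblem
  print(fit$time)           # time taken in seconds
}
```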

## References

Schoonees, P.C., Groenen, P.J.F., Van de Velden, M. Least-squares Bilinear Clustering of Three-way Data. Econometric Institute Report, EI2014-23.

## See Also

`int.lsbclust`, `orc.lsbclust`