clustord-package | R Documentation |
Biclustering, row clustering and column clustering using the proportional odds model (POM), ordered stereotype model (OSM) or binary model for ordinal categorical data.
The clustord package provides six functions: clustord(), rerun(), mat2df(),
calc.SE.rowcluster(), calc.SE.bicluster(), and calc.cluster.comparisons().
The main function is clustord(), which fits a clustering model to the data.
The model is fitted using likelihood-based clustering via the EM algorithm.
The package assumes that you start with a data matrix of responses, though
you will need to convert that data matrix into a long-form data frame before
running clustord(). Every element in the original data matrix becomes one
row in the data frame, and the row and column indices from the data matrix
become the columns ROW and COL in the data frame. You can perform
clustering on rows or columns of the data matrix, or biclustering on both
rows and columns simultaneously. You can include any number of covariates
for rows and covariates for columns. The ordinal models used in the package
are the Ordered Stereotype Model (OSM), the Proportional Odds Model (POM)
and a dedicated Binary Model for binary data.
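The matrix-to-long-form layout described above can be sketched in base R. This is only an illustration of the layout, not the package's own conversion code: the toy matrix is made up, and the response column name Y and its coding as an ordered factor are assumptions that this page does not state.

```r
# Toy 3 x 4 matrix of ordinal responses (categories 1 to 3); values are made up.
mat <- matrix(c(1, 2, 3, 2,
                3, 3, 1, 2,
                2, 1, 1, 3),
              nrow = 3, byrow = TRUE)

# One data-frame row per matrix element. as.vector() reads the matrix
# column by column, and row()/col() give the matching indices in that order.
# The column name Y is an assumption made for this sketch.
long.df <- data.frame(
  Y   = factor(as.vector(mat), levels = 1:3, ordered = TRUE),
  ROW = as.vector(row(mat)),
  COL = as.vector(col(mat))
)

# Every cell of the matrix appears exactly once in the long-form data frame.
head(long.df)
```

In practice mat2df() is the intended route; the sketch only shows what "one row per element, indexed by ROW and COL" means.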
The rerun() function is useful for continuing clustering runs that did
not converge on the first attempt, and for running new clustering runs using
the estimated parameters of a previous run as a starting point. The main
input for this function is a clustord object output by clustord(), and
internally the rerun() function runs clustord() after setting up all the
input parameters based on the original model fitting run.
mat2df() is a utility function provided to convert a data matrix of
responses into the long-form data frame format required by clustord(), and
can also attach any covariates to that long-form data frame if needed.
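As a rough picture of what attaching a covariate to long-form data involves, the base R sketch below copies a row covariate onto every cell belonging to that row. The data, the covariate name age and the column name Y are all hypothetical, and this is not a description of how mat2df() is implemented.

```r
# Hypothetical long-form data for a 3 x 2 matrix (one row per cell).
long.df <- data.frame(
  Y   = c(1, 2, 3, 2, 1, 3),
  ROW = rep(1:3, times = 2),
  COL = rep(1:2, each = 3)
)

# Hypothetical row covariate: one value per row of the original matrix.
row.cov <- data.frame(age = c(25, 40, 31))

# Index the covariate by ROW so that every cell inherits its row's value.
long.df$age <- row.cov$age[long.df$ROW]

long.df
```

A column covariate would be attached the same way, indexing by COL instead of ROW.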
calc.SE.rowcluster() and calc.SE.bicluster() are functions to run after
running clustord(), to calculate the standard errors on the parameters
fitted using clustord().
calc.cluster.comparisons() can be used to compare the assigned cluster
memberships of the rows or columns of the data matrix from two different
clustering fits, in a way that avoids the label-switching problem.
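To see why label switching matters, consider two fits that group the items almost identically but number the clusters differently. Comparing the labels directly would suggest they disagree nearly everywhere. The base R sketch below compares co-membership of item pairs instead, which gives the Rand index. It is a generic illustration with made-up assignments, and the measures that calc.cluster.comparisons() actually reports are not assumed here.

```r
# Two hypothetical clusterings of the same six rows. fit2 uses the same
# grouping as fit1 for all but the last row, but with the labels permuted.
fit1 <- c(1, 1, 2, 2, 3, 3)
fit2 <- c(3, 3, 1, 1, 2, 1)

# Co-membership matrix: TRUE where two items share a cluster.
# It depends only on the grouping, not on the label values.
comember <- function(z) outer(z, z, "==")

# Proportion of item pairs on which the two fits agree (the Rand index).
pairs <- upper.tri(comember(fit1))
rand <- mean(comember(fit1)[pairs] == comember(fit2)[pairs])
rand

# A cross-tabulation shows which labels correspond between the two fits.
table(fit1, fit2)
```

Because only pairwise co-membership is used, relabelling the clusters of either fit leaves the result unchanged.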
Maintainer: Louise McMillan louise.mcmillan@vuw.ac.nz (ORCID) [copyright holder]
Authors:
Daniel Fernández Martínez daniel.fernandez.martinez@upc.edu (ORCID)
Ying Cui ying.cui@sms.vuw.ac.nz
Eleni Matechou e.matechou@kent.ac.uk (ORCID)
Other contributors:
W. N. Venables (clustord osm regression functions and S3 methods derived by Louise McMillan from MASS package polr function by Venables and Ripley) [contributor, copyright holder]
B. D. Ripley (clustord osm regression functions and S3 methods derived by Louise McMillan from MASS package polr function by Venables and Ripley) [contributor, copyright holder]