SerialClustering | R Documentation |
Performs consensus (weighted) clustering. The underlying algorithm (e.g.
hierarchical clustering) is run with different numbers of clusters nc.
In consensus weighted clustering, weighted distances are calculated using the
cosa2 algorithm with different penalty parameters Lambda. The
hyper-parameters are calibrated by maximisation of the consensus score.
This function uses a serial implementation and requires the grids of
hyper-parameters as input (for internal use only).
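The resampling idea described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the function name consensus_sketch and its internals are hypothetical, it handles a single value of nc, it always scales the data, and it leaves out the weighted (cosa2) distances and the consensus score. It only shows how co-membership proportions could be accumulated over K subsamples of relative size tau.

```r
# Hypothetical helper (not part of the package): proportion of subsamples
# in which each pair of observations falls in the same cluster.
consensus_sketch <- function(xdata, nc, K = 100, tau = 0.5, seed = 1,
                             linkage = "complete") {
  set.seed(seed)
  n <- nrow(xdata)
  xdata <- scale(xdata)          # put all variables on a comparable scale
  together <- matrix(0, n, n)    # times a pair was assigned to the same cluster
  sampled <- matrix(0, n, n)     # times a pair was drawn in the same subsample
  for (k in seq_len(K)) {
    ids <- sort(sample(n, size = floor(tau * n)))
    tree <- hclust(dist(xdata[ids, , drop = FALSE]), method = linkage)
    membership <- cutree(tree, k = nc)
    same <- outer(membership, membership, "==")
    together[ids, ids] <- together[ids, ids] + same
    sampled[ids, ids] <- sampled[ids, ids] + 1
  }
  together / sampled             # NaN for pairs never drawn together
}

# Simulated data with two well-separated groups of 20 observations each
set.seed(1)
x <- rbind(
  matrix(rnorm(40, mean = 0), ncol = 2),
  matrix(rnorm(40, mean = 5), ncol = 2)
)
coprop <- consensus_sketch(x, nc = 2, K = 50)
dim(coprop)
```

Pairs of observations that are repeatedly clustered together across subsamples get co-membership proportions close to 1, which is the kind of information a consensus matrix summarises.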
SerialClustering(
xdata,
nc,
eps,
Lambda,
K = 100,
tau = 0.5,
seed = 1,
n_cat = 3,
implementation = HierarchicalClustering,
scale = TRUE,
linkage = "complete",
row = TRUE,
output_data = FALSE,
verbose = TRUE,
...
)
xdata |
data matrix with observations as rows and variables as columns. |
nc |
matrix of parameters controlling the number of clusters in the
underlying algorithm specified in |
eps |
radius in density-based clustering, see
|
Lambda |
vector of penalty parameters for weighted distance calculation.
Only used for distance-based clustering, including for example
|
K |
number of resampling iterations. |
tau |
subsample size. |
seed |
value of the seed to initialise the random number generator and
ensure reproducibility of the results (see |
n_cat |
computation options for the stability score. Default is
|
implementation |
function to use for clustering. Possible functions
include |
scale |
logical indicating if the data should be scaled to ensure that all variables contribute equally to the clustering of the observations. |
linkage |
character string indicating the type of linkage used in
hierarchical clustering to define the stable clusters. Possible values
include |
row |
logical indicating if rows (if |
output_data |
logical indicating if the input datasets |
verbose |
logical indicating if a loading bar and messages should be printed. |
... |
additional parameters passed to the functions provided in
|
A list with:
Sc |
a matrix of the best stability scores for different (sets of) parameters controlling the number of clusters and penalisation of attribute weights. |
nc |
a matrix of numbers of clusters. |
Lambda |
a matrix of regularisation parameters for attribute weights. |
Q |
a matrix of the average number of selected attributes by the underlying algorithm with different regularisation parameters. |
coprop |
an array of consensus matrices. Rows and columns correspond to items. Indices along the third dimension correspond to different parameters controlling the number of clusters and penalisation of attribute weights. |
selprop |
an array of selection proportions. Columns correspond to attributes. Rows correspond to different parameters controlling the number of clusters and penalisation of attribute weights. |
method |
a list with |
params |
a list with values used for arguments
|
The rows of Sc, nc, Lambda, Q, selprop and indices along the third
dimension of coprop are ordered in the same way and correspond to
parameter values stored in nc and Lambda.
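Because the outputs share the same ordering, the row with the highest score can be used to index all of them. The sketch below uses a hypothetical stand-in list (out) with made-up values that only mimics the documented layout; it is not produced by the function, and single-column matrices are an assumption made for illustration.

```r
# Hypothetical stand-in for the returned list (values are made up)
out <- list(
  Sc = matrix(c(0.2, 0.9, 0.5, 0.4), ncol = 1),
  nc = matrix(c(2, 3, 2, 3), ncol = 1),
  Lambda = matrix(c(0.1, 0.1, 0.5, 0.5), ncol = 1),
  coprop = array(0, dim = c(5, 5, 4))
)

# Row with the highest score (missing values are ignored by which.max)
best <- which.max(out$Sc)

# The same index selects the matching parameters and consensus matrix
best_params <- c(nc = out$nc[best], Lambda = out$Lambda[best])
best_coprop <- out$coprop[, , best]
best_params
```

With these made-up scores the second row is selected, so best_params holds nc = 3 and Lambda = 0.1.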