View source: R/metricsAnalysis.R
ATSC | R Documentation |
Automated Trimmed & Sparse Clustering.
This method performs an optimal k value analysis with the stabilityRange, qualityRange and getOptimalKValue evaluomeR methods.
The optimal k value is then used to automatically estimate an L1 bound and an alpha trimming proportion in order to perform an automatic trimmed and sparse clustering.
This may result in the input dataset being trimmed (either by columns, determined by L1, or by rows, determined by alpha).
Another optimal k value analysis is then executed over the trimmed dataset to obtain the final optimal partition.
ATSC(
data,
k.range = c(2, 15),
bs = 100,
cbi = "kmeans",
max_alpha = 0.1,
all_metrics = TRUE,
L1 = NULL,
alpha = NULL,
gold_standard = NULL,
seed = NULL
)
data |
A |
k.range |
Concatenation of two positive integers.
The first value |
bs |
Positive integer. Bootstrap value to perform the resampling. |
cbi |
Clusterboot interface name (default: "kmeans"):
"kmeans", "clara", "clara_pam", "hclust", "pamk", "pamk_pam".
Any CBI appended with '_pam' makes use of |
max_alpha |
Maximum value of alpha, iterating over seq(0, max_alpha, 0.05) |
all_metrics |
Boolean. If TRUE, clustering is performed on the whole dataset. |
L1 |
A single L1 bound on weights (the feature weights), see |
seed |
Positive integer. A seed for internal bootstrap. |
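The max_alpha description says candidate alpha values come from seq(0, max_alpha, 0.05). The base R sketch below only shows what that grid looks like for a couple of max_alpha values. It does not reproduce how ATSC chooses among the candidates, which this page does not describe. The variable names are illustrative.

```r
# Candidate trimming proportions for the default max_alpha = 0.1,
# built with the same seq() expression quoted in the argument description.
max_alpha <- 0.1
alpha_grid <- seq(0, max_alpha, 0.05)
print(alpha_grid)

# seq() never steps past its upper limit, so a max_alpha that is not a
# multiple of 0.05 yields the same grid as the next lower multiple.
alpha_grid_odd <- seq(0, 0.12, 0.05)
print(alpha_grid_odd)
```

Given the 0.05 step, a max_alpha below 0.05 would leave 0 as the only candidate in this grid.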
A list containing:
stab |
A data frame containing standardized stability. |
qual |
A data frame containing standardized quality. |
optimalK |
The optimal k value representing the optimal number of clusters determined from the initial analysis. |
stab_ATSC |
A data frame containing standardized stability after applying ATSC. |
qual_ATSC |
A data frame containing standardized quality after applying ATSC. |
optimalK_ATSC |
The optimal k value representing the optimal number of clusters determined after applying ATSC. |
rskcOut |
An object returned by the RSKC function containing clustering results, including weights and trimmed observations. |
trimmedRows |
A vector of indices representing the rows that were trimmed from the dataset during the clustering process. |
trimmedColumns |
A vector of names representing the columns that were trimmed (i.e., removed) from the dataset due to zero weights. |
trimmedDataset |
A data frame containing the final processed dataset after trimming rows and columns. |
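The relationship between trimmedRows, trimmedColumns and trimmedDataset can be illustrated with a small base R sketch. Everything in it is a hypothetical stand-in: the data frame, the weight vector and the mock_result list are made up for illustration and are not output of ATSC or RSKC. It assumes, as described above, that trimmedRows holds row indices and trimmedColumns holds the names of zero-weight columns.

```r
# Hypothetical input: an identifier column followed by numeric metrics.
df <- data.frame(
  id = paste0("s", 1:6),
  m1 = c(1.0, 1.2, 0.9, 8.0, 1.1, 1.3),
  m2 = c(5, 5, 5, 5, 5, 5),
  m3 = c(0.2, 0.1, 0.3, 9.5, 0.2, 0.4)
)

# Mock values standing in for fields of an ATSC() result; not real output.
mock_weights <- c(m1 = 0.8, m2 = 0, m3 = 0.6)  # illustrative feature weights
mock_result <- list(
  trimmedRows    = 4L,                                      # row indices
  trimmedColumns = names(mock_weights)[mock_weights == 0]   # zero-weight columns
)

# Drop the trimmed rows and columns to obtain the reduced dataset.
keep_rows <- setdiff(seq_len(nrow(df)), mock_result$trimmedRows)
keep_cols <- setdiff(names(df), mock_result$trimmedColumns)
trimmed   <- df[keep_rows, keep_cols, drop = FALSE]

print(mock_result$trimmedColumns)
print(dim(trimmed))
```

With a real result, the same subsetting could serve as a sanity check against the returned trimmedDataset, provided the row indices refer to the original input order.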