View source: R/AlgorithmImplementation.R
This is a clustering algorithm for data where p << n. Four penalty methods are available, and the choice depends on the size of the data and the accuracy required. The default is the BIC penalty. There are also the AIC penalty and the full covariance method; the full covariance method takes longer but may give more accurate results. Finally, there is an mclust implementation, but it is not recommended. All methods require a C_max value that sets an upper limit on the possible number of clusters.
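To make the difference between the "bic" and "aic" options concrete, the sketch below computes the textbook AIC and BIC from a log-likelihood. It only illustrates how the two penalties scale; the helper names `aic` and `bic` and the numbers are hypothetical, and the penalty RJclust applies internally may be defined differently.

```r
# Textbook information criteria (lower is better). Illustrative only;
# not taken from the RJclust source.
# loglik: maximised log-likelihood, n_par: number of free parameters,
# n: number of observations.
aic <- function(loglik, n_par) -2 * loglik + 2 * n_par
bic <- function(loglik, n_par, n) -2 * loglik + log(n) * n_par

# Hypothetical values chosen only to show the arithmetic
loglik <- -250
n_par  <- 12
n      <- 100

aic_value <- aic(loglik, n_par)     # 500 + 2 * 12
bic_value <- bic(loglik, n_par, n)  # 500 + 12 * log(100)

# BIC charges log(n) per parameter, so it penalises extra parameters
# more heavily than AIC whenever log(n) > 2 (n of 8 or more).
bic_value > aic_value
```

Because the BIC penalty grows with the sample size, it tends to favour fewer clusters than AIC on the same data.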
data | Data input; must be a matrix. Missing values are currently not supported |
penalty | A string naming the penalty method. Options include: "bic", "aic", "full_covariance", "mclust" (default = "bic") |
C_max | Maximum number of clusters to look for (default = 10) |
criterion | Model of covariance structure (default = "VVI") |
n_bins | Number of cuts if penalty = "scale" for the scaled RJ algorithm (default = sqrt(p)) |
seed | Seed (default = 1) |
verbose | Should progress be printed? (default = FALSE) |
All implementations except the mclust and full covariance methods use C++ to reduce runtime.
criterion controls the type of covariance structure (the mclust model name). See the Mclust documentation for more information. Note that criterion "kmeans" is the same as "EEI". Using "kmeans" is not recommended if the classes are suspected to be imbalanced.
Returns the RJ algorithm result for "aic" and "bic" ("mclust" and "scale" return an mclust object):
K | Number of clusters found |
class | Class labels |
penalty | Penalty values at each iteration |
mean | Mean matrix |
prob | Probability values |
z | Z values from mclust (NULL if penalty = "full_covariance") |
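As a rough illustration of how the returned list might be inspected, the sketch below uses a hand-built stand-in rather than a real fit. Only the field names `K` and `class` come from the table above; the values are made up, and the check that the labels agree with `K` is an assumption about the output, not documented behaviour.

```r
# Stand-in for a fitted result; in practice this would come from RJclust().
# The values are invented for illustration.
res <- list(K = 2L, class = c(1L, 1L, 2L, 2L, 2L))

# Number of observations assigned to each cluster
sizes <- table(res$class)

# Assumed sanity check: distinct labels should match the reported K
n_labels <- length(unique(res$class))
n_labels == res$K
```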
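library(RJclust)
# Simulate high-dimensional example data and keep the data matrix
X = simulate_HD_data()
X = X$X
# Cluster with the default BIC penalty, searching up to 10 clusters
clust = RJclust(X, penalty = "bic", C_max = 10)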