RJclust: RJclust

View source: R/AlgorithmImplementation.R

Description

This is a clustering algorithm for high-dimensional data where n << p. Four penalty methods are available, and the choice depends on the size of the data and the accuracy required. The default is the BIC penalty. The AIC penalty and the full covariance method are also available; the full covariance method takes longer but may give more accurate results. Finally, there is an mclust implementation, but it is not recommended. All methods require a C_max value, which sets an upper limit on the possible number of clusters.
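Because the penalty choice trades speed against accuracy, it can be useful to run more than one penalty on the same data and compare the number of clusters found. The following is a minimal sketch, not part of the package: `compare_penalties` is a hypothetical helper, and it relies only on the `penalty` and `C_max` arguments and the `K` element of the return value documented on this page.

```r
# Sketch: run a clustering function under several penalties and collect the
# number of clusters found (the K element of the documented return value).
# `fit` defaults to RJclust; it is exposed as a parameter only so the sketch
# can be tried with a stand-in function.
compare_penalties <- function(X, penalties = c("bic", "aic"), C_max = 10,
                              fit = RJcluster::RJclust) {
  vapply(
    penalties,
    function(p) as.numeric(fit(X, penalty = p, C_max = C_max)$K),
    numeric(1)
  )
}

# Usage, assuming the RJcluster package is installed:
# X <- RJcluster::simulate_HD_data()$X
# compare_penalties(X, penalties = c("bic", "aic"), C_max = 10)
```

The result is a named numeric vector with one entry per penalty. If the penalties disagree on the number of clusters, that may be a reason to inspect the class labels more closely.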

Usage

RJclust(
  data,
  penalty = "bic",
  C_max = 10,
  criterion = "VVI",
  n_bins = NULL,
  seed = 1,
  verbose = FALSE
)

Arguments

data

Input data, which must be a matrix. Missing values are currently not supported.

penalty

A string naming the penalty method. Options are "bic", "aic", "full_covariance" and "mclust" (default = "bic")

C_max

Maximum number of clusters to look for (default is 10)

criterion

Model of covariance structure (default = "VVI")

n_bins

Number of cuts if penalty = "scale" for the scaled RJ algorithm (default = sqrt(p))

seed

Seed (default = 1)

verbose

Should progress be printed? (default = FALSE)
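Since `data` must be a matrix without missing values, a small pre-check can catch problems before RJclust is called. The helper below, `as_rj_input`, is hypothetical (it is not part of RJcluster) and uses base R only. Dropping incomplete rows is just one possible way to deal with missing values and is assumed here purely for illustration.

```r
# Hypothetical helper, not part of RJcluster: coerce input to a numeric
# matrix and either reject missing values or drop the rows containing them.
as_rj_input <- function(data, drop_incomplete = FALSE) {
  X <- as.matrix(data)
  if (!is.numeric(X)) {
    stop("data must contain only numeric columns")
  }
  if (anyNA(X)) {
    if (!drop_incomplete) {
      stop("data contains missing values, which RJclust does not support")
    }
    # Assumption: discarding incomplete rows is acceptable for the analysis
    X <- X[stats::complete.cases(X), , drop = FALSE]
  }
  X
}

# Toy data frame with one incomplete row (made-up values)
df <- data.frame(a = c(1, 2, NA), b = c(4, 5, 6))
X <- as_rj_input(df, drop_incomplete = TRUE)
dim(X)
```

The returned matrix can then be passed as the `data` argument.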

Details

All implementations except the mclust and full covariance methods use C++ to reduce runtime.

criterion controls the type of covariance structure. See the Mclust documentation for more information. Note that criterion "kmeans" is the same as "EEI". Using "kmeans" is not recommended if the classes are suspected to be imbalanced.

Value

Returns the RJ algorithm result for "aic" and "bic" ("mclust" and "scale" will return an mclust object):

K number of clusters found
class Class labels
penalty Penalty values at each iteration
mean Mean matrix
prob Probability values
z Z values from mclust (NULL if penalty = "full_covariance")
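As an illustration of how the returned elements might be used, the sketch below summarises a result by its number of clusters and cluster sizes. `summarise_rj` is a hypothetical helper, and the `mock` object is a stand-in with made-up values that only mimics the documented element names `K` and `class`; it is not real RJclust output.

```r
# Hypothetical helper: summarise a result list that has the documented
# elements K (number of clusters) and class (class labels).
summarise_rj <- function(res) {
  list(
    n_clusters = res$K,
    cluster_sizes = table(res$class)
  )
}

# Stand-in object with the documented element names (made-up values,
# not real RJclust output)
mock <- list(K = 2, class = c(1, 1, 2, 2, 2))
s <- summarise_rj(mock)
s$cluster_sizes
```

With a real fit, the same helper could be applied to the object returned for the "aic" or "bic" penalty.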

Examples

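library(RJcluster)
# Simulate test data and extract the data matrix from the returned list
X <- simulate_HD_data()
X <- X$X
# Cluster with the default BIC penalty, allowing at most 10 clusters
clust <- RJclust(X, penalty = "bic", C_max = 10)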

rshudde/RJcluster documentation built on April 26, 2021, 5:21 p.m.