mlcc.bic: Subspace clustering with automatic estimation of number of...


View source: R/mlcc.bic.R

Description

Estimates the number of clusters according to the BIC. The basic k-means based Multiple Latent Components Clustering (MLCC) algorithm (mlcc.kmeans) is run a given number of times (numb.runs) for each number of clusters in numb.clusters. The best partition is chosen with BIC (see the mlcc.reps function).
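The control flow described above (several runs per candidate number of clusters, keeping the best-scoring partition) can be sketched in base R. This is only an illustration under stated assumptions: `fit_once` and `select_partition` are hypothetical helpers, `fit_once` is a toy stand-in built on `stats::kmeans` and not the package's mlcc.kmeans, and its score is a made-up penalised criterion (larger taken as better) rather than the BIC the package computes.

```r
# Toy stand-in for one clustering run. NOT the package's mlcc.kmeans:
# it clusters the variables (columns of X) with plain k-means and
# returns a made-up penalised score where larger is treated as better.
fit_once <- function(X, k) {
  vars <- t(X)                                  # one row per variable
  km <- kmeans(vars, centers = k)               # random start on each call
  p <- nrow(vars)
  score <- -p * log(km$tot.withinss / p) - k * log(p)
  list(segmentation = km$cluster, score = score)
}

# For every candidate k, repeat the run numb.runs times and keep the
# single best-scoring partition seen overall.
select_partition <- function(X, numb.clusters, numb.runs) {
  best <- NULL
  for (k in numb.clusters) {
    for (run in seq_len(numb.runs)) {
      fit <- fit_once(X, k)
      if (is.null(best) || fit$score > best$score) {
        best <- c(fit, list(nClusters = k))
      }
    }
  }
  best
}

set.seed(1)
X <- matrix(rnorm(50 * 12), nrow = 50, ncol = 12)  # 50 observations, 12 variables
best <- select_partition(X, numb.clusters = 2:4, numb.runs = 5)
best$nClusters            # number of clusters of the winning partition
best$segmentation         # cluster label for each of the 12 variables
```

Repeating the run matters because each k-means style fit starts from a random initialisation, so a single run per number of clusters can settle in a poor local optimum.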

Usage

mlcc.bic(X, numb.clusters = 1:10, numb.runs = 20, stop.criterion = 1,
  max.iter = 20, max.dim = 4, scale = TRUE, numb.cores = NULL,
  greedy = TRUE, estimate.dimensions = TRUE, verbose = FALSE)

Arguments

X

a data frame or a matrix with only continuous variables

numb.clusters

a vector, numbers of clusters to be checked

numb.runs

an integer, number of runs of mlcc.kmeans

stop.criterion

an integer, if an iteration of the mlcc.kmeans algorithm makes fewer changes to the partition than stop.criterion, mlcc.kmeans stops

max.iter

an integer, maximum number of iterations of mlcc.kmeans algorithm

max.dim

an integer, if estimate.dimensions is FALSE then max.dim is the dimension of each subspace. If estimate.dimensions is TRUE then the subspace dimensions are estimated from the range [1, max.dim]

scale

a boolean, if TRUE (value set by default) then the variables in the dataset are scaled to zero mean and unit variance

numb.cores

an integer, number of cores to be used, by default all cores are used

greedy

a boolean, if TRUE (value set by default) the clusters are estimated in a greedy way

estimate.dimensions

a boolean, if TRUE (value set by default) the subspace dimensions are estimated

verbose

a boolean, if TRUE a plot of the BIC values for the different numbers of clusters is produced, and the BIC values computed for every number of clusters and every subspace dimension are printed (value set by default is FALSE)
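To make the effect of the scale argument concrete, the following base R sketch standardises each column of a matrix to zero mean and unit variance with `scale()`. Whether the package calls `scale()` internally is an assumption; the sketch only shows the transformation the argument describes.

```r
set.seed(42)
# 100 observations of 5 continuous variables, deliberately off-centre
X <- matrix(rnorm(100 * 5, mean = 10, sd = 3), nrow = 100)

# Centre each column to mean 0 and rescale it to standard deviation 1
X_scaled <- scale(X, center = TRUE, scale = TRUE)

round(colMeans(X_scaled), 10)   # column means after scaling
apply(X_scaled, 2, sd)          # column standard deviations after scaling
```

Standardising puts all variables on a common scale, so that variables measured in large units do not dominate the fitted subspaces.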

Value

An object of class mlcc.fit consisting of

segmentation

a vector containing the partition of the variables

BIC

numeric, value of cluster.BIC criterion

subspacesDimensions

a list containing dimensions of the subspaces

nClusters

an integer, estimated number of clusters

factors

a list of matrices, basis for each subspace

all.fit

a list of segmentation, BIC and subspace dimensions for all numbers of clusters considered, for the estimated subspace dimensions

all.fit.dims

a list of lists of segmentation, BIC and subspace dimensions for all numbers of clusters and subspace dimensions considered

Examples

sim.data <- data.simulation(n = 100, SNR = 1, K = 5, numb.vars = 30, max.dim = 2)
mlcc.bic(sim.data$X, numb.clusters = 1:10, numb.runs = 20, verbose=TRUE)
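A slightly extended sketch stores the result and inspects the components listed under Value. It assumes the package is installed under the name varclust and that the installed version matches the signature documented here; the reduced numb.clusters and numb.runs values and numb.cores = 1 are chosen only to keep the run short. The code is skipped when the package is not available.

```r
has_varclust <- requireNamespace("varclust", quietly = TRUE)

if (has_varclust) {
  library(varclust)

  # Simulated data, as in the example above
  sim.data <- data.simulation(n = 100, SNR = 1, K = 5, numb.vars = 30, max.dim = 2)

  # Fewer candidates and runs than the defaults, single core, to keep this quick
  fit <- mlcc.bic(sim.data$X, numb.clusters = 3:7, numb.runs = 5, numb.cores = 1)

  print(fit$nClusters)             # estimated number of clusters
  print(table(fit$segmentation))   # how many variables fall in each cluster
  print(fit$BIC)                   # criterion value of the selected partition
  print(fit$subspacesDimensions)   # estimated dimension of each subspace
}
```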

psobczyk/public_varclust documentation built on May 26, 2019, 10:33 a.m.