Create an instance of the [ClusterStrategy] class

Description

A strategy is a multistage empirical procedure for finding a good estimate when fitting a clustering model.

clusterSemiSEMStrategy() creates an instance of [ClusterStrategy] for users with many missing values, using a SemiSEM algorithm.

clusterSEMStrategy() creates an instance of [ClusterStrategy] for users with many missing values, using a SEM algorithm.

clusterFastStrategy() creates an instance of [ClusterStrategy] for impatient users.

Usage

clusterStrategy(nbTry = 1, nbInit = 5, initMethod = "class",
  initAlgo = "EM", nbInitIteration = 20, initEpsilon = 0.01,
  nbShortRun = 5, shortRunAlgo = "EM", nbShortIteration = 100,
  shortEpsilon = 1e-04, longRunAlgo = "EM", nbLongIteration = 1000,
  longEpsilon = 1e-07)

clusterSemiSEMStrategy()

clusterSEMStrategy()

clusterFastStrategy()

Arguments

nbTry

Integer defining the number of estimations to attempt. Default value: 1.

nbInit

Integer defining the number of initializations to try. Default value: 5.

initMethod

Character string with the initialization method, see [clusterInit] for possible values. Default value: "class".

initAlgo

Character string with the algorithm to use in the initialization stage, see [clusterAlgo] for possible values. Default value: "EM".

nbInitIteration

Integer defining the maximal number of iterations of the initialization algorithm if initAlgo = "EM", "CEM" or "SemiSEM", or the number of iterations if initAlgo = "SEM". Default value: 20.

initEpsilon

Real defining the epsilon value for the initialization algorithm. initEpsilon is not used by the SEM algorithm. Default value: 0.01.

nbShortRun

Integer defining the number of short runs to try (the strategy launches an initialization before each short run). Default value: 5.

shortRunAlgo

Character string with the algorithm to use in the short run stage, see [clusterAlgo] for possible values. Default value: "EM".

nbShortIteration

Integer defining the maximal number of iterations in the short runs if shortRunAlgo = "EM", "CEM" or "SemiSEM", or the number of iterations if shortRunAlgo = "SEM". Default value: 100.

shortEpsilon

Real defining the epsilon value for the short run algorithm. shortEpsilon is not used by the SEM algorithm. Default value: 1e-04.

longRunAlgo

Character string with the algorithm to use in the long run stage, see [clusterAlgo] for possible values. Default value: "EM".

nbLongIteration

Integer defining the maximal number of iterations in the long run if longRunAlgo = "EM", "CEM" or "SemiSEM", or the number of iterations if longRunAlgo = "SEM". Default value: 1000.

longEpsilon

Real defining the epsilon value for the long run algorithm. longEpsilon is not used by the SEM algorithm. Default value: 1e-07.
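
This page does not spell out how the epsilon values are compared. A common convention for EM-type algorithms, assumed here, is to stop as soon as the change in log-likelihood between two iterations falls below epsilon or the iteration cap is reached, while SEM always runs the full number of iterations. The sketch below illustrates that reading in base R; run_iterations() and its step callback are hypothetical names, not functions of the package.

```r
# Sketch under assumptions: `step` performs one iteration and returns the
# new log-likelihood. It is a hypothetical callback, not a package function.
run_iterations <- function(step, algo = "EM", nbIteration = 100, epsilon = 1e-04) {
  previous <- -Inf
  for (iter in seq_len(nbIteration)) {
    current <- step(iter)
    # SEM ignores epsilon and always uses the full iteration budget
    if (algo != "SEM" && abs(current - previous) < epsilon) {
      return(list(iterations = iter, logLik = current))
    }
    previous <- current
  }
  list(iterations = nbIteration, logLik = previous)
}

# Toy log-likelihood that increases towards 0 and flattens out quickly.
toy_step <- function(iter) -1 / 2^iter

em  <- run_iterations(toy_step, algo = "EM",  nbIteration = 100, epsilon = 0.01)
sem <- run_iterations(toy_step, algo = "SEM", nbIteration = 100, epsilon = 0.01)
```

With these toy values the EM-style run stops after 7 iterations, while the SEM-style run uses all 100.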

Details

A strategy is a way to find a good estimate of the parameters of a mixture model when using an EM algorithm or one of its variants. A “try” is composed of three stages:

  • nbShortRun short runs, each made of an initialization step followed by a few iterations of the EM, CEM, SEM or SemiSEM algorithm.

  • nbInit initializations using the [clusterInit] method.

  • A long run of the EM, CEM, SEM or SemiSEM algorithm.

For example, if nbInit is 5 and nbShortRun is also 5, there will be 5 packets of 5 initialized models. In each packet, the best model is improved using a short run. Among the 5 improved models, one is estimated until convergence using a long run. In total there are 25 initializations.
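
The count in this example can be checked with a small sketch in base R. The helper count_initializations() is hypothetical and not part of the package; it only mirrors the nesting described here, assuming one packet of nbInit initializations before each short run and a single long run per try.

```r
# Hypothetical helper (not part of the package): counts the work done in one
# try, assuming each short run is preceded by a packet of nbInit initializations.
count_initializations <- function(nbInit = 5, nbShortRun = 5) {
  initializations <- 0
  shortRuns <- 0
  for (packet in seq_len(nbShortRun)) {
    # one packet: nbInit models are initialized and the best one is kept
    initializations <- initializations + nbInit
    # the best model of the packet is improved by one short run
    shortRuns <- shortRuns + 1
  }
  # one of the improved models is then refined by a single long run
  list(initializations = initializations, shortRuns = shortRuns, longRuns = 1)
}

counts <- count_initializations(nbInit = 5, nbShortRun = 5)
print(counts$initializations)  # 25 with the values of the example
```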

The whole process can be repeated up to nbTry times. If a try succeeds, the estimated model is returned; otherwise an empty model is returned.
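
The retry logic can be sketched in base R as follows. This is only an illustration, assuming that the loop stops at the first successful try and that a failed try signals an error; run_strategy(), run_one_try and flaky_try are hypothetical names, and NULL stands in for the empty model.

```r
# Minimal sketch of the retry logic. `run_one_try` is a hypothetical stand-in
# for the three stages of a try; it either returns a model or raises an error.
run_strategy <- function(run_one_try, nbTry = 1) {
  for (iTry in seq_len(nbTry)) {
    model <- tryCatch(run_one_try(iTry), error = function(e) NULL)
    if (!is.null(model)) {
      return(model)  # the first successful try is returned
    }
  }
  NULL  # stands in for the "empty model" when every try fails
}

# Toy try that fails twice and succeeds on the third attempt.
flaky_try <- function(iTry) {
  if (iTry < 3) stop("estimation failed")
  list(try = iTry, converged = TRUE)
}

result_ok <- run_strategy(flaky_try, nbTry = 5)  # succeeds at the third try
result_ko <- run_strategy(flaky_try, nbTry = 2)  # NULL: both tries fail
```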

Value

A [ClusterStrategy] object.

Author(s)

Serge Iovleff

Examples

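   # default strategy
   clusterStrategy()
   # long run with CEM, limited to 100 iterations
   clusterStrategy(longRunAlgo = "CEM", nbLongIteration = 100)
   # one try, one initialization, SEM short runs of 100 iterations
   clusterStrategy(nbTry = 1, nbInit = 1, shortRunAlgo = "SEM", nbShortIteration = 100)

   # predefined strategies
   clusterSemiSEMStrategy()

   clusterSEMStrategy()

   clusterFastStrategy()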
