slopeHeuristic: Slope heuristic

slopeHeuristic R Documentation

Slope heuristic

Description

Criterion for choosing the number of clusters, based on the slope heuristic.

Usage

slopeHeuristic(object, K0 = floor(max(object$nClass) * 0.4))

Arguments

object

output of mixtCompLearn

K0

number of classes used as the threshold for estimating the constant C: only the models with more than K0 classes enter the regression (see Details)

Details

The slope heuristic criterion is LL_k - 2 * C * D_k, where LL_k is the log-likelihood of the model with k classes, D_k is its number of free parameters, and C is the slope of the linear regression of LL_k on D_k, fitted over the models with k > K0.
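
The formula above can be illustrated with a minimal base R sketch. It is not the RMixtComp implementation: the helper name slopeHeuristicSketch, its arguments, and the input values are hypothetical and synthetic, chosen only so that the log-likelihood is exactly linear in D_k for k > K0.

```r
# Hypothetical helper (not part of RMixtComp): computes LL_k - 2 * C * D_k
# from plain vectors instead of a mixtCompLearn output.
slopeHeuristicSketch <- function(nClass, ll, nFreeParam, K0) {
  keep <- nClass > K0
  reg <- data.frame(D = nFreeParam[keep], LL = ll[keep])
  # C is the slope of the least-squares line of LL_k against D_k for k > K0
  C <- unname(coef(lm(LL ~ D, data = reg))[["D"]])
  ll - 2 * C * nFreeParam
}

# Synthetic inputs, for illustration only
nClass <- 1:10
nFreeParam <- 5 * nClass                                # D_k
ll <- c(-900, -700, -600, -550,                         # steep gains up to k = 4
        -560 + 0.5 * nFreeParam[5:10])                  # linear regime for k > 4

crit <- slopeHeuristicSketch(nClass, ll, nFreeParam, K0 = 4)
bestK <- nClass[which.max(crit)]
```

Under the usual reading of the slope heuristic, the retained number of classes is the one that maximizes the criterion, which is what bestK picks here. How the values returned by slopeHeuristic itself are indexed is not shown by this sketch.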

Value

the values of the slope heuristic criterion

Author(s)

Quentin Grimonprez

References

Cathy Maugis, Bertrand Michel. Slope heuristics for variable selection and clustering via Gaussian mixtures. [Research Report] RR-6550, INRIA. 2008. inria-00284620v2

Jean-Patrick Baudry, Cathy Maugis, Bertrand Michel. Slope Heuristics: Overview and Implementation. 2010. hal-00461639

Examples


data(titanic)

## Use the MixtComp format
dat <- titanic

# refactor categorical data: survived, sex, embarked and pclass
dat$sex <- refactorCategorical(dat$sex, c("male", "female", NA), c(1, 2, "?"))
dat$embarked <- refactorCategorical(dat$embarked, c("C", "Q", "S", NA), c(1, 2, 3, "?"))
dat$survived <- refactorCategorical(dat$survived, c(0, 1, NA), c(1, 2, "?"))
dat$pclass <- refactorCategorical(dat$pclass, c("1st", "2nd", "3rd"), c(1, 2, 3))

# replace all NA by ?
dat[is.na(dat)] <- "?"

# create model
model <- list(
  pclass = "Multinomial",
  survived = "Multinomial",
  sex = "Multinomial",
  age = "Gaussian",
  sibsp = "Poisson",
  parch = "Poisson",
  fare = "Gaussian",
  embarked = "Multinomial"
)

# create algo
algo <- createAlgo()

# run clustering
resLearn <- mixtCompLearn(dat, model, algo, nClass = 2:25, criterion = "ICL", nRun = 3, nCore = 1)

# compute the slope heuristic, estimating the slope C from the models with more than 6 classes
out <- slopeHeuristic(resLearn, K0 = 6)



RMixtComp documentation built on July 9, 2023, 6:06 p.m.