selectK.R: Selection of the number K of clusters.

Description

Selects the number K of clusters for a given subset S of clustering variables.

Usage

selectK.R(xdata, S, Kmax, ploidy = 1, Kmin = 1,
  emOptions = list(epsi = 1e-05, nberSmallEM = 20, nberIterations = 15,
  nberMaxIterations = 5000, typeSmallEM = 0, typeEM = 0, putThreshold = FALSE),
  cte = 1, project = deparse(substitute(xdata)))

Arguments

xdata

A dataset in which the data for each variable occupy ploidy column(s).

S

A subset of clustering variables, given as a logical vector whose length P equals the number of variables in xdata.
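As an illustration of the form S is described as taking, the following base R sketch builds a logical vector for a hypothetical dataset with P = 10 variables, of which the first 8 are treated as clustering variables. The values of P and the selected indices are assumptions chosen only for the example.

```r
# Hypothetical setting: P = 10 variables, the first 8 used for clustering.
P <- 10

# Start with no variable selected, then switch on the chosen ones.
S <- rep(FALSE, P)
S[1:8] <- TRUE

# Equivalent construction from a vector of selected variable indices.
selected <- 1:8
S2 <- seq_len(P) %in% selected

# S must be logical and have one entry per variable.
stopifnot(is.logical(S), length(S) == P, identical(S, S2))

sum(S)  # number of clustering variables
```

Building S from indices can be less error-prone than typing TRUE/FALSE values by hand when P is large.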

Kmax

The maximum number of clusters to be explored.

ploidy

The number of occurrences for each variable in the data. For example, ploidy = 2 for genotype

Kmin

The minimum number of clusters to be explored. The default value is set to 1.

emOptions

A list of EM options (see EmOptions and setEmOptions).
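The sketch below shows one way to prepare such a list in base R: it copies the default values shown in the Usage section and overrides two of them with modifyList. The replacement values are arbitrary assumptions for illustration, not recommendations, and setEmOptions may be the preferred route in the package itself.

```r
# Default values copied from the Usage section of this page.
defaultEmOptions <- list(epsi = 1e-05, nberSmallEM = 20, nberIterations = 15,
  nberMaxIterations = 5000, typeSmallEM = 0, typeEM = 0, putThreshold = FALSE)

# Override selected entries while keeping every other entry unchanged.
# The new values here are hypothetical.
myEmOptions <- modifyList(defaultEmOptions, list(epsi = 1e-06, nberSmallEM = 50))

# Inspect the resulting list before passing it as the emOptions argument.
str(myEmOptions)
```

Starting from the complete default list keeps all seven named entries present, which avoids relying on how a partially specified list would be handled.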

cte

A double used by the selection criterion named CteDim, whose penalty function is pen(K,S) = cte*dim, where dim is the number of free parameters of the model.
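To make the role of cte concrete, here is a minimal sketch of the penalty formula alone, pen(K,S) = cte*dim. The helper name cteDimPenalty and the dim values are hypothetical and are not part of the package; how the penalty is combined with the likelihood is not shown because this page does not specify it.

```r
# Hypothetical helper implementing only the stated formula pen = cte * dim.
cteDimPenalty <- function(dim, cte = 1) {
  cte * dim
}

# Hypothetical numbers of free parameters for three competing models.
dims <- c(10, 25, 40)

penDefault <- cteDimPenalty(dims)           # cte = 1: penalty equals dim
penDouble  <- cteDimPenalty(dims, cte = 2)  # cte = 2: penalty is doubled

print(penDefault)
print(penDouble)
```

A larger cte penalizes each additional free parameter more heavily.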

project

The name of the project. The default value is the name of the dataset.

Value

A list of estimated parameters for each selection criterion.

Author(s)

Wilson Toussile

References

See Also

backward.explorer for further exploration of the space of competing models, dimJump.R for data-driven calibration of the penalty function, and model.selection.R for model selection.

Examples

data(genotype1)
head(genotype1)
genotype2 = cutEachCol(genotype1[, -11], ploidy = 2)
head(genotype2)
S = c(rep(TRUE, 8), rep(FALSE, 2))
## Not run: 
outPut = selectK.R(genotype2, S, Kmax = 6, ploidy = 2, Kmin=1)
outPut[["BIC"]]

file.remove("genotype2_ExploredModels.txt")

## End(Not run)

ClustMMDD documentation built on May 2, 2019, 2:44 p.m.