carbikeplot: Produces the carbike plot to find best relevant clustering...

carbikeplot R Documentation

Produces the carbike plot to find best relevant clustering solutions obtained by tclustICsol

Description

Takes as input the output of the function tclustICsol (a structure containing the best relevant solutions) and produces the car-bike plot, a concise summary of those solutions. The horizontal axis shows the value of c (the restriction factor) and the vertical axis the value of k (the number of groups).

For each solution, a rectangle covers the interval of values of c for which the solution is best and stable, and a horizontal line departs from the rectangle over the values of c for which the solution is only stable. At the best value of c associated with the solution, a circle shows two numbers: the first is the rank of the solution among those which are not spurious, and the second is its rank when the spurious solutions are included.

The plot has been named 'car-bike' because the first best solutions (in general 2 or 3) are usually best and stable for a large number of values of c and therefore have large rectangles. These solutions are also likely to be stable for additional values of c, and so to have horizontal lines departing from the rectangles (hence the name 'cars'). Minor local solutions, which are associated with particular values of c and k, generally have neither rectangles nor lines and are shown only as circles (hence the name 'bikes').
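The geometry described above can be mimicked with base R graphics. The following is only a rough sketch, not the code used by carbikeplot: the data frame `sol`, its column names, the grid `cgrid` and the helper `sketch_carbike` are all hypothetical, the values are invented for illustration, and the stable-only line is drawn to the right of the rectangle only.

```r
# Hypothetical summary of three solutions (illustrative values only;
# nothing here is produced by tclustICsol).
cgrid <- c(1, 2, 4, 8, 16, 32, 64, 128)   # candidate values of c
sol <- data.frame(
  k         = c(3, 2, 4),   # number of groups (vertical axis)
  best_lo   = c(2, 1, 7),   # first index in cgrid where best and stable
  best_hi   = c(6, 1, 7),   # last index in cgrid where best and stable
  stable_hi = c(8, 1, 7),   # last index in cgrid where still stable
  best_c    = c(3, 1, 7),   # index of the best c for this solution
  label     = c("1,1", "2,2", "3,4")  # rank without / with spurious solutions
)

sketch_carbike <- function(sol, cgrid) {
  h <- 0.15  # half-height of each rectangle
  plot(NA, xlim = c(0.5, length(cgrid) + 0.5),
       ylim = range(sol$k) + c(-0.5, 0.5),
       xaxt = "n", yaxt = "n", xlab = "c", ylab = "k")
  axis(1, at = seq_along(cgrid), labels = cgrid)
  axis(2, at = sort(unique(sol$k)))
  for (i in seq_len(nrow(sol))) {
    s <- sol[i, ]
    # Rectangle: values of c where the solution is best and stable
    if (s$best_hi > s$best_lo)
      rect(s$best_lo, s$k - h, s$best_hi, s$k + h, col = "grey85")
    # Horizontal line: values of c where the solution is only stable
    if (s$stable_hi > s$best_hi)
      segments(s$best_hi, s$k, s$stable_hi, s$k)
    # Circle with the two ranks, placed at the best value of c
    points(s$best_c, s$k, pch = 21, cex = 4, bg = "white")
    text(s$best_c, s$k, s$label, cex = 0.7)
  }
  invisible(sol$best_hi - sol$best_lo)  # rectangle widths, in grid steps
}

widths <- sketch_carbike(sol, cgrid)
```

In this toy example the first solution (k = 3) is a 'car', with a wide rectangle and a trailing line, while the other two are 'bikes', drawn only as circles.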

Usage

carbikeplot(out, SpuriousSolutions = FALSE, trace = FALSE, ...)

Arguments

out

An S3 object of class tclusticsol.object (output of tclustICsol) containing the relevant solutions.

SpuriousSolutions

Whether to include spurious solutions in the plot. By default, spurious solutions are not included.

trace

Whether to print intermediate results. Default is trace=FALSE.

...

Potential further arguments passed to lower-level functions.

Author(s)

FSDA team, valentin.todorov@chello.at

References

Cerioli, A., Garcia-Escudero, L.A., Mayo-Iscar, A. and Riani, M. (2017). Finding the Number of Groups in Model-Based Clustering via Constrained Likelihoods, Journal of Computational and Graphical Statistics, pp. 404-416, https://doi.org/10.1080/10618600.2017.1390469.

Examples

 ## Not run: 

 ##  Car-bike plot for the geyser data ========================

 data(geyser2)
 out <- tclustIC(geyser2, whichIC="MIXMIX", plot=FALSE, alpha=0.1)

 ## Find the best solutions using as Information criterion MIXMIX
 print("Best solutions using MIXMIX")
 outMIXMIX <- tclustICsol(out, whichIC="MIXMIX", plot=FALSE, NumberOfBestSolutions=6)

 print(outMIXMIX$MIXMIXbs)

 carbikeplot(outMIXMIX)

 ##  Car-bike plot for the flea data ==========================

 data(flea)
 Y <- as.matrix(flea[, 1:(ncol(flea)-1)])    # select only the numeric variables
 rownames(Y) <- 1:nrow(Y)
 head(Y)

 out <- tclustIC(Y, whichIC="CLACLA", plot=FALSE, alpha=0.1, nsamp=100)

 ##  Find the best solutions using as Information criterion CLACLA
 print("Best solutions using CLACLA")
 outCLACLA <- tclustICsol(out, whichIC="CLACLA", plot=FALSE, NumberOfBestSolutions=66)
 ##  Produce the car-bike plot
 carbikeplot(outCLACLA)

 
## End(Not run)

fsdaR documentation built on March 31, 2023, 8:18 p.m.