cluscata: Perform a cluster analysis of blocks from a CATA experiment

cluscata    R Documentation

Description

Hierarchical clustering of blocks from a CATA experiment. Each cluster of blocks is associated with a compromise computed by the CATATIS method. The hierarchical clustering is followed by a partitioning algorithm (consolidation). Non-binary data are accepted.

Usage

cluscata(Data, nblo, NameBlocks=NULL, NameVar=NULL, Noise_cluster=FALSE,
        Itermax=30, Graph_dend=TRUE, Graph_bar=TRUE, printlevel=FALSE,
        gpmax=min(6, nblo-2), Testonlyoneclust=FALSE, alpha=0.05,
        nperm=50, Warnings=FALSE)

Arguments

Data

data frame or matrix where the blocks of binary variables are merged horizontally. If your data are in a different format, see change_cata_format.

nblo

numerical. Number of blocks (subjects).

NameBlocks

string vector. Name of each block (subject). Length must be equal to the number of blocks. If NULL, the names are S1, ..., Sm. Default: NULL

NameVar

string vector. Name of each variable (attribute, the same names for each subject). Length must be equal to the number of attributes. If NULL, the colnames of the first block are taken. Default: NULL

Noise_cluster

logical. Should a noise cluster be computed? Default: FALSE

Itermax

numerical. Maximum number of iterations for the partitioning algorithm. Default: 30

Graph_dend

logical. Should the dendrogram be plotted? Default: TRUE

Graph_bar

logical. Should the barplot of the difference of the criterion and the barplot of the overall homogeneity at each merging step of the hierarchical algorithm be plotted? Default: TRUE

printlevel

logical. Print the number of remaining levels during the hierarchical clustering algorithm? Default: FALSE

gpmax

numerical. Maximum number of clusters to consider. Default: min(6, nblo-2)

Testonlyoneclust

logical. Test if there is more than one cluster? Default: FALSE

alpha

numerical between 0 and 1. What is the threshold to test if there is more than one cluster? Default: 0.05

nperm

numerical. How many permutations are required to test if there is more than one cluster? Default: 50

Warnings

logical. Display warnings when, in some clusters, no subject checked a given attribute or product? Default: FALSE
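
As an illustration of the layout expected by Data, the following base R sketch builds a toy data set in which one binary block per subject is merged horizontally, and evaluates the default value of gpmax. The dimensions, the attribute names, the random responses and the helper block_cols are made up for illustration; only the horizontal layout and the formula min(6, nblo-2) come from this page.

```r
# Toy CATA layout: 4 products (rows), 3 attributes, 5 subjects (blocks).
# All sizes and names here are illustrative assumptions.
set.seed(1)
nprod <- 4
nattr <- 3
nblo <- 5
attr_names <- c("sweet", "acid", "firm")  # hypothetical attribute names

# One binary products-by-attributes block per subject.
blocks <- lapply(seq_len(nblo), function(k) {
  b <- matrix(rbinom(nprod * nattr, size = 1, prob = 0.5), nrow = nprod)
  colnames(b) <- attr_names
  b
})

# Merge the blocks horizontally: subject 1's columns, then subject 2's, ...
Data <- do.call(cbind, blocks)
rownames(Data) <- paste0("P", seq_len(nprod))

# Every block has the same number of columns, so ncol(Data) is a
# multiple of nblo.
stopifnot(ncol(Data) %% nblo == 0)
nvar <- ncol(Data) / nblo

# Hypothetical helper: column indices of block k in the merged table.
block_cols <- function(k) ((k - 1) * nvar + 1):(k * nvar)

# Default maximum number of clusters, as given in the Usage section.
gpmax_default <- min(6, nblo - 2)
```

A table built this way has nblo * nvar columns, and Data[, block_cols(k)] recovers the responses of subject k.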

Value

A list with one element partitionK for each number of clusters K = 1 to gpmax. Each partitionK is a list with:

  • group: the clustering partition after consolidation. If Noise_cluster=TRUE, some subjects could be in the noise cluster ("K+1")

  • rho: the threshold for the noise cluster

  • homogeneity: homogeneity index

  • s_with_compromise: similarity coefficient of each subject with its cluster compromise

  • weights: weight associated with each subject in its cluster

  • compromise: the compromise of each cluster

  • CA: list. The correspondence analysis results on each cluster compromise (coordinates, contributions, ...)

  • inertia: percentage of total variance explained by each axis of the CA for each cluster

  • s_all_cluster: the similarity coefficient between each subject and each cluster compromise

  • criterion: the CLUSCATA criterion error

  • param: parameters called

  • type: parameter passed to other functions

The end of the list also contains:

  • dend: The CLUSCATA dendrogram

  • cutree_k: the partition obtained by cutting the dendrogram in K clusters (before consolidation).

  • overall_homogeneity_ng: percentage of overall homogeneity by number of clusters before consolidation (and after if there is no noise cluster)

  • diff_crit_ng: variation of criterion when a merging is done before consolidation (and after if there is no noise cluster)

  • test_one_cluster: decision and p-value of the test of whether there is more than one cluster

  • param: parameters called

  • type: parameter passed to other functions
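
The structure described above can be navigated with ordinary list indexing. The sketch below assumes that the per-K elements are named partition1, partition2, and so on, which is inferred from the name partitionK and should be checked with names() on a real result. The helper cluster_sizes and the object mock are hypothetical: mock is a hand-built stand-in, not output of cluscata.

```r
# Hypothetical helper: cluster sizes for the K-cluster partition.
# Assumes the elements are named "partition1", "partition2", ...
cluster_sizes <- function(res, K) {
  part <- res[[paste0("partition", K)]]
  if (is.null(part)) stop("no partition with K = ", K)
  table(part$group)
}

# Hand-built stand-in with the same shape as the documented result
# (illustrative values only, not real cluscata() output).
mock <- list(
  partition1 = list(group = c(S1 = 1, S2 = 1, S3 = 1, S4 = 1)),
  partition2 = list(group = c(S1 = 1, S2 = 2, S3 = 1, S4 = 2))
)

# Number of subjects in each cluster of the 2-cluster partition.
sizes <- cluster_sizes(mock, 2)
```

With a real result res, the same call would be cluster_sizes(res, 3) for the three-cluster partition, provided the naming assumption holds.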

References

Llobell, F., Cariou, V., Vigneau, E., Labenne, A., & Qannari, E. M. (2019). A new approach for the analysis of data and the clustering of subjects in a CATA experiment. Food Quality and Preference, 72, 31-39.
Llobell, F., Giacalone, D., Labenne, A., & Qannari, E. M. (2019). Assessment of the agreement and cluster analysis of the respondents in a CATA experiment. Food Quality and Preference, 77, 184-190.

See Also

plot.cluscata, summary.cluscata, catatis, cluscata_kmeans, change_cata_format, change_cata_format2

Examples


library(ClustBlock)

data(straw)
# With the first 40 subjects
res <- cluscata(Data = straw[, 1:(16*40)], nblo = 40)
# plot(res, ngroups = 3, Graph_dend = FALSE)
summary(res, ngroups = 3)
# With a noise cluster
res2 <- cluscata(Data = straw[, 1:(16*40)], nblo = 40, Noise_cluster = TRUE,
                 Graph_dend = FALSE, Graph_bar = FALSE)
# With all subjects
res <- cluscata(Data = straw, nblo = 114, printlevel = TRUE)

# Vertical format: convert to the horizontal format first
data("fish")
Data <- fish[1:66, 2:30]
chang2 <- change_cata_format2(Data, nprod = 6, nattr = 27, nsub = 11, nsess = 1)
res3 <- cluscata(Data = chang2$Datafinal, nblo = 11, NameBlocks = chang2$NameSub)



ClustBlock documentation built on Aug. 30, 2023, 5:08 p.m.