Combination of Factorial Methods and Cluster Analysis

Description

Performs a factorial analysis of the data, followed by a cluster analysis on the first nfcl factorial coordinates.

Usage

FactoClass(dfact, metodo, dfilu = NULL, nf = 2, nfcl = 10, k.clust = 3,
           scanFC = TRUE, n.max = 5000, n.clus = 1000, sign = 2.0,
           conso = TRUE, n.indi = 25, row.w = rep(1, nrow(dfact)))
## S3 method for class 'FactoClass'
print(x, ...)
analisis.clus(X,W)

Arguments

dfact

object of class data.frame, with the data of active variables.

metodo

ade4 function that performs the factorial analysis: dudi.pca, Principal Component Analysis; dudi.coa, Correspondence Analysis; witwit.coa, Internal Correspondence Analysis; dudi.acm, Multiple Correspondence Analysis; ...

dfilu

illustrative variables (default NULL)

nf

number of axes to keep in the factorial analysis (default 2)

nfcl

number of axes to use in the classification (default 10)

k.clust

number of classes to form (default 3)

scanFC

if TRUE, prompts in the console for the values of nf, nfcl and k.clust (default TRUE)

n.max

when the number of rows of dfact is >= n.max, k-means is performed before the hierarchical clustering (default 5000)

n.clus

when the number of rows of dfact is >= n.max, the preliminary k-means is performed with n.clus groups (default 1000)

sign

threshold on the test value above which characteristic variables and modalities are shown (default 2.0)

conso

when conso is TRUE, the process of consolidating the classification is performed (default TRUE)

n.indi

number of indices to draw in the histogram (default 25)

row.w

vector of row weights, used when metodo is not dudi.coa (default rep(1, nrow(dfact)), i.e. equal weights)

x

object of class FactoClass

...

further arguments passed to or from other methods

X

coordinates of the elements of a class

W

weights of the elements of a class
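The interplay of n.max and n.clus described above can be pictured with a minimal sketch in base R (stats only). It is not the FactoClass implementation: the random matrix coords stands in for factorial coordinates, the threshold values are deliberately small so that the preliminary step runs, and weighting the group centres by their sizes through the members argument of hclust is an assumption about how such a two-stage scheme is usually set up.

```r
# Sketch only: two-stage clustering for large tables, base R (stats).
set.seed(1)
coords <- matrix(rnorm(2000 * 3), ncol = 3)  # stand-in for factorial coordinates
n.max  <- 1000   # illustrative threshold, much smaller than the default 5000
n.clus <- 50     # illustrative number of preliminary k-means groups

if (nrow(coords) >= n.max) {
  # Preliminary k-means reduces the rows to n.clus group centres
  pre <- kmeans(coords, centers = n.clus, iter.max = 50)
  # Ward clustering of the centres, each one weighted by its group size
  tree <- hclust(dist(pre$centers), method = "ward.D2", members = pre$size)
  groups  <- cutree(tree, k = 3)            # class of each preliminary group
  cluster <- unname(groups[pre$cluster])    # class of each original row
} else {
  tree    <- hclust(dist(coords), method = "ward.D2")
  cluster <- unname(cutree(tree, k = 3))
}
table(cluster)
```

The point of the preliminary step is cost: the distance matrix is built on n.clus centres instead of on every row of the table.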

Details

Lebart et al. (1995) present a strategy for analyzing a data table with multivariate methods: an initial factorial analysis suited to the nature of the data, followed by mixed clustering. The mixed clustering combines hierarchical clustering by Ward's method with K-means clustering. The result is a partition of the data set and a characterization of each class by the active and illustrative variables, which may be quantitative, qualitative or frequency variables.

FactoClass is a function that connects procedures from the package ade4, for the factorial analysis of the data, with procedures from stats, for the cluster analysis.
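The strategy can be sketched with base R alone. This is an illustration under stated assumptions, not the code of FactoClass: prcomp stands in for an ade4 dudi function, the built-in USArrests table is used as data, and the values of nfcl and k.clust are arbitrary.

```r
# Sketch only: factorial analysis, Ward clustering, k-means consolidation.
X   <- scale(USArrests)            # built-in data set, standardized
pca <- prcomp(X)                   # stand-in for an ade4 dudi.* analysis
nfcl    <- 2                       # illustrative number of axes for clustering
k.clust <- 3                       # illustrative number of classes
coords  <- pca$x[, 1:nfcl, drop = FALSE]

# 1. Ward hierarchical clustering on the first nfcl factorial coordinates
tree    <- hclust(dist(coords), method = "ward.D2")
initial <- cutree(tree, k = k.clust)

# 2. Consolidation: k-means started from the centroids of the Ward classes
centers <- apply(coords, 2, function(v) tapply(v, initial, mean))
conso   <- kmeans(coords, centers = centers)

# Cross-tabulate the partition before and after consolidation
table(ward = initial, consolidated = conso$cluster)
```

Starting k-means from the Ward centroids keeps the class labels comparable, so the cross-table shows directly which elements changed class during consolidation.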

The function analisis.clus calculates the geometric characteristics of each class: size, inertia, weight and square distance to the origin.
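As an illustration of these quantities, the sketch below computes them for one class from its coordinates X and weights W. The helper class_geometry is hypothetical and is not the source of analisis.clus; in particular, taking the inertia as the weighted sum of squared distances to the weighted centroid, without further normalization, is an assumption.

```r
# Hypothetical helper, not analisis.clus itself.
class_geometry <- function(X, W) {
  X <- as.matrix(X)
  g <- colSums(X * W) / sum(W)         # weighted centroid of the class
  dev2 <- rowSums(sweep(X, 2, g)^2)    # squared distance of each element to g
  c(size    = nrow(X),                 # number of elements
    weight  = sum(W),                  # total weight
    inertia = sum(W * dev2),           # weighted within-class inertia
    dist2   = sum(g^2))                # squared distance of g to the origin
}

X <- rbind(c(0, 0), c(2, 0), c(1, 3))  # three elements in two dimensions
W <- c(1, 1, 2)
res <- class_geometry(X, W)
res
```

For this toy class the centroid is (1, 1.5), so the inertia is 11 and the squared distance to the origin is 3.25.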

To export the results in LaTeX format, see FactoClass.tex.

To draw factorial planes with the clusters, see plotFactoClass.

Value

object of class FactoClass with the following components:

dudi

object of class dudi from ade4 with the specifications of the factorial analysis

nfcl

number of axes selected for the classification

k

number of classes

indices

table of indices obtained through Ward's method

cor.clus

coordinates of the clusters

clus.summ

summary of the clusters

cluster

vector indicating the cluster of each element

carac.cate

cluster characterization by qualitative variables

carac.cont

cluster characterization by quantitative variables

carac.frec

cluster characterization by frequency active variables
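To give an idea of how a characterization by quantitative variables relates to the sign threshold, the sketch below computes a test value for a continuous variable in each class and keeps those whose absolute value reaches the threshold. The helper test_value is hypothetical; it uses the usual test-value formula from the Lebart et al. approach, and whether FactoClass computes carac.cont exactly this way is an assumption. The iris species play the role of clusters.

```r
# Hypothetical helper: test value of a continuous variable x in class k.
test_value <- function(x, cluster, k) {
  n    <- length(x)
  in_k <- cluster == k
  nk   <- sum(in_k)
  s2   <- mean((x - mean(x))^2)        # overall variance with divisor n
  # Standardized gap between the class mean and the overall mean
  (mean(x[in_k]) - mean(x)) / sqrt(s2 / nk * (n - nk) / (n - 1))
}

threshold <- 2.0                       # plays the role of the sign argument
tv <- sapply(levels(iris$Species),
             function(k) test_value(iris$Sepal.Length, iris$Species, k))
tv[abs(tv) >= threshold]               # classes characterized by this variable
```

A large positive test value means the class mean lies well above the overall mean, a large negative one well below.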

Author(s)

Pedro Cesar del Campo pcdelcampon@unal.edu.co, Campo Elias Pardo cepardot@unal.edu.co http://www.docentes.unal.edu.co/cepardot, Ivan Diaz ildiazm@unal.edu.co, Mauricio Sadinle msadinleg@unal.edu.co

References

Lebart, L., Morineau, A. and Piron, M. (1995) Statistique exploratoire multidimensionnelle, Paris.

Examples

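# Cluster analysis with Correspondence Analysis
library(FactoClass)
data(ColorAdjective)
# scanFC = TRUE by default: the three numbers below are the answers typed
# at the console prompts (values for nf, nfcl and k.clust)
FC.col <- FactoClass(ColorAdjective, dudi.coa)
6
10
5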

FC.col

FC.col$dudi


# Cluster analysis with Multiple Correspondence Analysis
data(BreedsDogs)

BD.act <- BreedsDogs[-7]  # active variables
BD.ilu <- BreedsDogs[7]   # illustrative variables

FC.bd <- FactoClass(BD.act, dudi.acm, k.clust = 4,
                    scanFC = FALSE, dfilu = BD.ilu, nfcl = 10)

FC.bd

FC.bd$clus.summ
FC.bd$indices