choose_question: Choice of the best binary question

Description

This function finds the best binary question to divide a cluster A into two subclusters, such that the bipartition (A_l, A_l_c) has maximum between-clusters inertia.

Usage

choose_question(X, Z, indices, vec_quali = c(), w = rep(1/nrow(Z), nrow(Z)),
  D = rep(1, ncol(Z)), vec_order)

Arguments

X

the data matrix of dimension (nxp) where p is equal to the number of numerical variables plus the number of categories. This matrix is used to construct the binary questions

Z

the numerical data matrix of dimension (nxk) used to compute the inertia criterion (the matrix of the principal components for instance)

indices

vector of indices for the cluster A to divide.

vec_quali

vector containing the number of categories of each qualitative variable (according to the categories observed in A_l)

w

weights vector

D

vector of the diagonal coefficients of the distance matrix

vec_order

logical vector containing TRUE for each variable whose categories are ordered

Details

This function works for categorical, numerical, and mixed data. It is the core function of the divclust algorithm. We seek the binary question that gives the best bipartition. A binary question is defined by a cutting variable (quantitative or qualitative) and a cutting value. For a quantitative variable, the cutting value is a real number. For a qualitative variable, the cutting value is a bipartition of the categories.
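
To make the criterion concrete, here is a minimal sketch of the search over cutting values for a single quantitative variable. It is not the package's implementation: `between_inertia` and `best_numeric_cut` are hypothetical helpers, and the inertia is assumed to be the usual weighted form mu_l * mu_c / (mu_l + mu_c) * d_D^2(g_l, g_c), where mu_l and mu_c are the cluster weights, g_l and g_c the weighted centroids computed on Z, and D the diagonal metric.

```r
# Hypothetical helper (not part of divclust): weighted between-clusters
# inertia of the bipartition (left, right) of the rows of Z.
between_inertia <- function(Z, left, right, w, D) {
  mu_l <- sum(w[left])
  mu_c <- sum(w[right])
  # Weighted centroids of the two subclusters
  g_l <- colSums(Z[left, , drop = FALSE] * w[left]) / mu_l
  g_c <- colSums(Z[right, , drop = FALSE] * w[right]) / mu_c
  # Assumed criterion: Ward-type distance between the centroids under metric D
  (mu_l * mu_c) / (mu_l + mu_c) * sum(D * (g_l - g_c)^2)
}

# Hypothetical helper: best binary question "x <= cut?" for one quantitative
# column x, restricted to the observations listed in `indices`.
best_numeric_cut <- function(x, Z, indices, w, D) {
  vals <- sort(unique(x[indices]))
  if (length(vals) < 2) return(NULL)  # no split possible
  # Candidate cutting values: midpoints between consecutive observed values
  cuts <- (head(vals, -1) + tail(vals, -1)) / 2
  best <- list(inert = -Inf)
  for (cv in cuts) {
    left <- indices[x[indices] <= cv]
    right <- indices[x[indices] > cv]
    inert <- between_inertia(Z, left, right, w, D)
    if (inert > best$inert) {
      best <- list(inert = inert, A_l = left, A_l_c = right, cut_val = cv)
    }
  }
  best
}

# Toy data: one variable with two well-separated groups
Z <- matrix(c(0, 0.2, 0.1, 5, 5.2, 5.1), ncol = 1)
w <- rep(1 / nrow(Z), nrow(Z))
D <- rep(1, ncol(Z))
res <- best_numeric_cut(Z[, 1], Z, indices = 1:6, w = w, D = D)
```

On this toy data the selected question separates observations 1 to 3 from observations 4 to 6. A full search would repeat this over every column of X and, for qualitative variables, over bipartitions of categories rather than real thresholds.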

Value

inert

the between-clusters inertia of the bipartition (A_l, A_l_c)

A_l

the vector of indices of the cluster A_l

A_l_c

the vector of indices of the cluster A_l_c

cut_ind

the index of the cutting variable

cut_val

a list with:

  • $type: the type of the cutting variable (quantitative or qualitative)

  • $value: a real value if the variable is quantitative and a bipartition of categories if the variable is qualitative. The first cluster of this bipartition of categories is given by $value$bipart, whereas the second cluster is given by $value$bipart_c
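
As an illustration of what a bipartition of categories looks like, the sketch below enumerates the candidate bipartitions for one qualitative variable. `category_bipartitions` is a hypothetical helper, not a divclust function, and the package may enumerate candidates differently. The field names `bipart` and `bipart_c` mirror the ones documented above. It assumes that ordered categories are only split along their order, which is one plausible reading of `vec_order`.

```r
# Hypothetical helper (not part of divclust): candidate bipartitions of a
# vector of category labels.
category_bipartitions <- function(categories, ordered = FALSE) {
  m <- length(categories)
  if (m < 2) return(list())  # a single category cannot be split
  if (ordered) {
    # Assumption: ordered categories admit only the m - 1 splits that
    # respect the order.
    return(lapply(seq_len(m - 1), function(k) {
      list(bipart = categories[1:k], bipart_c = categories[(k + 1):m])
    }))
  }
  # Unordered case: keep the first category in `bipart` so that each
  # bipartition is listed once, then add every proper subset of the others.
  others <- categories[-1]
  out <- list()
  for (code in 0:(2^(m - 1) - 2)) {
    # Binary digits of `code` select which of the other categories join `bipart`
    in_first <- (code %/% 2^(seq_along(others) - 1)) %% 2 == 1
    out[[length(out) + 1]] <- list(
      bipart = c(categories[1], others[in_first]),
      bipart_c = others[!in_first]
    )
  }
  out
}

unordered <- category_bipartitions(c("a", "b", "c"))
ordered <- category_bipartitions(c("low", "mid", "high"), ordered = TRUE)
```

With m categories this yields 2^(m - 1) - 1 candidates in the unordered case and m - 1 in the ordered case, which is why ordering information can shrink the search considerably.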


chavent/divclust documentation built on May 13, 2019, 3:38 p.m.