discriminatory_crit: Selection of ICS components based on discriminatory power

View source: R/select_crit.R

discriminatory_crit    R Documentation

Selection of ICS components based on discriminatory power

Description

Identifies the invariant coordinates associated with the highest discriminatory power. The only measure currently implemented is "eta2", which quantifies discriminatory power by Wilks' partial eta-squared, computed using the heplots::etasq() function.

Usage

discriminatory_crit(object, ...)

## S3 method for class 'ICS'
discriminatory_crit(
  object,
  clusters,
  method = "eta2",
  nb_select = NULL,
  select_only = FALSE,
  ...
)

## Default S3 method:
discriminatory_crit(
  object,
  clusters,
  method = "eta2",
  nb_select = NULL,
  select_only = FALSE,
  gen_kurtosis = NULL,
  ...
)

Arguments

object

a data frame or an object of class "ICS".

...

additional arguments are currently ignored.

clusters

a vector with one entry per observation, indicating the true clusters. The discriminatory power is computed with respect to these clusters.

method

the name of the discriminatory power measure. Only "eta2" is implemented.

nb_select

the exact number of components to select. The default is NULL, i.e., the number of components to select is the number of clusters minus one.

select_only

boolean. If TRUE, only a vector of the names of the selected invariant components is returned. If FALSE, additional details are returned.

gen_kurtosis

vector of generalized kurtosis values.

Details

The discriminatory power is evaluated for each combination of nb_select components taken from the first and/or last components. The combination achieving the highest discriminatory power is selected.
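As an illustration of which candidate sets this implies, the sketch below enumerates every way of taking j first components and nb_select - j last components. It reflects one reading of the description above, not the package's internal code: the helper name first_last_combinations and its arguments p (total number of components) and k (number to select, assumed not larger than p) are hypothetical.

```r
# Hypothetical helper, not part of ICSClust: list the index sets made of
# the j first and the (k - j) last of p components, for j = 0, ..., k.
first_last_combinations <- function(p, k) {
  lapply(0:k, function(j) {
    first <- seq_len(j)                  # indices 1, ..., j
    last  <- rev(p + 1 - seq_len(k - j)) # indices p - (k - j) + 1, ..., p
    c(first, last)
  })
}

# For p = 4 and k = 2 the candidates are {3, 4}, {1, 4} and {1, 2}.
first_last_combinations(p = 4, k = 2)
```

With k components to select there are k + 1 candidate sets under this reading, so scoring all of them stays cheap even for many components.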

More specifically, we compute \eta^{2} = 1 - \Lambda^{1/s}, where \Lambda denotes Wilks' lambda:

\Lambda = \frac{\det(E)}{\det(T)},

where E is the within-group sum of squares and cross-products (SSCP) matrix, H is the between-group SSCP matrix, and T = H + E is the total SSCP matrix. The exponent uses s = \min(p, df_h), with p being the number of latent roots of HE^{-1} and df_h the degrees of freedom of the hypothesis. See heplots::etasq() for more details.
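The formula can be checked by hand with base R. The sketch below is not the package's implementation: it uses the raw iris variables in place of invariant components, and it assumes that p can be taken as the number of response columns and that df_h is the number of clusters minus one. The names E_mat, T_mat and wilks_ref are illustrative.

```r
# Sketch only: base R, no ICSClust or heplots calls.
X <- as.matrix(iris[, 1:4])  # responses (assumption: raw variables stand in for components)
g <- iris$Species            # known clusters

fit   <- lm(X ~ g)                           # one-way multivariate linear model
E_mat <- crossprod(residuals(fit))           # within-group SSCP matrix E
T_mat <- crossprod(scale(X, scale = FALSE))  # total SSCP matrix T = H + E

Lambda <- det(E_mat) / det(T_mat)            # Wilks' lambda

# Assumption: p = number of response columns, df_h = number of clusters - 1.
s    <- min(ncol(X), nlevels(g) - 1)
eta2 <- 1 - Lambda^(1 / s)

# Cross-check Lambda against the Wilks statistic reported by stats::manova().
wilks_ref <- summary(manova(X ~ g), test = "Wilks")$stats["g", "Wilks"]
all.equal(Lambda, unname(wilks_ref))
```

Values of eta2 close to 1 indicate that the chosen coordinates separate the clusters well, which is why the combination with the largest value is retained.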

Value

If select_only is TRUE, a vector of the names of the invariant components or variables to select. If FALSE, an object of class "ICS_crit" is returned with the following components:

  • crit: the name of the criterion "discriminatory".

  • method: the name of the discriminatory power measure.

  • nb_select: the number of components to select.

  • select: the names of the invariant components or variables to select.

  • power_combinations: the discriminatory values for each of the considered combinations of nb_select components.

  • gen_kurtosis: the vector of generalized kurtosis values, in the case of an ICS object.

Author(s)

Aurore Archimbaud and Anne Ruiz-Gazen

References

Alfons, A., Archimbaud, A., Nordhausen, K., & Ruiz-Gazen, A. (2024). Tandem clustering with invariant coordinate selection. Econometrics and Statistics. doi:10.1016/j.ecosta.2024.03.002.

Muller, K. E., & Peterson, B. L. (1984). Practical methods for computing power in testing the multivariate general linear hypothesis. Computational Statistics and Data Analysis, 2, 143-158.

See Also

normal_crit(), med_crit(), var_crit(), heplots::etasq().

Examples

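library(ICS)       # provides ICS()
library(ICSClust)  # provides discriminatory_crit()
X <- iris[, -5]    # numeric measurements only
out <- ICS(X)
# Select the components that best separate the known species
discriminatory_crit(out, clusters = iris[, 5], select_only = FALSE)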

ICSClust documentation built on Aug. 8, 2025, 7:43 p.m.