iac: Intrinsic Association Coefficient

Description

Compute the intrinsic association coefficient of a table. This coefficient was first devised by Goodman (1996) as the “generalized contingency” when a logarithm link is used, and it is equal to the standard deviation of the log-linear two-way interaction parameters λ_{ij}. To obtain the Altham index, compute the coefficient with weighting = "none" and multiply the result by sqrt(nrow(tab) * ncol(tab)) * 2 (see “Examples” below).
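The relation with the Altham index can be checked numerically without the package. The sketch below is base R only and rests on two assumptions: that the Altham index relative to independence is the square root of the sum of squared log odds ratios over all pairs of rows and all pairs of columns (Altham and Ferrie, 2007), and that the unit-weight coefficient is the square root of the sum of squared double-centered log counts. The helpers altham_brute() and iac_unit() are hypothetical names, not logmult functions.

```r
# Hypothetical helper (not part of logmult): brute-force Altham index
# relative to independence, summing squared log odds ratios over all
# pairs of rows (i, k) and all pairs of columns (j, l).
altham_brute <- function(tab) {
  L <- log(tab)
  total <- 0
  for (i in seq_len(nrow(tab))) {
    for (k in seq_len(nrow(tab))) {
      for (j in seq_len(ncol(tab))) {
        for (l in seq_len(ncol(tab))) {
          total <- total + (L[i, j] + L[k, l] - L[i, l] - L[k, j])^2
        }
      }
    }
  }
  sqrt(total)
}

# Hypothetical helper: unit-weight coefficient, assumed here to be the
# square root of the sum of squared double-centered log counts.
iac_unit <- function(tab) {
  L <- log(tab)
  lambda <- L - outer(rowMeans(L), colMeans(L), "+") + mean(L)
  sqrt(sum(lambda^2))
}

tab <- matrix(c(260, 195, 158, 70,
                715, 3245, 874, 664,
                424, 454, 751, 246,
                142, 247, 327, 228), 4, 4)

altham <- altham_brute(tab)
scaled <- iac_unit(tab) * sqrt(nrow(tab) * ncol(tab)) * 2
all.equal(altham, scaled)
```

The two quantities agree because the cross terms in the expanded sum vanish under the zero-sum constraints on the interaction parameters, leaving four copies of the sum of squares multiplied by the number of cells.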

Usage

iac(tab, cell = FALSE,
    weighting = c("marginal", "uniform", "none"),
    component = c("total", "symmetric", "antisymmetric"),
    shrink = FALSE,
    normalize = FALSE,
    row.weights = NULL, col.weights = NULL)

Arguments

tab

a two- or three-way table without zero cells; for three-way tables, average marginal weighting is used when “weighting = "marginal"”, and the coefficient is computed for each layer (third dimension).

cell

if “TRUE”, return the per-cell contributions (these depend on the weights used, see “Details” below).

weighting

which weights to use when computing the coefficient (see “Details” below).

component

whether to compute the coefficient from the total association, or from the symmetric or antisymmetric interaction coefficients only.

shrink

whether to use the empirical Bayes shrinkage estimator proposed by Zhou (2015) rather than the direct estimator.

normalize

whether to return the normalized version of the index varying between 0 and 1 proposed by Bouchet-Valat (2022) rather than the classic index varying between 0 and positive infinity.

row.weights

optional custom weights to be used for rows, e.g. to compute the coefficient for several tables using their overall marginal distribution. If specified, weighting is ignored.

col.weights

see row.weights.

Details

See Goodman (1996), Equation 52 for the (marginal or other) weighted version of the intrinsic association coefficient (\tilde\lambda); the unweighted version can be computed with unit weights. The coefficient should not be confused with Goodman and Kruskal's lambda coefficient. The uniform-weighted version is defined as:

\lambda^\dagger = \sqrt{ \frac{1}{IJ} \sum_{i=1}^{I} \sum_{j=1}^{J} \lambda_{ij}^2 }

The (marginal or other) weighted version is defined as:

\tilde\lambda = \sqrt{ \sum_{i=1}^{I} \sum_{j=1}^{J} \tilde\lambda_{ij}^2 P_{i+} P_{+j} }

with \sum_{i=1}^{I} \lambda_{ij} = \sum_{j=1}^{J} \lambda_{ij} = 0 and \sum_{i=1}^{I} P_{i+} \tilde\lambda_{ij} = \sum_{j=1}^{J} P_{+j} \tilde\lambda_{ij} = 0.
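As an illustration of these definitions, the base R sketch below computes both versions for a small made-up table. It assumes that the interaction parameters are those of the saturated log-linear model, so that they can be obtained by (weighted) double centering of the log counts, which is what the constraints above imply. The helper centered_log() and the data are hypothetical, and this is not the package's implementation.

```r
# Hypothetical helper (not logmult code): interaction parameters obtained by
# weighted double centering of the log counts, with row weights rw and
# column weights cw that each sum to 1.
centered_log <- function(tab, rw, cw) {
  L <- log(tab)
  row_part <- as.vector(L %*% cw)   # weighted mean of each row
  col_part <- as.vector(rw %*% L)   # weighted mean of each column
  grand <- sum(rw * row_part)       # weighted grand mean
  L - outer(row_part, col_part, "+") + grand
}

# Made-up 3 x 3 table for illustration only
tab <- matrix(c(20, 10,  5,
                10, 30, 10,
                 5, 15, 40), 3, 3, byrow = TRUE)
n_row <- nrow(tab)
n_col <- ncol(tab)

# Uniform weights: lambda^dagger
lam_u <- centered_log(tab, rep(1 / n_row, n_row), rep(1 / n_col, n_col))
lambda_dagger <- sqrt(sum(lam_u^2) / (n_row * n_col))

# Marginal weights: tilde lambda
p_row <- rowSums(tab) / sum(tab)
p_col <- colSums(tab) / sum(tab)
lam_m <- centered_log(tab, p_row, p_col)
lambda_tilde <- sqrt(sum(lam_m^2 * outer(p_row, p_col)))

c(uniform = lambda_dagger, marginal = lambda_tilde)
```

The centered parameters satisfy the stated constraints: rows and columns of lam_u sum to zero, and the P-weighted row and column sums of lam_m are zero.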

The normalized version of the index is defined from \lambda^\dagger or \tilde\lambda (either one written \lambda here) as:

\tau = \sqrt{ 1 + \frac{1}{(2\lambda)^2} } - \frac{1}{2\lambda}
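A minimal base R sketch of this transformation follows. The function name normalize_iac() is hypothetical, and returning 0 when the coefficient is 0 is an assumption based on the limit of the formula, since evaluating it directly at 0 would give Inf - Inf.

```r
# Hypothetical helper: map a coefficient lambda in [0, Inf) to [0, 1)
# using the formula above. The value at lambda == 0 is set to the limit 0.
normalize_iac <- function(lambda) {
  ifelse(lambda == 0,
         0,
         sqrt(1 + 1 / (2 * lambda)^2) - 1 / (2 * lambda))
}

tau <- normalize_iac(c(0, 0.5, 1, 5, 50))
tau
```

For a coefficient of 0.5 the formula reduces to sqrt(2) - 1, and the result increases towards 1 as the coefficient grows.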

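Per-cell contributions are defined so that the coefficient is the square root of their sum. For the unweighted case, \lambda^\dagger = \sqrt{ \sum_{i=1}^{I} \sum_{j=1}^{J} c_{ij} } with c_{ij} = \lambda_{ij}^2 / (IJ); for the weighted case, \tilde\lambda = \sqrt{ \sum_{i=1}^{I} \sum_{j=1}^{J} \tilde c_{ij} } with \tilde c_{ij} = \tilde\lambda_{ij}^2 P_{i+} P_{+j}.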
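The decomposition can be sketched in base R for the uniform-weighted case, again assuming that the interaction parameters are the double-centered log counts. The table is made up for illustration, and the variable names are not part of logmult.

```r
# Made-up 3 x 3 table for illustration only
tab <- matrix(c(20, 10,  5,
                10, 30, 10,
                 5, 15, 40), 3, 3, byrow = TRUE)
L <- log(tab)

# Uniform-weighted interaction parameters: plain double centering
lambda <- L - outer(rowMeans(L), colMeans(L), "+") + mean(L)

# Per-cell contributions for the uniform-weighted case
contrib <- lambda^2 / (nrow(tab) * ncol(tab))

# The coefficient is recovered as the square root of their sum
coef_from_cells <- sqrt(sum(contrib))

# Share of the squared coefficient attributable to each cell
share <- contrib / sum(contrib)
round(share, 3)
```

Expressing the contributions as shares of their total makes it easier to see which cells drive the association.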

This index cannot be computed in the presence of zero cells since it is based on the logarithm of proportions. In that case, 0.5 is added to all cells of the table (Agresti 2002, sec. 9.8.7, p. 397; Berkson 1955), and a warning is printed. Make sure this correction does not affect the results too much (especially with small samples) by manually adding different values before calling this function.
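One way to run such a sensitivity check is sketched below in base R. The helper coef_uniform() is a hypothetical stand-in for the uniform-weighted coefficient (with the package loaded, iac() would be called on the corrected table instead), and the table and the set of constants are made up for illustration.

```r
# Hypothetical stand-in for the uniform-weighted coefficient:
# root mean square of the double-centered log counts.
coef_uniform <- function(tab) {
  L <- log(tab)
  lambda <- L - outer(rowMeans(L), colMeans(L), "+") + mean(L)
  sqrt(mean(lambda^2))
}

# Made-up table with one zero cell
tab0 <- matrix(c(12, 3, 0,
                  4, 9, 2,
                  1, 5, 8), 3, 3, byrow = TRUE)

# Without a correction, the zero cell makes the result non-finite
raw <- coef_uniform(tab0)

# Compare the coefficient across several added constants
adds <- c(0.1, 0.25, 0.5, 1)
sens <- sapply(adds, function(a) coef_uniform(tab0 + a))
names(sens) <- adds
sens
```

If the values differ substantially across constants, the coefficient is being driven by the correction rather than by the data.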

Value

The numeric value of the intrinsic association coefficient (if cell = FALSE), or the corresponding per-cell contributions (if cell = TRUE).

Author(s)

Milan Bouchet-Valat

References

Agresti, A. (2002). Categorical Data Analysis. New York: Wiley.

Altham, P. M. E., and Ferrie, J. P. (2007). Comparing Contingency Tables: Tools for Analyzing Data from Two Groups Cross-Classified by Two Characteristics. Historical Methods 40(1):3-16.

Berkson, J. (1955). Maximum Likelihood and Minimum chi2 Estimates of the Logistic Function. J. of the Am. Stat. Ass. 50(269):130-162.

Bouchet-Valat, M. (2022). General Marginal-free Association Indices for Contingency Tables: From the Altham Index to the Intrinsic Association Coefficient. Sociological Methods & Research 51(1):203-236.

Goodman, L. A. (1996). A Single General Method for the Analysis of Cross-Classified Data: Reconciliation and Synthesis of Some Methods of Pearson, Yule, and Fisher, and Also Some Methods of Correspondence Analysis and Association Analysis. J. of the Am. Stat. Ass. 91(433):408-428.

Zhou, X. (2015). Shrinkage Estimation of Log-Odds Ratios for Comparing Mobility Tables. Sociological Methodology 45(1):320-356.

See Also

unidiff, rc, maor

Examples

  library(logmult)

  # Altham index (Altham and Ferrie, 2007, Table 1, p. 3 and commentary p. 8)
  # Note: matrix() fills the table column by column
  tab1 <- matrix(c(260, 195, 158, 70,
                   715, 3245, 874, 664,
                   424, 454, 751, 246,
                   142, 247, 327, 228), 4, 4)
  iac(tab1, weighting="none") * sqrt(nrow(tab1) * ncol(tab1)) * 2

  # Zhou (2015)
  data(hg16)
  # Add 0.5 due to the presence of zero cells
  hg16 <- hg16 + 0.5
  # Figure 3, p. 343: left column then right column
  # (reported values are actually twice the Altham index)
  iac(hg16, weighting="none") * sqrt(nrow(hg16) * ncol(hg16)) * 2 * 2
  iac(hg16, weighting="none", shrink=TRUE) * sqrt(nrow(hg16) * ncol(hg16)) * 2 * 2
  # Table 4, p. 347: values are not exactly the same
  u <- unidiff(hg16)
  # First row
  cor(u$unidiff$layer$qvframe$estimate, iac(hg16, weighting="none"))
  cor(u$unidiff$layer$qvframe$estimate, iac(hg16, weighting="none"),
      method="spearman")
  # Second row
  cor(u$unidiff$layer$qvframe$estimate, iac(hg16, shrink=TRUE, weighting="none"))
  cor(u$unidiff$layer$qvframe$estimate, iac(hg16, shrink=TRUE, weighting="none"),
      method="spearman")

nalimilan/logmult documentation built on March 28, 2022, 1:18 p.m.