# phi: Effect size for contingency tables

In effectsize: Indices of Effect Size and Standardized Parameters

## Description

Compute Cramer's V, phi (φ), Cohen's w (an alias of phi), Pearson's contingency coefficient, Odds ratios, Risk ratios, Cohen's h and Cohen's g for contingency tables or goodness-of-fit. See details.

## Usage

```
phi(x, y = NULL, ci = 0.95, alternative = "greater", adjust = FALSE, ...)

cohens_w(x, y = NULL, ci = 0.95, alternative = "greater", adjust = FALSE, ...)

cramers_v(x, y = NULL, ci = 0.95, alternative = "greater", adjust = FALSE, ...)

pearsons_c(
  x,
  y = NULL,
  ci = 0.95,
  alternative = "greater",
  adjust = FALSE,
  ...
)

oddsratio(x, y = NULL, ci = 0.95, alternative = "two.sided", log = FALSE, ...)

riskratio(x, y = NULL, ci = 0.95, alternative = "two.sided", log = FALSE, ...)

cohens_h(x, y = NULL, ci = 0.95, alternative = "two.sided", ...)

cohens_g(x, y = NULL, ci = 0.95, alternative = "two.sided", ...)
```

## Arguments

• `x`: a numeric vector or matrix. `x` and `y` can also both be factors.

• `y`: a numeric vector; ignored if `x` is a matrix. If `x` is a factor, `y` should be a factor of the same length.

• `ci`: Confidence Interval (CI) level.

• `alternative`: a character string specifying the alternative hypothesis; controls the type of CI returned: `"greater"` (one-sided CI; default for Cramer's V, phi (φ), Cohen's w and Pearson's C), `"two.sided"` (two-sided CI; default for OR, RR, Cohen's h and Cohen's g) or `"less"` (one-sided CI). Partial matching is allowed (e.g., `"g"`, `"l"`, `"two"`...). See One-Sided CIs in effectsize_CIs.

• `adjust`: Should the effect size be bias-corrected? Defaults to `FALSE`.

• `...`: Arguments passed to `stats::chisq.test()`, such as `p`. Ignored for `cohens_g()`.

• `log`: Take in or output the log of the ratio (such as in logistic models).

## Details

Cramer's V, phi (φ) and Pearson's C are effect sizes for tests of independence in 2D contingency tables. For 2-by-2 tables, Cramer's V and phi are identical, and are equal to the simple correlation between two dichotomous variables, ranging between 0 (no dependence) and 1 (perfect dependence). For larger tables, Cramer's V or Pearson's C should be used, as they are bounded between 0 and 1, whereas phi can be larger than 1 (its upper bound is `sqrt(min(nrow, ncol) - 1)`).
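The relationship between the three indices can be sketched in base R using their usual textbook definitions in terms of the chi-squared statistic. This is an illustration only, not necessarily how the package computes them internally; the names `phi_hat`, `v_hat`, `c_hat` and `phi_max` are made up for this sketch.

```r
# Music-by-study table (4 rows, 3 columns), same counts as in the examples
M <- matrix(c(150, 100, 165, 130, 50, 65, 35, 10, 2, 55, 40, 25), nrow = 4)

# Pearson chi-squared statistic for the test of independence
chi2 <- unname(chisq.test(M)$statistic)
n <- sum(M)                    # total number of observations
k <- min(nrow(M), ncol(M))     # smaller table dimension

# Textbook definitions (assumed here for illustration)
phi_hat <- sqrt(chi2 / n)
v_hat   <- sqrt(chi2 / (n * (k - 1)))
c_hat   <- sqrt(chi2 / (chi2 + n))

# Upper bound of phi for a table of this size
phi_max <- sqrt(k - 1)

c(phi = phi_hat, cramers_v = v_hat, pearsons_c = c_hat, phi_max = phi_max)
```

Under these definitions, Cramer's V is phi rescaled by its upper bound, which is why the two coincide whenever the smaller table dimension is 2.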

For goodness-of-fit in 1D tables Pearson's C or phi can be used. Phi has no upper bound (can be arbitrarily large, depending on the expected distribution), while Pearson's C is bounded between 0-1.
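A small base-R sketch shows how phi can exceed 1 in the goodness-of-fit case while Pearson's C stays below 1. The counts and expected proportions are hypothetical, and the formulas are the usual textbook definitions rather than a statement about the package internals.

```r
# Hypothetical 1D table: two categories observed equally often,
# although the first is expected only 10% of the time
observed   <- c(50, 50)
expected_p <- c(0.1, 0.9)

# Goodness-of-fit chi-squared statistic
chi2 <- unname(chisq.test(observed, p = expected_p)$statistic)
n <- sum(observed)

phi_hat <- sqrt(chi2 / n)           # larger than 1 for these data
c_hat   <- sqrt(chi2 / (chi2 + n))  # always below 1

c(phi = phi_hat, pearsons_c = c_hat)
```

The more extreme the expected distribution, the larger phi can grow for the same observed counts.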

For 2-by-2 contingency tables, Odds ratios, Risk ratios and Cohen's h can also be estimated. Note that these are computed with columns representing the groups: the first column is the treatment group and the second column is the baseline (or control) group. Effects are given as `treatment / control`. If you wish to use rows as groups, you must pass a transposed table or switch the `x` and `y` arguments.
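To make the column-as-group convention concrete, the following base-R sketch computes the three indices by hand from their standard definitions. It assumes the first row ("Sick") is the event of interest; that choice, and the variable names, are assumptions of this sketch.

```r
# Groups are COLUMNS: first column = treatment, second column = control
RCT <- matrix(c(71, 30,
                50, 100),
              nrow = 2, byrow = TRUE,
              dimnames = list(Diagnosis = c("Sick", "Recovered"),
                              Group = c("Treatment", "Control")))

# Proportion of the event (first row, assumed) within each column
p_treat <- RCT[1, "Treatment"] / sum(RCT[, "Treatment"])
p_ctrl  <- RCT[1, "Control"] / sum(RCT[, "Control"])

# Effects expressed as treatment / control
risk_ratio <- p_treat / p_ctrl
odds_ratio <- (p_treat / (1 - p_treat)) / (p_ctrl / (1 - p_ctrl))

# Cohen's h: difference between arcsine-transformed proportions
h_hat <- 2 * asin(sqrt(p_treat)) - 2 * asin(sqrt(p_ctrl))

c(risk_ratio = risk_ratio, odds_ratio = odds_ratio, cohens_h = h_hat)
```

Transposing the table with `t(RCT)` would instead treat the diagnosis categories as the groups, which answers a different question.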

Cohen's g is an effect size for dependent (paired) contingency tables ranging between 0 (perfect symmetry) and 0.5 (perfect asymmetry) (see `stats::mcnemar.test()`).

## Value

A data frame with the effect size (`Cramers_v`, `phi` (possibly with the suffix `_adjusted`), `Odds_ratio`, `Risk_ratio` (possibly with the prefix `log_`), `Cohens_h`, or `Cohens_g`) and its CIs (`CI_low` and `CI_high`).

## Confidence Intervals for Cohen's g, OR, RR and Cohen's h

For Cohen's g, confidence intervals are based on the confidence intervals for the proportion (P = g + 0.5) returned by `stats::prop.test()` (minus 0.5), which give a close approximation.
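For a 2-by-2 paired table, this can be sketched in base R as follows. The sketch assumes g is the larger share of the discordant pairs minus 0.5, and it uses the default settings of `stats::prop.test()` (including its continuity correction), which may differ from the options the package uses.

```r
# Paired (dependent) table: rows = 1st survey, columns = 2nd survey
Performance <- matrix(c(794, 150, 86, 570), nrow = 2,
                      dimnames = list("1st Survey" = c("Approve", "Disapprove"),
                                      "2nd Survey" = c("Approve", "Disapprove")))

# Discordant pairs: respondents who changed their answer
n_changed_a <- Performance["Approve", "Disapprove"]   # approve -> disapprove
n_changed_d <- Performance["Disapprove", "Approve"]   # disapprove -> approve
n_discordant <- n_changed_a + n_changed_d

# P is the larger share of discordant pairs; g = P - 0.5 (2-by-2 case)
P <- max(n_changed_a, n_changed_d) / n_discordant
g_hat <- P - 0.5

# CI for P from prop.test(), shifted by 0.5 to give a CI for g
g_ci <- as.numeric(
  prop.test(max(n_changed_a, n_changed_d), n_discordant,
            conf.level = 0.95)$conf.int
) - 0.5

c(g = g_hat, CI_low = g_ci[1], CI_high = g_ci[2])
```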

For Odds ratios, Risk ratios and Cohen's h, confidence intervals are estimated using the standard normal parametric method (see Katz et al., 1978; Szumilas, 2010).
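The normal-approximation approach can be sketched by hand in base R: the interval is built on the log scale with the usual large-sample standard errors and then exponentiated. This is a sketch of the general method under those textbook formulas; the package's exact implementation may differ in details. The cell names are made up for the illustration, and the first row is assumed to be the event.

```r
# Cell counts from the RCT example (groups are columns)
sick_t <- 71;  sick_c <- 30    # event counts: treatment, control
rec_t  <- 50;  rec_c  <- 100   # non-event counts: treatment, control
n_t <- sick_t + rec_t          # treatment group size
n_c <- sick_c + rec_c          # control group size

z <- qnorm(0.975)              # critical value for a two-sided 95% CI

# Odds ratio: CI on the log scale, then back-transformed
log_or    <- log((sick_t * rec_c) / (sick_c * rec_t))
se_log_or <- sqrt(1 / sick_t + 1 / sick_c + 1 / rec_t + 1 / rec_c)
or_ci     <- exp(log_or + c(-1, 1) * z * se_log_or)

# Risk ratio: same idea with the log risk ratio standard error
log_rr    <- log((sick_t / n_t) / (sick_c / n_c))
se_log_rr <- sqrt(1 / sick_t - 1 / n_t + 1 / sick_c - 1 / n_c)
rr_ci     <- exp(log_rr + c(-1, 1) * z * se_log_rr)

rbind(odds_ratio = c(estimate = exp(log_or), CI_low = or_ci[1], CI_high = or_ci[2]),
      risk_ratio = c(estimate = exp(log_rr), CI_low = rr_ci[1], CI_high = rr_ci[2]))
```

Working on the log scale keeps both bounds positive and makes the interval symmetric around the log of the estimate.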

See Confidence (Compatibility) Intervals (CIs), CIs and Significance Tests, and One-Sided CIs sections for phi, Cohen's w, Cramer's V and Pearson's C.

## Confidence (Compatibility) Intervals (CIs)

Unless stated otherwise, confidence (compatibility) intervals (CIs) are estimated using the noncentrality parameter method (also called the "pivot method"). This method finds the noncentrality parameter ("ncp") of a noncentral t, F, or χ^2 distribution that places the observed t, F, or χ^2 test statistic at the desired probability point of the distribution. For example, if the observed t statistic is 2.0, with 50 degrees of freedom, for which cumulative noncentral t distribution is t = 2.0 the .025 quantile (answer: the noncentral t distribution with ncp = .04)? After estimating these confidence bounds on the ncp, they are converted into the effect size metric to obtain a confidence interval for the effect size (Steiger, 2004).
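For the χ^2 case relevant to phi, the idea can be sketched in base R with `stats::pchisq()` and `stats::uniroot()`. The observed statistic, degrees of freedom and sample size below are hypothetical, the helper `ncp_bound()` is made up for this sketch, and the conversion `sqrt(ncp / n)` simply mirrors the textbook definition of phi. It shows a two-sided interval and is not the package's actual code.

```r
# Hypothetical observed test result
chi2 <- 10    # observed chi-squared statistic
df   <- 1     # degrees of freedom
n    <- 200   # sample size
alpha <- 0.05

# Find the ncp at which the observed chi2 sits at probability `prob`
# of the noncentral chi-squared distribution (hypothetical helper)
ncp_bound <- function(prob) {
  f <- function(ncp) pchisq(chi2, df, ncp = ncp) - prob
  # If even ncp = 0 puts less than `prob` below chi2, the bound is 0
  if (f(0) < 0) return(0)
  uniroot(f, interval = c(0, 10 * chi2 + 100), tol = 1e-10)$root
}

ncp_low  <- ncp_bound(1 - alpha / 2)  # chi2 is the .975 quantile
ncp_high <- ncp_bound(alpha / 2)      # chi2 is the .025 quantile

# Convert the ncp bounds to the effect size metric
phi_ci <- sqrt(c(ncp_low, ncp_high) / n)
phi_ci
```

A one-sided interval would use a single call with `1 - alpha` instead of splitting alpha across the two tails.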

For additional details on estimation and troubleshooting, see effectsize_CIs.

## CIs and Significance Tests

"Confidence intervals on measures of effect size convey all the information in a hypothesis test, and more." (Steiger, 2004). Confidence (compatibility) intervals and p values are complementary summaries of parameter uncertainty given the observed data. A dichotomous hypothesis test could be performed with either a CI or a p value. The 100 (1 - α)% confidence interval contains all of the parameter values for which p > α for the current data and model. For example, a 95% confidence interval contains all of the values for which p > .05.
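The correspondence between CIs and p values can be checked numerically. The sketch below uses a one-sample t test from base R on made-up data, purely because its CI and test are exact inverses of each other; the same logic is assumed to carry over to effect size CIs.

```r
# Made-up sample
x <- c(2.1, 3.4, 1.9, 4.2, 2.8, 3.1, 3.9, 2.5)

# Two-sided 95% CI for the mean
ci <- as.numeric(t.test(x, conf.level = 0.95)$conf.int)

# Testing against each CI bound gives p equal to alpha (up to rounding error)
p_at_lower <- t.test(x, mu = ci[1])$p.value
p_at_upper <- t.test(x, mu = ci[2])$p.value

# Values inside the interval have p > .05; values outside have p < .05
p_inside  <- t.test(x, mu = mean(ci))$p.value
p_outside <- t.test(x, mu = ci[2] + 1)$p.value

c(lower = p_at_lower, upper = p_at_upper,
  inside = p_inside, outside = p_outside)
```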

Note that a confidence interval including 0 does not indicate that the null (no effect) is true. Rather, it suggests that the observed data, together with the model and its assumptions, do not provide clear evidence against a parameter value of 0 (the same holds for any other value in the interval), with the level of this evidence defined by the chosen α level (Rafi & Greenland, 2020; Schweder & Hjort, 2016; Xie & Singh, 2013). To infer no effect, additional judgments about what parameter values are "close enough" to 0 to be negligible are needed ("equivalence testing"; Bauer & Kiesser, 1996).

## References

• Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd Ed.). New York: Routledge.

• Katz, D. J. S. M., Baptista, J., Azen, S. P., & Pike, M. C. (1978). Obtaining confidence intervals for the risk ratio in cohort studies. Biometrics, 469-474.

• Szumilas, M. (2010). Explaining odds ratios. Journal of the Canadian academy of child and adolescent psychiatry, 19(3), 227.

## See Also

`chisq_to_phi()` for details regarding estimation and CIs.

Other effect size indices: `cohens_d()`, `effectsize()`, `eta_squared()`, `rank_biserial()`, `standardize_parameters()`

## Examples

```
library(effectsize)

M <- matrix(c(150, 100, 165, 130, 50, 65, 35, 10, 2, 55, 40, 25),
            nrow = 4,
            dimnames = list(
              Music = c("Pop", "Rock", "Jazz", "Classic"),
              Study = c("Psych", "Econ", "Law")))
M

# Note that Phi is not bound to [0-1], but instead
# the upper bound for phi is sqrt(min(nrow, ncol) - 1)
phi(M)

cramers_v(M)

pearsons_c(M)


## 2-by-2 tables
## -------------
RCT <- matrix(c(71, 30,
                50, 100),
              nrow = 2, byrow = TRUE,
              dimnames = list(
                Diagnosis = c("Sick", "Recovered"),
                Group = c("Treatment", "Control")))
RCT # note groups are COLUMNS

oddsratio(RCT)
oddsratio(RCT, alternative = "greater")

riskratio(RCT)

cohens_h(RCT)


## Dependent (Paired) Contingency Tables
## -------------------------------------
Performance <- matrix(c(794, 150, 86, 570),
                      nrow = 2,
                      dimnames = list(
                        "1st Survey" = c("Approve", "Disapprove"),
                        "2nd Survey" = c("Approve", "Disapprove")))
Performance

cohens_g(Performance)
```