# pcc: Pearson's Contingency Coefficient (in scrime: Analysis of High-Dimensional Categorical Data Such as SNP Data)

## Description

Computes the values of (the corrected) Pearson's contingency coefficient for all pairs of rows of a matrix.

## Usage

```r
pcc(x, dist = FALSE, corrected = TRUE, version = 1)
```

## Arguments

| Argument | Description |
|---|---|
| `x` | a numeric matrix consisting of integers between 1 and `n.cat`, where `n.cat` is the maximum number of levels a variable in `x` can take. |
| `dist` | should the distance based on Pearson's contingency coefficient be computed? For how this distance is computed, see `version`. |
| `corrected` | should Pearson's contingency coefficient be corrected such that it can take values between 0 and 1? If not corrected, it takes values between 0 and sqrt((a - 1) / a), where a is the minimum of the numbers of levels that the respective two variables can take. Must be set to `TRUE` if `dist = TRUE`. |
| `version` | a numeric value (1, 2, or 3) specifying how the distance is computed. Ignored if `dist = FALSE`. If 1, sqrt(1 - Cont^2) is computed, where Cont denotes Pearson's contingency coefficient. If 2, 1 - Cont is determined, and if 3, 1 - Cont^2 is returned. |
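For reference, the quantities named in the arguments can be written out as below. The first line uses the standard textbook definition of Pearson's contingency coefficient in terms of the chi-squared statistic $\chi^2$ of the two-way table and the number of observations $n$; this page does not spell that formula out, so treat it as an assumption about what `pcc` computes. The corrected coefficient is taken to be the uncorrected one divided by its upper bound, and the distances are written with the corrected coefficient because `corrected` must be `TRUE` when `dist = TRUE`.

```latex
\begin{align*}
\mathrm{Cont} &= \sqrt{\frac{\chi^2}{\chi^2 + n}},
  & 0 \le \mathrm{Cont} &\le \sqrt{\frac{a - 1}{a}}, \\
\mathrm{Cont}_{\mathrm{corr}} &= \mathrm{Cont} \Big/ \sqrt{\frac{a - 1}{a}},
  & 0 \le \mathrm{Cont}_{\mathrm{corr}} &\le 1, \\
d_1 &= \sqrt{1 - \mathrm{Cont}_{\mathrm{corr}}^2}, \qquad
d_2 = 1 - \mathrm{Cont}_{\mathrm{corr}}, \qquad
d_3 = 1 - \mathrm{Cont}_{\mathrm{corr}}^2 .
\end{align*}
```

Here $d_1$, $d_2$, and $d_3$ correspond to `version = 1`, `2`, and `3`, respectively.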

## Value

A matrix with `nrow(x)` rows and columns containing the values of (or distances based on) the (corrected) Pearson's contingency coefficient for all pairs of rows of `x`.
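To make the shape of the return value concrete, here is a minimal base-R sketch that builds such a matrix without using scrime. The helper names `pcc_pair_sketch` and `pcc_matrix_sketch` are hypothetical, the coefficient is computed from the textbook chi-squared formula, and `a` is assumed to equal `max(x)` for every pair of rows. It is not the package's implementation and may differ from `pcc` in details such as the handling of unused levels or the diagonal.

```r
# Hypothetical helper (not part of scrime): contingency coefficient for two
# integer-coded vectors, using the textbook formula sqrt(chi2 / (chi2 + n)).
pcc_pair_sketch <- function(u, v, n_cat = max(u, v), corrected = TRUE) {
  # Cross-tabulate over all n_cat levels so that every table has the same size
  tab <- table(factor(u, levels = seq_len(n_cat)),
               factor(v, levels = seq_len(n_cat)))
  n <- sum(tab)
  expected <- outer(rowSums(tab), colSums(tab)) / n
  # Cells with an expected count of 0 contribute nothing to the statistic
  keep <- expected > 0
  chi2 <- sum((tab[keep] - expected[keep])^2 / expected[keep])
  cont <- sqrt(chi2 / (chi2 + n))
  if (corrected) {
    # Assumption: both variables can take all n_cat levels, so a = n_cat
    a <- n_cat
    cont <- cont / sqrt((a - 1) / a)
  }
  cont
}

# Hypothetical helper: apply the pairwise function to all pairs of rows of x
pcc_matrix_sketch <- function(x, dist = FALSE, corrected = TRUE, version = 1) {
  if (dist && !corrected) stop("corrected must be TRUE if dist = TRUE")
  n_cat <- max(x)
  m <- nrow(x)
  out <- matrix(0, m, m)
  for (i in seq_len(m)) {
    for (j in seq_len(i)) {
      # The coefficient is symmetric, so fill both triangles at once
      out[i, j] <- out[j, i] <- pcc_pair_sketch(x[i, ], x[j, ], n_cat, corrected)
    }
  }
  if (!dist) return(out)
  # Guard against tiny negative values of 1 - out^2 caused by rounding
  one_minus_sq <- 1 - out^2
  one_minus_sq[one_minus_sq < 0] <- 0
  switch(as.character(version),
         "1" = sqrt(one_minus_sq),
         "2" = 1 - out,
         "3" = one_minus_sq,
         stop("version must be 1, 2, or 3"))
}

# Same kind of input as in the package example: 10 rows, 200 columns
mat <- matrix(sample(3, 2000, TRUE), 10)
coef_mat <- pcc_matrix_sketch(mat)
dist_mat <- pcc_matrix_sketch(mat, dist = TRUE)
dim(coef_mat)
```

The result is a symmetric square matrix with one row and one column per row of `mat`, which matches the shape described above.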

## Author(s)

Holger Schwender, holger.schwender@udo.edu

## See Also

`smc`

## Examples

```r
## Not run: 
library(scrime)

# Generate a data set consisting of 10 rows and 200 columns,
# where the values are randomly drawn from the integers 1, 2, and 3.

mat <- matrix(sample(3, 2000, TRUE), 10)

# For each pair of rows of mat, the value of the corrected Pearson's
# contingency coefficient is then obtained by

out1 <- pcc(mat)
out1

# and the distances based on this coefficient by

out2 <- pcc(mat, dist = TRUE)
out2

# Note that if version is set to 1 (default) in pcc, then

all.equal(sqrt(1 - out1^2), out2)

## End(Not run)
```