View source: R/plot_concordance.R
plot_concordance | R Documentation |
This function plots the pairwise concordance between categorical features.
plot_concordance(
  dat,
  method = "chisq",
  alpha = 0.05,
  p_adj = NULL,
  sim_p = FALSE,
  B = 1999L,
  label = FALSE,
  diag = FALSE,
  pal_tiles = "PiRdBr",
  title = "Concordance Plot",
  legend = "right",
  hover = FALSE,
  export = FALSE
)
dat |
A sample by feature data frame or matrix, e.g. of clinical variables or patient cluster assignments. All columns are converted to factors with a warning, if possible. |
method |
String specifying which measure of association to
compute. Currently supports |
alpha |
Optional significance threshold to impose on association
statistics. Those with p-values (optionally adjusted) less than or
equal to |
p_adj |
Optional p-value adjustment for multiple testing. Options
include |
sim_p |
Calculate p-values via Monte Carlo simulation? Only
relevant if |
B |
Number of replicates or permutations to sample when computing p-values. |
label |
Print association statistic over tiles? |
diag |
Include principal diagonal of the concordance matrix? Only
advisable if |
pal_tiles |
String specifying the color palette to use for heatmap
tiles. Options include the complete collection of |
title |
Optional plot title. |
legend |
Legend position. Must be one of |
hover |
Show association statistic by hovering mouse over the
corresponding tile or circle? If |
export |
Export concordance matrix? If |
Concordance plots visualize associations between categorical features. They are useful when evaluating the dependencies between clinical factors and/or patient clusters.
When method = "chisq", concordance is measured by the Pearson chi-squared
statistic. This test is based on several assumptions that may not be met in
practice (see Wikipedia for a quick overview). When one or several of these
assumptions are violated, more accurate p-values can be estimated via Monte
Carlo simulation with B replicates.
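The difference between the asymptotic and the simulated p-value can be sketched with base R's chisq.test. This is only an illustration on toy data, assuming that the Monte Carlo option behaves like chisq.test's simulate.p.value argument; it is not taken from the plot_concordance source.

```r
# Toy data: two categorical features on 20 samples (values are arbitrary).
x <- factor(rep(c("a", "b"), times = 10))
y <- factor(rep(c("u", "v", "w"), length.out = 20))

# Asymptotic test. Expected cell counts are below 5 here, so R warns that
# the chi-squared approximation may be incorrect; the warning is suppressed.
asym <- suppressWarnings(chisq.test(x, y))

# Monte Carlo test with B replicates (B = 1999 mirrors the default above).
set.seed(1)  # simulated p-values are random, so fix the seed
mc <- chisq.test(x, y, simulate.p.value = TRUE, B = 1999L)

asym$statistic  # Pearson chi-squared statistic
asym$p.value    # p-value from the chi-squared approximation
mc$p.value      # p-value estimated by simulation
```

The statistic is the same in both calls; only the way the p-value is obtained changes.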
When method = "USP", concordance is measured by the negative logarithm of the
p-value of a U-statistic permutation test (Berrett et al., 2021), which is
minimax optimal under mild conditions. The test uses B permutations.
When method = "fisher", concordance is measured by the negative logarithm of
the test's p-value. If sim_p = FALSE, then the function will attempt to
calculate an exact p-value. If this cannot be executed in the available
workspace, or if sim_p = TRUE, then p-values are estimated via Monte Carlo
simulation with B replicates.
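As a rough sketch of this measure, base R's fisher.test can return either an exact or a simulated p-value, which is then turned into a negative log score. The toy data and the use of the natural logarithm are assumptions made for illustration; the text above does not state which base the function uses.

```r
# Toy data: two strongly associated three-level features on 30 samples.
x <- factor(rep(c("a", "b", "c"), each = 10))
y <- factor(rep(c("u", "v", "w"), times = c(12, 9, 9)))

# Exact test, analogous to sim_p = FALSE.
exact <- fisher.test(x, y)

# Monte Carlo test with B replicates, analogous to sim_p = TRUE.
set.seed(1)  # simulated p-values are random, so fix the seed
mc <- fisher.test(x, y, simulate.p.value = TRUE, B = 1999L)

# Negative log p-values: larger values indicate stronger association.
# Natural log is an assumption here.
score_exact <- -log(exact$p.value)
score_mc <- -log(mc$p.value)
```

A simulated p-value can never fall below 1 / (B + 1), so the simulated score is capped at log(B + 1), whereas the exact score is not.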
When method = "MI", concordance is measured by the mutual information
statistic. If alpha is non-NULL, then p-values are estimated via permutation
testing with B permutations.
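A minimal sketch of mutual information with a permutation p-value is given below. The helpers mutual_info and perm_p are hypothetical names written for this illustration, and the plug-in estimator in nats is an assumption; the function's own estimator may differ.

```r
# Plug-in mutual information (in nats) between two categorical vectors.
# Hypothetical helper, not part of the documented package.
mutual_info <- function(x, y) {
  p_xy <- table(x, y) / length(x)   # joint distribution
  p_x <- rowSums(p_xy)              # marginal of x
  p_y <- colSums(p_xy)              # marginal of y
  nz <- p_xy > 0                    # skip empty cells to avoid log(0)
  sum(p_xy[nz] * log(p_xy[nz] / outer(p_x, p_y)[nz]))
}

# Permutation p-value: shuffle y to break any dependence on x, then count
# how often the shuffled statistic is at least as large as the observed one.
perm_p <- function(x, y, B = 1999L) {
  obs <- mutual_info(x, y)
  null <- replicate(B, mutual_info(x, sample(y)))
  (1 + sum(null >= obs)) / (B + 1)
}

# Toy usage with arbitrary data.
set.seed(1)
x <- rep(c("a", "b"), times = 10)
y <- rep(c("u", "v", "w"), length.out = 20)
mi <- mutual_info(x, y)
p <- perm_p(x, y, B = 199L)
```

Adding 1 to both numerator and denominator keeps the estimated p-value strictly positive.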
If export = TRUE, a list with up to two elements:
The concordance matrix, computed via the chosen method.
The matrix of p-values (optionally adjusted), if alpha is non-NULL.
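The shape of such a list can be sketched as follows. The helper pairwise_chisq, the element names concordance and p_values, and the use of p.adjust for the adjustment are all assumptions made for illustration; the actual exported object may be structured differently.

```r
# Hypothetical helper: pairwise chi-squared statistics and p-values for all
# column pairs of a data frame, with an optional multiple-testing adjustment.
pairwise_chisq <- function(dat, p_adj = NULL) {
  dat <- as.data.frame(lapply(dat, factor))  # coerce every column to factor
  p <- ncol(dat)
  stat <- matrix(NA_real_, p, p, dimnames = list(names(dat), names(dat)))
  pval <- stat
  for (i in seq_len(p - 1L)) {
    for (j in (i + 1L):p) {
      res <- suppressWarnings(chisq.test(dat[[i]], dat[[j]]))
      stat[i, j] <- stat[j, i] <- unname(res$statistic)
      pval[i, j] <- pval[j, i] <- res$p.value
    }
  }
  if (!is.null(p_adj)) {
    # Adjust each unique pair once, then mirror onto the lower triangle.
    upper <- upper.tri(pval)
    pval[upper] <- p.adjust(pval[upper], method = p_adj)
    pval[lower.tri(pval)] <- t(pval)[lower.tri(pval)]
  }
  list(concordance = stat, p_values = pval)
}

# Toy usage: the diagonal is left as NA, similar in spirit to diag = FALSE.
df <- data.frame(A = rep(1:2, times = 10),
                 B = rep(1:3, length.out = 20),
                 C = rep(1:4, times = 5))
out <- pairwise_chisq(df, p_adj = "BH")
out$concordance
out$p_values
```

Both matrices are symmetric, so adjusting only the upper triangle avoids counting each pair twice.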
Berrett, T.B., Kontoyiannis, I. & Samworth, R. (2021). Optimal rates for independence testing via U-statistic permutation tests. Ann. Statist.
df <- data.frame(
  A = sample.int(2, 20, replace = TRUE),
  B = sample.int(3, 20, replace = TRUE),
  C = sample.int(4, 20, replace = TRUE)
)
plot_concordance(df)