correlation | R Documentation |
Computes a correlation coefficient between a specified response variable and each of the remaining variables in a given data frame or tibble. The available correlation methods are Pearson's product-moment correlation (parametric), Spearman's rank correlation and Kendall's tau (both non-parametric), Chatterjee's new correlation coefficient, and the biweight midcorrelation (a robust correlation measure).
correlation(
  x,
  var,
  method = "pearson",
  plot = FALSE,
  color = "#111D71",
  interactive = FALSE
)
x | A data frame or tibble containing the variables of interest. |
var | A character string specifying the name of the response variable. |
method | A character string indicating the correlation method to use. Allowed values are "pearson", "spearman", "kendall", "chatterjee", or "bicor" (for biweight midcorrelation). The default is "pearson". |
plot | A logical value indicating whether to produce a visualization of the correlations. Default is FALSE (no plot). |
color | A character string specifying the color to use for the plot. Default is "#111D71". |
interactive | A logical value indicating whether to create an interactive plot using plotly. Default is FALSE (static ggplot2 plot). |
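The Pearson correlation coefficient measures the strength of the linear relationship between two continuous variables, and inference based on it is most appropriate when the data follow a bivariate normal distribution. The Spearman and Kendall correlations are rank-based, non-parametric measures of monotonic association, which makes them suitable for relationships that are monotonic but not linear and for data that deviate from normality. The Chatterjee correlation coefficient (Chatterjee, 2021) is a rank-based measure of how well the response can be written as a function of the other variable: it is close to 0 when the two variables are independent and close to 1 when the response is a noiseless function of the other variable, so it can detect non-monotonic relationships that the other coefficients miss. Because it depends on the data only through ranks, it is also insensitive to heavy tails and extreme values. Unlike the other coefficients, it is not symmetric in its two arguments. The biweight midcorrelation is a robust correlation measure that downweights the influence of outliers and is recommended when the data contain extreme values or deviate significantly from normality.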
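The differences between these measures can be illustrated with a minimal base R sketch. It does not call this package: `chatterjee_xi()` and `biweight_midcor()` are hypothetical helper names written here from the published definitions, and the package's own implementation may differ (for example in how ties or a zero median absolute deviation are handled).

```r
# Hypothetical helper: Chatterjee's xi for a response y given x.
# Assumption: ties in x are kept in their original order (the paper breaks them at random).
chatterjee_xi <- function(x, y) {
  n <- length(x)
  y_by_x <- y[order(x)]                            # y rearranged by increasing x
  r <- rank(y_by_x, ties.method = "max")           # r_i = #{j : y_j <= y_i}
  l <- n - rank(y_by_x, ties.method = "min") + 1   # l_i = #{j : y_j >= y_i}
  1 - n * sum(abs(diff(r))) / (2 * sum(l * (n - l)))
}

# Hypothetical helper: biweight midcorrelation.
# Assumption: the median absolute deviation of each variable is non-zero.
biweight_midcor <- function(x, y) {
  scaled <- function(v) {
    centred <- v - median(v)
    u <- centred / (9 * mad(v, constant = 1))      # raw MAD, no consistency factor
    w <- ifelse(abs(u) < 1, (1 - u^2)^2, 0)        # points far from the median get weight 0
    a <- centred * w
    a / sqrt(sum(a^2))
  }
  sum(scaled(x) * scaled(y))
}

# Collect the rank-based and linear coefficients for one pair of vectors.
compare <- function(x, y) {
  c(pearson    = cor(x, y, method = "pearson"),
    spearman   = cor(x, y, method = "spearman"),
    kendall    = cor(x, y, method = "kendall"),
    chatterjee = chatterjee_xi(x, y))
}

x <- 1:100
monotone <- compare(x, exp(x / 10))     # monotonic but strongly non-linear
parabola <- compare(x, (x - 50.5)^2)    # functional but not monotonic

# A single gross outlier in an otherwise perfectly linear relationship.
x_out <- 1:21
y_out <- c(1:20, -200)
outlier <- c(pearson = cor(x_out, y_out),
             bicor   = biweight_midcor(x_out, y_out))

print(round(monotone, 3))
print(round(parabola, 3))
print(round(outlier, 3))
```

In this sketch the monotone case gives Spearman and Kendall values of 1 while Pearson stays below 1, the parabola gives Pearson, Spearman and Kendall values near 0 while Chatterjee's coefficient stays high, and the single outlier flips the sign of Pearson's coefficient but leaves the biweight midcorrelation strongly positive.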
A list containing:
correlation: A tibble with columns for the variable name, correlation value, and method used.
plot: If plot = TRUE, a ggplot2 object (or a plotly object if interactive = TRUE).
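To make the shape of the `correlation` element concrete, the following base R sketch builds a comparable table for the built-in `mtcars` data. It is a stand-in, not the package's code: `correlate_with_response()` is a hypothetical name, the column names `variable`, `correlation` and `method` are assumptions, a plain data frame is used instead of a tibble, and only the methods supported by `stats::cor()` are covered.

```r
# Hypothetical stand-in: correlate one response column with every other column.
correlate_with_response <- function(x, var, method = "pearson") {
  predictors <- setdiff(names(x), var)             # all columns except the response
  values <- vapply(
    predictors,
    function(p) cor(x[[p]], x[[var]], method = method),
    numeric(1)
  )
  data.frame(
    variable    = predictors,
    correlation = unname(values),
    method      = method,
    stringsAsFactors = FALSE
  )
}

# Mimic the documented return value: a list with a `correlation` element.
res <- list(correlation = correlate_with_response(mtcars, var = "mpg"))
print(res$correlation)
```

With the real function, the equivalent call would presumably be `correlation(mtcars, var = "mpg")`, with the table read from the `correlation` element of the returned list.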
Christian L. Goueguel
Chatterjee, S. (2021). A new coefficient of correlation. Journal of the American Statistical Association, 116(536):2009-2022.
Wilcox, R. (2012). Introduction to robust estimation and hypothesis testing (3rd ed.). Academic Press. (ISBN 978-0123869838).