correlation: Correlation Analysis

Description Usage Arguments Details Value References Examples

View source: R/correlation.R

Description

Performs a correlation analysis.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
correlation(
  data,
  data2 = NULL,
  method = "pearson",
  p_adjust = "holm",
  ci = 0.95,
  bayesian = FALSE,
  bayesian_prior = "medium",
  bayesian_ci_method = "hdi",
  bayesian_test = c("pd", "rope", "bf"),
  redundant = FALSE,
  include_factors = FALSE,
  partial = FALSE,
  partial_bayesian = FALSE,
  multilevel = FALSE,
  robust = FALSE,
  ...
)

Arguments

data

A data frame.

data2

An optional data frame.

method

A character string indicating which correlation coefficient is to be used for the test. One of "pearson" (default), "kendall", or "spearman", "biserial", "polychoric", "tetrachoric", "biweight", "distance", "percentage" (for percentage bend correlation), "blomqvist" (for Blomqvist's coefficient), "hoeffding" (for Hoeffding's D), "gamma", "gaussian" (for Gaussian Rank correlation) or "shepherd" (for Shepherd's Pi correlation). Setting "auto" will attempt at selecting the most relevant method (polychoric when ordinal factors involved, tetrachoric when dichotomous factors involved, point-biserial if one dichotomous and one continuous and pearson otherwise).

p_adjust

Correction method for frequentist correlations. Can be one of "holm" (default), "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" or "none".

ci

Confidence/Credible Interval level. If "default", then it is set to 0.95 (95% CI).

bayesian

If TRUE, will run the correlations under a Bayesian framework. Note that for partial correlations, you will also need to set partial_bayesian to TRUE to obtain "full" Bayesian partial correlations. Otherwise, you will obtain pseudo-Bayesian partial correlations (i.e., Bayesian correlation based on frequentist partialization).

bayesian_prior

For the prior argument, several named values are recognized: "medium.narrow", "medium", "wide", and "ultrawide". These correspond to scale values of 1/sqrt(27), 1/3, 1/sqrt(3) and 1, respectively. See the BayesFactor::correlationBF function.

bayesian_ci_method

See arguments in model_parameters for BayesFactor tests.

bayesian_test

See arguments in model_parameters for BayesFactor tests.

redundant

Should the data include redundant rows (where each given correlation is repeated two times).

include_factors

If TRUE, the factors are kept and eventually converted to numeric or used as random effects (depending of multilevel). If FALSE, factors are removed upfront.

partial

Can be TRUE or "semi" for partial and semi-partial correlations, respectively.

partial_bayesian

If TRUE, will run the correlations under a Bayesian framework. Note that for partial correlations, you will also need to set partial_bayesian to TRUE to obtain "full" Bayesian partial correlations. Otherwise, you will obtain pseudo-Bayesian partial correlations (i.e., Bayesian correlation based on frequentist partialization).

multilevel

If TRUE, the factors are included as random factors. Else, if FALSE (default), they are included as fixed effects in the simple regression model.

robust

If TRUE, will rank-transform the variables prior to estimating the correlation. Note that, for instance, a Pearson's correlation on rank-transformed data is equivalent to a Spearman's rank correlation. Thus, using robust=TRUE and method="spearman" is redundant. Nonetheless, it is an easy way to increase the robustness of the correlation (as well as obtaining Bayesian Spearman rank Correlations).

...

Arguments passed to or from other methods.

Details

Correlation Types

Partial Correlation

Partial correlations are estimated as the correlation between two variables after adjusting for the (linear) effect of one or more other variable. The correlation test is then run after having partialized the dataset, independently from it. In other words, it considers partialization as an independent step generating a different dataset, rather than belonging to the same model. This is why some discrepancies are to be expected for the t- and p-values, CIs, BFs etc (but not the correlation coefficient) compared to other implementations (e.g., ppcor). (The size of these discrepancies depends on the number of covariates partialled-out and the strength of the linear association between all variables.) Such partial correlations can be represented as Gaussian Graphical Models (GGM), an increasingly popular tool in psychology. A GGM traditionally include a set of variables depicted as circles ("nodes"), and a set of lines that visualize relationships between them, which thickness represents the strength of association (see Bhushan et al., 2019).

Multilevel correlations are a special case of partial correlations where the variable to be adjusted for is a factor and is included as a random effect in a mixed model.

Notes

Value

A correlation object that can be displayed using the print, summary or table methods.

Multiple tests correction

About multiple tests corrections.

References

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
library(correlation)
results <- correlation(iris)

results
summary(results)
summary(results, redundant = TRUE)

# Grouped dataframe
if (require("dplyr")) {
  iris %>%
    group_by(Species) %>%
    correlation()
}

# automatic selection of correlation method
correlation(mtcars[-2], method = "auto")

correlation documentation built on Oct. 23, 2020, 6:27 p.m.