cor_long: Make a correlation matrix from long format data.

cor_long R Documentation

Description

Compute a correlation matrix from data stored in long format, optionally correlating against a second long format data frame.

Usage

cor_long(
  x,
  rows,
  cols,
  values,
  y = NULL,
  rows2 = NULL,
  cols2 = NULL,
  values2 = NULL,
  out_format = c("long", "wide"),
  method = "pearson",
  use = "everything",
  p_values = FALSE,
  p_adjust = "none",
  p_thresholds = c(`***` = 0.001, `**` = 0.01, `*` = 0.05, 1),
  p_sym_add = NULL,
  p_sym_digits = 2
)

Arguments

x

A long format data frame containing the data to correlate.

rows, cols

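The columns in x containing the row and column identifiers of the wide matrix built from x. The values in cols become the variables that are correlated, and so label the rows and columns of the resulting correlation matrix (see Details).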

values

Name of the column in x containing the values to correlate (the cells of the wide matrix built from x).

y

Optional second data frame for correlating with the data frame from x.

rows2, cols2

Optional names of columns with values for the rows and columns of a second matrix (taken from y).

values2

Optional name of the column in y containing the values of the second matrix.

out_format

Format of output correlation matrix ("long" or "wide").

method

Correlation method given to stats::cor().

use

Missing value strategy of stats::cor().

p_values

Logical indicating if p-values should be calculated.

p_adjust

String specifying the multiple testing adjustment method to use for the p-values (default is "none"). Passed to stats::p.adjust().

p_thresholds

Named numeric vector specifying p-value thresholds (in ascending order) to mark. The last element must be 1 or higher (to set the upper limit). Names must be unique, but one element can be left unnamed (by default 1 is unnamed, meaning p-values between the largest named threshold and 1 receive no symbol). If NULL, no thresholding is done and p-value intervals are not marked with symbols.

p_sym_add

String with the name of the column to add to p-value symbols from p_thresholds (one of 'values', 'p_val', 'p_adj'). NULL (default) results in just the symbols.

p_sym_digits

Number of digits to use for the column in p_sym_add.
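The interplay of p_adjust and p_thresholds can be illustrated with a small base R sketch. It is not the package's own code: the helper sym_for() and the vectors p_val and p_adj are hypothetical names, and treating each threshold as an exclusive upper bound (p < threshold) is an assumption about how interval edges are handled.

```r
# Default thresholds from the Usage section; the last element is unnamed,
# so names() returns "" for it and that interval gets no symbol.
p_thresholds <- c(`***` = 0.001, `**` = 0.01, `*` = 0.05, 1)

# Made-up raw p-values, adjusted with stats::p.adjust()
p_val <- c(0.0004, 0.008, 0.03, 0.2, 0.6)
p_adj <- stats::p.adjust(p_val, method = "bonferroni")

# Hypothetical helper: map each p-value to the name of the first threshold
# it falls below (assumption: thresholds are exclusive upper bounds).
sym_for <- function(p, thresholds) {
  idx <- findInterval(p, thresholds, rightmost.closed = TRUE) + 1
  unname(names(thresholds)[idx])
}

sym_raw <- sym_for(p_val, p_thresholds)
sym_adj <- sym_for(p_adj, p_thresholds)
data.frame(p_val, sym_raw, p_adj, sym_adj)
```

With a Bonferroni adjustment over five tests, each p-value is multiplied by five (capped at 1), so every symbol here moves down one level or disappears.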

Details

If there is only one input data frame (x), a wide matrix is constructed from x and passed to stats::cor(), resulting in a correlation matrix with the column-column correlations.
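The single data frame case can be approximated in base R as follows. This is only a sketch of the idea, not how cor_long() is implemented; pivoting with tapply() is an assumption that works when each row/column pair occurs exactly once, and long_x, wide_x and cor_mat are illustrative names.

```r
set.seed(123)
long_x <- data.frame(row = rep(letters[1:10], each = 5),
                     col = rep(LETTERS[1:5], 10),
                     val = rnorm(50))

# Pivot to wide: one matrix row per value of `row`, one column per value of `col`.
# Each (row, col) pair occurs once here, so sum() just returns that single value.
wide_x <- tapply(long_x$val, list(long_x$row, long_x$col), FUN = sum)

# stats::cor() on a single matrix correlates its columns with each other
cor_mat <- stats::cor(wide_x, method = "pearson", use = "everything")
dim(cor_mat)  # 5 5
```

The 10 values of row act as observations, and the 5 values of col are the variables that end up on both axes of the correlation matrix.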

If y is a data frame and rows2, cols2 and values2 are specified, the wide versions of x and y are correlated (stats::cor(wide_x, wide_y)) resulting in a correlation matrix with the columns of x in the rows and the columns of y in the columns.
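A rough base R equivalent of the two data frame case is sketched below. The helper to_wide() is hypothetical and the package may pivot and align the data differently. What is certain is that stats::cor(x, y) requires both matrices to have the same number of rows; checking that the row names match is an extra assumption made here.

```r
set.seed(123)
long_x <- data.frame(row = rep(letters[1:10], each = 5),
                     col = rep(LETTERS[1:5], 10),
                     val = rnorm(50))
long_y <- data.frame(rows = rep(letters[1:10], each = 10),
                     cols = rep(letters[1:10], 10),
                     values = rnorm(100))

# Hypothetical helper: pivot a long data frame to a wide matrix
# (assumes each row/column pair occurs exactly once)
to_wide <- function(d, r, c, v) {
  tapply(d[[v]], list(d[[r]], d[[c]]), FUN = sum)
}

wide_x <- to_wide(long_x, "row", "col", "val")       # 10 x 5
wide_y <- to_wide(long_y, "rows", "cols", "values")  # 10 x 10

# Both matrices must describe the same observations in the same order
stopifnot(identical(rownames(wide_x), rownames(wide_y)))

# Columns of x end up in the rows, columns of y in the columns
cor_xy <- stats::cor(wide_x, wide_y)
dim(cor_xy)  # 5 10
```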

Value

A correlation matrix (if wide format) or a long format data frame with the columns 'row', 'col', and 'value' (containing correlations).
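To make the relation between the two output formats concrete, the sketch below converts a wide correlation matrix into a long data frame using base R only. It illustrates the shape of the result under the column names listed above; the actual row order and column types returned by cor_long() are not specified here and may differ.

```r
set.seed(123)
m <- matrix(rnorm(50), nrow = 10,
            dimnames = list(letters[1:10], LETTERS[1:5]))

# Wide format: a 5 x 5 correlation matrix
cor_wide <- stats::cor(m)

# Long format: one line per (row, col) pair of the matrix
cor_long_df <- as.data.frame(as.table(cor_wide), stringsAsFactors = FALSE)
names(cor_long_df) <- c("row", "col", "value")
head(cor_long_df, 3)
```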

Examples

set.seed(123)
cor_in <- data.frame(row = rep(letters[1:10], each = 5),
                     col = rep(LETTERS[1:5], 10),
                     val = rnorm(50))
# Wide format output
corr_wide <- cor_long(cor_in, row, col, val,
                      out_format = "wide")

# Long format output
corr_long <- cor_long(cor_in, row, col, val,
                      out_format = "long")

# Correlation between two matrices
cor_in2 <- data.frame(rows = rep(letters[1:10], each = 10),
                      cols = rep(letters[1:10], 10),
                      values = rnorm(100))
corr2 <- cor_long(cor_in, row, col, val,
                  cor_in2, rows, cols, values)


ggcorrheatmap documentation built on Aug. 25, 2025, 1:11 a.m.