cor_test_pairs: Correlation Testing for Multiple Endpoints/Terms

View source: R/pairwise_comparisons.R

Description

Takes a continuous variable and a categorical variable, and calculates the Spearman, Pearson, or Kendall correlation estimate and p-value for each pair of levels of the categorical variable.

Usage

cor_test_pairs(
  x,
  pair,
  id,
  method = c("spearman", "pearson", "kendall"),
  n_distinct_value = 3,
  digits = 3,
  trailing_zeros = TRUE,
  exact = TRUE,
  seed = 68954857,
  nresample = 10000,
  verbose = FALSE,
  ...
)

Arguments

x

numeric vector (can include NA values)

pair

categorical vector which contains the levels to compare

id

vector which contains the id information used to match values of x across the levels of pair

method

character string indicating which correlation coefficient is to be used for the test ("spearman" (default), "pearson", or "kendall").

n_distinct_value

number of distinct values in x each pair must contain to be compared. The value must be >1, with a default of 3.

digits

numeric value between 0 and 14 indicating the number of digits to round the correlation estimate. The default is set to 3.

trailing_zeros

logical indicating if trailing zeros should be included in the correlation estimate (e.g. 0.100 instead of 0.1). Note that if set to TRUE, the output is a character vector.

exact

logical value indicating whether the "exact" method should be used. Ignored if method = "pearson" or if method = "spearman" and there are ties in x for either pair.

seed

numeric value used to set the seed. Only used if method = "spearman" and there are ties in x for either pair.

nresample

positive integer indicating the number of Monte Carlo replicates to use for the computation of the approximative reference distribution. The default is set to 10,000. Only used when method = "spearman" and there are ties in x for either pair.

verbose

logical variable indicating whether warnings and messages should be displayed.

...

parameters passed to stats::cor.test or coin::spearman_test
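To illustrate the digits and trailing_zeros arguments described above, here is a minimal base R sketch of why keeping trailing zeros turns the output into a character vector. It uses round and formatC for illustration only; it is not taken from the package source, and the variable names are hypothetical.

```r
# Illustration only: base R formatting, not the package's internal code.
est <- 0.1
digits <- 3

# Analogue of trailing_zeros = FALSE: numeric rounding drops the zeros
rounded_numeric <- round(est, digits)

# Analogue of trailing_zeros = TRUE: fixed formatting keeps the zeros,
# but the result is a character string rather than a number
rounded_character <- formatC(est, format = "f", digits = digits)

print(rounded_numeric)
print(rounded_character)
```

A character result prints well in tables but must be converted back with as.numeric before any further arithmetic.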

Details

The p-value is calculated using the cor_test function (see its documentation for method details).
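The exact, seed, and nresample arguments all hinge on whether ties are present when method = "spearman". The sketch below uses only base R to show why ties matter: stats::cor.test cannot compute an exact Spearman p-value with ties and warns when asked to. The helper has_ties is hypothetical and written for this illustration; the resampling approach that the arguments above describe for the tied case is not reproduced here.

```r
# Illustration only: base R, not the package's internal code.
x <- c(1, 2, 2, 4, 5)   # contains a tie (2 appears twice)
y <- c(2, 1, 4, 3, 5)   # no ties

# Hypothetical helper: TRUE if any non-missing value is repeated
has_ties <- function(v) anyDuplicated(v[!is.na(v)]) > 0

# Request an exact Spearman test and capture the warning raised
# because ties are present in x
warn_msg <- NULL
res <- withCallingHandlers(
  stats::cor.test(x, y, method = "spearman", exact = TRUE),
  warning = function(w) {
    warn_msg <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)

print(has_ties(x))
print(warn_msg)
```

When ties are detected, an exact p-value is not available, which is consistent with exact being ignored in that case.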

If a pair has fewer than n_distinct_value non-missing values, that pair will be excluded from the comparisons. If a specific comparison has fewer than n_distinct_value non-missing values to compare, the output will return an estimate and the p-value set to NA.
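The exclusion rule above can be sketched in base R. This is one possible reading of the rule, not the package's implementation: it counts distinct non-missing values per level (following the n_distinct_value argument description), builds every pairwise combination of the levels that pass, and then counts the complete pairs for a single comparison after matching on id. The toy data and all object names are hypothetical.

```r
# Illustration only: one possible reading of the rule, in base R.
long <- data.frame(
  id    = rep(1:5, times = 3),
  name  = rep(c("a", "b", "v"), each = 5),
  value = c(1, 2, 3, 4, 5,     # a: five distinct values
            2, 1, NA, 4, 3,    # b: four distinct non-missing values
            1, 1, 1, 1, 1)     # v: a single distinct value
)
n_distinct_value <- 3

# Distinct non-missing values per level of the pairing variable
n_distinct <- tapply(long$value, long$name,
                     function(v) length(unique(v[!is.na(v)])))

# Levels that pass the threshold, and every pairwise combination of them
kept <- names(n_distinct)[n_distinct >= n_distinct_value]
comparisons <- utils::combn(kept, 2, simplify = FALSE)

# For one comparison, match values by id and count complete pairs
wide <- merge(long[long$name == "a", c("id", "value")],
              long[long$name == "b", c("id", "value")],
              by = "id", suffixes = c("_a", "_b"))
n_pairs <- sum(stats::complete.cases(wide))

print(kept)
print(n_pairs)
```

Here the constant level v is dropped before any comparison is formed, and the a versus b comparison uses only the ids where both values are present.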

Value

Returns a data frame of all possible pairwise correlations between levels of pair that meet the minimum set by n_distinct_value, with the following columns:

  • Correlation - comparisons made

  • NPairs - number of non-missing pairs considered

  • Ties - whether ties are present in either variable

  • CorrEst - correlation estimates

  • CorrTest - correlation test p-value

Examples


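# Toy data: one row per id, one column per endpoint
data_in <- data.frame(
  id = 1:10,
  x = c(-2, -1, 0, 1, 2, -2, -1, 0, 1, 2),
  y = c(4, 1, NA, 1, 4, -2, -1, 0, 1, 2),
  z = c(1, 2, 3, 4, NA, -2, -1, 0, 1, 2),
  v = rep(1, 10),                   # constant: only one distinct value
  aa = c(1:5, NA, NA, NA, NA, NA),  # aa and bb never share a non-missing id
  bb = c(NA, NA, NA, NA, NA, 1:5)
)

# Reshape to long format: one row per id and endpoint
data_in_long <- tidyr::pivot_longer(data_in, -id)

cor_test_pairs(x = data_in_long$value,
               pair = data_in_long$name,
               id = data_in_long$id,
               method = 'spearman')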


# Examples with Real World Data
library(dplyr)

# BAMA Assay Data Example
data(exampleData_BAMA)

## Antigen Correlation
exampleData_BAMA %>%
  filter(visitno != 0) %>%
  group_by(group, visitno) %>%
  summarize(
    cor_test_pairs(x = magnitude, pair = antigen, id = pubID,
                   method = 'spearman', n_distinct_value = 3, digits = 1,
                   verbose = TRUE),
    .groups = 'drop'
  )


FredHutch/VISCfunctions documentation built on Oct. 14, 2024, 11:33 p.m.