dcmatrix: Calculates distance covariance and distance correlation...

dcmatrix R Documentation

Calculates distance covariance and distance correlation matrices

Description

Calculates distance covariance and distance correlation matrices

Usage

dcmatrix(
  X,
  Y = NULL,
  calc.dcov = TRUE,
  calc.dcor = TRUE,
  calc.cor = "none",
  calc.pvalue.cor = FALSE,
  return.data = TRUE,
  test = "none",
  adjustp = "none",
  b = 499,
  affine = FALSE,
  standardize = FALSE,
  bias.corr = FALSE,
  group.X = NULL,
  group.Y = NULL,
  metr.X = "euclidean",
  metr.Y = "euclidean",
  use = "all",
  algorithm = "auto",
  fc.discrete = FALSE,
  calc.dcor.pw = FALSE,
  calc.dcov.pw = FALSE,
  test.pw = "none",
  metr.pw.X = "euclidean",
  metr.pw.Y = "euclidean"
)

Arguments

X

A data.frame or matrix.

Y

Either NULL or a data.frame or a matrix with the same number of rows as X. If only X is provided, distance covariances/correlations are calculated between all groups in X. If X and Y are provided, distance covariances/correlations are calculated between all groups in X and all groups of Y.

calc.dcov

logical; specifies if the distance covariance matrix is calculated.

calc.dcor

logical; specifies if the distance correlation matrix is calculated.
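For orientation, the quantities controlled by calc.dcov and calc.dcor can be sketched in base R. This is a minimal sketch of the classical (biased) sample distance covariance and distance correlation from double-centered distance matrices, for two numeric vectors. The helper names dcov_sketch and dcor_sketch are hypothetical and not part of dcortools, whose implementation may differ.

```r
# Hypothetical helpers, not part of dcortools: classical (biased)
# sample distance covariance and correlation for two samples.
dcov_sketch <- function(x, y) {
  # Double-center a distance matrix: subtract row and column means,
  # add back the grand mean.
  center <- function(D) {
    D - outer(rowMeans(D), colMeans(D), "+") + mean(D)
  }
  A <- center(as.matrix(dist(x)))
  B <- center(as.matrix(dist(y)))
  # Guard against tiny negative values caused by rounding.
  sqrt(max(mean(A * B), 0))
}

dcor_sketch <- function(x, y) {
  dcov_sketch(x, y) / sqrt(dcov_sketch(x, x) * dcov_sketch(y, y))
}

set.seed(1)
x <- rnorm(50)
y <- x^2 + rnorm(50, sd = 0.1)  # nonlinear dependence
dcor_xy <- dcor_sketch(x, y)
```

A distance correlation of 1 is obtained for exact linear relationships such as dcor_sketch(x, 2 * x + 1).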

calc.cor

If set as "pearson", "spearman" or "kendall", a corresponding correlation matrix is additionally calculated.

calc.pvalue.cor

logical; if TRUE, p-values based on the Pearson or Spearman correlation matrix are calculated using Hmisc::rcorr (not implemented for calc.cor = "kendall").

return.data

logical; specifies if the dcmatrix object should contain the original data.

test

specifies the type of test that is performed. "permutation" performs a Monte Carlo permutation test. "gamma" performs a test based on a gamma approximation of the test statistic under the null hypothesis. "conservative" performs a conservative two-moment approximation. "bb3" performs a more precise three-moment approximation and is recommended when computation time is not an issue. With the default "none", no test is performed.

adjustp

If set to "holm", "hochberg", "hommel", "bonferroni", "BH", "BY" or "fdr", correspondingly adjusted p-values are additionally returned for the distance covariance test.
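The method names accepted by adjustp coincide with those of stats::p.adjust in base R. The following sketch shows what such an adjustment does to a vector of p-values. That dcmatrix uses p.adjust internally, and which matrix entries it adjusts jointly, are assumptions not stated here.

```r
# Illustrative p-values, e.g. from several pairwise tests.
p <- c(0.001, 0.012, 0.03, 0.2)

# Benjamini-Hochberg adjustment (controls the false discovery rate).
p_bh <- p.adjust(p, method = "BH")

# Bonferroni adjustment (controls the family-wise error rate).
p_bonf <- p.adjust(p, method = "bonferroni")
```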

b

specifies the number of random permutations used for the permutation test. Ignored for all other tests.
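A Monte Carlo permutation test with b random permutations can be sketched generically as follows. The function perm_pvalue is hypothetical, and the absolute Pearson correlation serves only as a stand-in statistic to keep the sketch short; dcmatrix uses the distance covariance, and its exact p-value formula is not stated here.

```r
# Hypothetical helper: Monte Carlo permutation p-value for any
# statistic stat(x, y) where large values indicate dependence.
perm_pvalue <- function(x, y, stat, b = 499) {
  observed <- stat(x, y)
  # Permuting y breaks any dependence between x and y.
  permuted <- replicate(b, stat(x, y[sample.int(length(y))]))
  # Counting the observed value itself keeps the p-value above zero.
  (1 + sum(permuted >= observed)) / (b + 1)
}

set.seed(1)
x <- 1:50
y <- 2 * x  # perfectly dependent on x
p_dep <- perm_pvalue(x, y, stat = function(u, v) abs(cor(u, v)), b = 499)
```

With b = 499 the smallest attainable p-value under this formula is 1/500.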

affine

logical; indicates if the affinely transformed distance covariance should be calculated or not.
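The affinely invariant version (Dueck et al., 2014) is based on samples that have been transformed by the inverse square root of their sample covariance matrix. A minimal base R sketch of that transformation follows; the helper whiten is hypothetical, and the code path used by dcmatrix may differ.

```r
# Hypothetical helper: multiply X by the inverse square root of its
# sample covariance matrix, so the result has identity covariance.
whiten <- function(X) {
  e <- eigen(cov(X), symmetric = TRUE)
  inv_sqrt <- e$vectors %*%
    diag(1 / sqrt(e$values), nrow = length(e$values)) %*%
    t(e$vectors)
  X %*% inv_sqrt
}

set.seed(1)
# Correlated two-dimensional sample with unequal scales.
X <- matrix(rnorm(200), ncol = 2) %*% matrix(c(2, 1, 0, 3), nrow = 2)
Z <- whiten(X)
cov_Z <- cov(Z)  # numerically the 2 x 2 identity matrix
```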

standardize

logical; specifies if the data should be standardized by dividing each component by its standard deviation. No effect when affine = TRUE.
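The effect of this kind of standardization can be reproduced in base R as a sketch; the variable names below are illustrative only.

```r
# Two columns on very different scales.
X <- cbind(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))

# Divide each column by its sample standard deviation.
X_std <- sweep(X, 2, apply(X, 2, sd), "/")
sds <- apply(X_std, 2, sd)

# Centering is not needed: Euclidean distances between rows are
# unchanged when a constant is subtracted from a column.
X_centered <- scale(X_std, center = TRUE, scale = FALSE)
max_diff <- max(abs(as.matrix(dist(X_std)) - as.matrix(dist(X_centered))))
```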

bias.corr

logical; specifies if the bias-corrected version of the sample distance covariance (Huo and Székely, 2016) should be calculated.
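One common form of the bias-corrected estimator replaces double centering by U-centering of the distance matrices. The sketch below assumes this form for two numeric samples with more than three observations; dcov2_unbiased is a hypothetical name, and it returns the squared distance covariance, which can be negative for weakly dependent samples.

```r
# Hypothetical helper: unbiased estimate of the squared distance
# covariance based on U-centered distance matrices.
dcov2_unbiased <- function(x, y) {
  n <- NROW(x)
  stopifnot(n > 3, NROW(y) == n)
  u_center <- function(D) {
    n <- nrow(D)
    r <- rowSums(D)  # D is symmetric, so row sums equal column sums
    U <- D - outer(r, r, "+") / (n - 2) + sum(D) / ((n - 1) * (n - 2))
    diag(U) <- 0     # diagonal entries are set to zero by definition
    U
  }
  A <- u_center(as.matrix(dist(x)))
  B <- u_center(as.matrix(dist(y)))
  sum(A * B) / (n * (n - 3))
}

set.seed(1)
x <- rnorm(30)
y <- rnorm(30)
v_indep <- dcov2_unbiased(x, y)  # close to zero, possibly negative
v_self <- dcov2_unbiased(x, x)   # strictly positive
```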

group.X

A vector, each entry specifying the group membership of the respective column in X. Each group is handled as one sample for calculating the distance covariance/correlation matrices. If NULL, every column is handled as an individual group.

group.Y

A vector, each entry specifying the group membership of the respective column in Y. Each group is handled as one sample for calculating the distance covariance/correlation matrices. If NULL, every column is handled as an individual group.
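The way a group vector partitions the columns into multivariate samples can be pictured with base R. This is only an illustration of the grouping idea with made-up data, not the internal code of dcmatrix.

```r
# Four columns, assigned to two groups of two columns each.
X <- matrix(seq_len(40), ncol = 4)
group_X <- c("g1", "g1", "g2", "g2")

# One sub-matrix per group; each is treated as a single sample.
blocks <- lapply(split(seq_len(ncol(X)), group_X),
                 function(idx) X[, idx, drop = FALSE])

# Distances are then taken between rows of each block.
D_g1 <- as.matrix(dist(blocks$g1))
```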

metr.X

Either a single metric or a list providing a metric for each group in X (see examples).

metr.Y

see metr.X.

use

"all" uses all observations, "complete.obs" excludes NAs, "pairwise.complete.obs" uses pairwise complete observations for each comparison.

algorithm

specifies the algorithm used for calculating the distance covariance.

"fast" uses an O(n log n) algorithm if the observations are one-dimensional and metr.X and metr.Y are either "euclidean" or "discrete", see also Huo and Székely (2016).

"memsave" uses a memory saving version of the standard algorithm with computational complexity O(n^2) but requiring only O(n) memory.

"standard" uses the classical algorithm. User-specified metrics always use the classical algorithm.

"auto" chooses the best algorithm for the specific setting using a rule of thumb.

"memsave" is typically very inefficient for dcmatrix and should only be applied in exceptional cases.

fc.discrete

logical; if TRUE, the "discrete" metric is applied automatically to samples of type "factor" or "character".
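Assuming the "discrete" metric is the usual 0/1 distance (0 for equal values, 1 otherwise), its distance matrix for a factor can be sketched in base R as:

```r
x <- factor(c("a", "b", "a", "c"))

# Distance is 1 where two observations differ and 0 where they agree.
labels <- as.character(x)
D <- 1 * outer(labels, labels, "!=")
```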

calc.dcor.pw

logical; If TRUE, a distance correlation matrix between the univariate observations/columns is additionally calculated. Not meaningful if group.X and group.Y are not specified.

calc.dcov.pw

logical; If TRUE, a distance covariance matrix between the univariate observations/columns is additionally calculated. Not meaningful if group.X and group.Y are not specified.

test.pw

specifies a test (see argument "test") that is performed between all univariate observations/columns.

metr.pw.X

Either a single metric or a list providing a metric for each single observation/column in X (see metr.X).

metr.pw.Y

See metr.pw.X.

Value

S3 object of class "dcmatrix" with the following components:

X, Y

original data (if return.data = TRUE).

dcov, dcor

distance covariance/correlation matrices between the groups specified in group.X/group.Y (if calc.dcov/calc.dcor = TRUE).

corr

correlation matrix between the univariate observations/columns (if calc.cor is "pearson", "spearman" or "kendall").

pvalue

matrix of p-values from the distance covariance test applied to the entries of dcov (if argument test is not "none").

pvalue.adj

matrix of p-values adjusted for multiple comparisons using the method specified in argument adjustp.

pvalue.cor

matrix of p-values based on the "pearson"/"spearman" correlation (if calc.cor is "pearson" or "spearman" and calc.pvalue.cor = TRUE).

dcov.pw, dcor.pw

distance covariance/correlation matrices between the univariate observations (if calc.dcov.pw/calc.dcor.pw = TRUE).

pvalue.pw

matrix of p-values from the distance covariance test applied to the entries of dcov.pw (if argument test.pw is not "none").

References

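Berschneider G, Böttcher B (2018). "On complex Gaussian random fields, Gaussian quadratic forms and sample distance multivariance." arXiv preprint.

Böttcher B, Keller-Ressel M, Schilling RL (2018). "Detecting independence of random vectors: generalized distance covariance and Gaussian covariance." Modern Stochastics: Theory and Applications.

Dueck J, Edelmann D, Gneiting T, Richards D (2014). "The affinely invariant distance correlation." Bernoulli.

Huang C, Huo X (2017). "A statistically and numerically efficient independence test based on random projections and distance covariance." arXiv preprint.

Huo X, Székely GJ (2016). "Fast computing for distance covariance." Technometrics.

Lyons R (2013). "Distance covariance in metric spaces." The Annals of Probability.

Sejdinovic D, Sriperumbudur B, Gretton A, Fukumizu K (2013). "Equivalence of distance-based and RKHS-based statistics in hypothesis testing." The Annals of Statistics.

Székely GJ, Rizzo ML, Bakirov NK (2007). "Measuring and testing dependence by correlation of distances." The Annals of Statistics.

Székely GJ, Rizzo ML (2009). "Brownian distance covariance." The Annals of Applied Statistics.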

Examples

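library(dcortools)

# 100 observations of 10 independent standard normal variables
X <- matrix(rnorm(1000), ncol = 10)

# Every column of X is treated as its own group
dcm <- dcmatrix(X, test = "bb3", calc.cor = "pearson",
                calc.pvalue.cor = TRUE, adjustp = "BH")

# Two groups of five columns each, plus column-wise (pairwise) results
dcm <- dcmatrix(X, test = "bb3", calc.cor = "pearson",
                calc.pvalue.cor = TRUE, adjustp = "BH",
                group.X = c(rep(1, 5), rep(2, 5)),
                calc.dcor.pw = TRUE, test.pw = "bb3")

# Second data set with the same number of rows as X
Y <- matrix(rnorm(600), ncol = 6)

# Replace the sixth column of Y by a discrete variable
Y[, 6] <- rbinom(100, 4, 0.3)

# Distance covariances/correlations between columns of X and columns of Y
dcm <- dcmatrix(X, Y, test = "bb3", calc.cor = "pearson",
                calc.pvalue.cor = TRUE, adjustp = "BH")

# Grouped columns with one metric per group, given as a named list for Y
dcm <- dcmatrix(X, Y, test = "bb3", calc.cor = "pearson",
                calc.pvalue.cor = TRUE, adjustp = "BH",
                group.X = c(rep("group1", 5), rep("group2", 5)),
                group.Y = c(rep("group1", 5), "group2"),
                metr.X = "gaussauto",
                metr.Y = list("group1" = "gaussauto", "group2" = "discrete"))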

dcortools documentation built on Dec. 8, 2022, 1:11 a.m.