dcorr: Distance Correlation

dcorr    R Documentation

Distance Correlation

Description

Estimates the distance correlation coefficient (dcorr) for a continuous predicted-observed dataset.

Usage

dcorr(data = NULL, obs, pred, tidy = FALSE, na.rm = TRUE)

Arguments

data

(Optional) argument to call an existing data frame containing the data.

obs

Vector with observed values (numeric).

pred

Vector with predicted values (numeric).

tidy

Logical argument (TRUE/FALSE) that sets the type of return. TRUE returns a data.frame; FALSE (default) returns a list.

na.rm

Logical argument to remove rows with missing values (NA). Default is na.rm = TRUE.

Details

The dcorr function is a wrapper for the dcor function from the energy-package. See Rizzo & Szekely (2022). The distance correlation (dcorr) coefficient is a novel measure of dependence between random vectors introduced by Szekely et al. (2007).

The dcorr is symmetric: exchanging the predicted and observed values gives the same coefficient, which is relevant for the predicted-observed (PO) case.

For all distributions with finite first moments, distance correlation \mathcal R generalizes the idea of correlation in two fundamental ways:

(1) \mathcal R(P,O) is defined for P and O in arbitrary dimension.

(2) \mathcal R(P,O)=0 characterizes independence of P and O.

Distance correlation satisfies 0 \le \mathcal R \le 1, and \mathcal R = 0 only if P and O are independent. Distance covariance \mathcal V provides a new approach to the problem of testing the joint independence of random vectors. The formal definitions of the population coefficients \mathcal V and \mathcal R are given in Szekely et al. (2007).

The empirical distance correlation \mathcal{R}_n(\mathbf{P,O}) is the square root of

\mathcal{R}^2_n(\mathbf{P,O})= \frac {\mathcal{V}^2_n(\mathbf{P,O})} {\sqrt{ \mathcal{V}^2_n (\mathbf{P}) \mathcal{V}^2_n(\mathbf{O})}}.

For the formula and more details, see the online documentation and the energy package.
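
The empirical formula above can be sketched directly in base R. The helper below, dcor_sketch, is a hypothetical name used for illustration only; it is not part of metrica or energy, and it assumes univariate numeric vectors of equal length with no missing values. It follows the usual construction from double-centered pairwise distance matrices, so it is meant for building intuition rather than as a replacement for energy::dcor.

```r
# Hypothetical helper (not part of metrica or energy): empirical distance
# correlation for two numeric vectors of equal length, without NAs.
dcor_sketch <- function(p, o) {
  stopifnot(is.numeric(p), is.numeric(o), length(p) == length(o))

  # Pairwise Euclidean distances, double-centered: subtract row means and
  # column means, then add back the grand mean.
  center <- function(x) {
    d <- as.matrix(dist(x))
    sweep(sweep(d, 1, rowMeans(d)), 2, colMeans(d)) + mean(d)
  }
  A <- center(p)
  B <- center(o)

  v2_po <- max(mean(A * B), 0)  # V_n^2(P, O); clamp tiny negative rounding
  v2_p  <- mean(A * A)          # V_n^2(P)
  v2_o  <- mean(B * B)          # V_n^2(O)

  denom <- sqrt(v2_p * v2_o)
  if (denom > 0) sqrt(v2_po / denom) else 0
}

set.seed(1)
P <- rnorm(n = 100, mean = 0, sd = 10)
O <- P + rnorm(n = 100, mean = 0, sd = 3)
dcor_sketch(P, O)  # same value as dcor_sketch(O, P), by symmetry

# A nonlinear relation that Pearson correlation misses
x <- seq(-1, 1, length.out = 101)
cor(x, x^2)          # approximately zero
dcor_sketch(x, x^2)  # clearly positive
```

If the energy package is installed, comparing the result against energy::dcor(P, O) is a reasonable sanity check.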

Value

An object of class numeric within a list (if tidy = FALSE) or within a data frame (if tidy = TRUE).
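
As a rough illustration of the two return types, the sketch below calls dcorr on a data frame, once with the default tidy = FALSE and once with tidy = TRUE. The data frame df and its column names predicted and observed are made up for this example, and the call is guarded so that it only runs when metrica is installed.

```r
# Illustrative data: the names `df`, `predicted` and `observed` are assumptions
set.seed(1)
df <- data.frame(predicted = rnorm(n = 100, mean = 0, sd = 10))
df$observed <- df$predicted + rnorm(n = 100, mean = 0, sd = 3)

if (requireNamespace("metrica", quietly = TRUE)) {
  # Default (tidy = FALSE): the estimate is returned inside a list
  res_list <- metrica::dcorr(data = df, obs = observed, pred = predicted)
  # tidy = TRUE: the estimate is returned inside a data frame
  res_df <- metrica::dcorr(data = df, obs = observed, pred = predicted,
                           tidy = TRUE)
  str(res_list)
  str(res_df)
}
```

The data frame form is likely the more convenient one when the estimate is to be bound together with other metrics in a table.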

References

Szekely, G.J., Rizzo, M.L., and Bakirov, N.K. (2007). Measuring and testing dependence by correlation of distances. Annals of Statistics, Vol. 35(6): 2769-2794. doi:10.1214/009053607000000505.

Rizzo, M., and Szekely, G. (2022). energy: E-Statistics: Multivariate Inference via the Energy of Data. R package version 1.7-10. https://CRAN.R-project.org/package=energy.

See Also

eval_tidy, defusing-advanced, dcor, energy

Examples


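set.seed(1)
# Simulated predictions (P) and observations (O = P plus random noise)
P <- rnorm(n = 100, mean = 0, sd = 10)
O <- P + rnorm(n = 100, mean = 0, sd = 3)
# dcorr is symmetric, so swapping obs and pred returns the same value
dcorr(obs = O, pred = P)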


metrica documentation built on June 30, 2024, 5:07 p.m.