bcov: Ball Covariance and Correlation Statistics
In Ball: Statistical Inference and Sure Independence Screening via Ball Statistics

bcor	R Documentation

Ball Covariance and Correlation Statistics

Description

Computes Ball Covariance and Ball Correlation statistics, which are generic dependence measures in Banach spaces.

Usage

bcor(x, y, distance = FALSE, weight = FALSE)

bcov(x, y, distance = FALSE, weight = FALSE)

Arguments

`x`	a numeric vector, matrix, data.frame, or a list containing at least two numeric vectors, matrices, or data.frames.
`y`	a numeric vector, matrix, or data.frame.
`distance`	if `distance = TRUE`, the elements of `x` and `y` are considered as distance matrices.
`weight`	a logical value or character string that selects the weight form of the Ball Covariance statistic. If a character string, it must be one of `"constant"`, `"probability"`, or `"chisquare"`; any unambiguous substring can be given. A logical value is interpreted as `weight = "probability"` when `TRUE` and as `weight = "constant"` when `FALSE`. Default: `weight = FALSE`.
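
The rule for `weight` can be illustrated with a small base-R sketch. The helper name `resolve_weight` is hypothetical and this is not necessarily how the package resolves the argument internally; it only mirrors the behaviour described above, using `match.arg` for the partial matching.

```r
# Hypothetical helper mirroring the documented rule for `weight`.
resolve_weight <- function(weight = FALSE) {
  choices <- c("constant", "probability", "chisquare")
  if (is.logical(weight)) {
    # TRUE -> "probability", FALSE -> "constant"
    return(if (isTRUE(weight)) "probability" else "constant")
  }
  # match.arg() accepts any unambiguous prefix of one of the choices
  # and raises an error for ambiguous or unknown strings.
  match.arg(weight, choices)
}

resolve_weight(TRUE)
resolve_weight("chisq")
```

An ambiguous prefix such as `"c"` matches both `"constant"` and `"chisquare"`, so `match.arg` rejects it with an error.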

Details

The sample sizes of the two variables must agree, and the samples must not contain missing or infinite values. If distance = TRUE, the arguments x and y can each be a dist object or a symmetric numeric matrix recording the distances between samples; otherwise, they are treated as data.
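
As a minimal sketch of the distance-based interface, the snippet below checks the stated input requirements and builds distance objects with base R's `dist`. The simulated data are purely illustrative, and the call to `bcov` is guarded so that the snippet also runs when the Ball package is not installed.

```r
set.seed(1)
x <- matrix(rnorm(40), nrow = 20)  # 20 observations in 2 dimensions
y <- rnorm(20)                     # 20 univariate observations

# Requirements stated above: equal sample sizes, no missing or infinite values.
stopifnot(nrow(x) == length(y), all(is.finite(x)), all(is.finite(y)))

dx <- dist(x)  # "dist" object (Euclidean distance by default)
dy <- dist(y)
dy_mat <- as.matrix(dy)  # equivalent symmetric numeric matrix

if (requireNamespace("Ball", quietly = TRUE)) {
  # With distance = TRUE the inputs are treated as distances, not raw data.
  Ball::bcov(dx, dy, distance = TRUE)
}
```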

bcov and bcor compute Ball Covariance and Ball Correlation statistics.

The Ball Covariance statistic is a generic dependence measure in Banach spaces. It enjoys the following properties:

It is nonnegative, and it is equal to zero if and only if the variables are unassociated;
It is highly robust;
It is distribution-free and model-free;
The HHG (Heller-Heller-Gorfine) statistic is a special case of the Ball Covariance statistic.

The Ball Correlation statistic, a normalized version of the Ball Covariance statistic, generalizes the Pearson correlation in two fundamental ways:

It is well defined for random variables of arbitrary dimension in Banach spaces;
A BCor of zero implies that the random variables are unassociated.

The Ball Covariance and Ball Correlation statistics between two random variables are defined as follows. Suppose we are given n pairs of independent observations \{(x_1, y_1), \ldots, (x_n, y_n)\}, where x_i and y_i can be of any dimension, and the dimensions of x_i and y_i need not be the same. We then define the sample Ball Covariance as:

\mathbf{BCov}_{\omega,n}^{2}(\mathbf{X},\mathbf{Y})=\frac{1}{n^{2}}\sum_{i,j=1}^{n}{(\Delta_{ij,n}^{X,Y}-\Delta_{ij,n}^{X}\Delta_{ij,n}^{Y})^{2}}

where:

\Delta_{ij,n}^{X,Y}=\frac{1}{n}\sum_{k=1}^{n}{\delta_{ij,k}^{X} \delta_{ij,k}^{Y}}, \Delta_{ij,n}^{X}=\frac{1}{n}\sum_{k=1}^{n}{\delta_{ij,k}^{X}}, \Delta_{ij,n}^{Y}=\frac{1}{n}\sum_{k=1}^{n}{\delta_{ij,k}^{Y}}

\delta_{ij,k}^{X} = I(x_{k} \in \bar{B}(x_{i}, \rho(x_{i}, x_{j}))), \delta_{ij,k}^{Y} = I(y_{k} \in \bar{B}(y_{i}, \rho(y_{i}, y_{j})))

Here, I(\cdot) is the indicator function, \rho is the distance, and \bar{B}(x_{i}, \rho(x_{i}, x_{j})) is the closed ball with center x_{i} and radius \rho(x_{i}, x_{j}). Similarly, we can define \mathbf{BCov}_{\omega,n}^2(\mathbf{X},\mathbf{X}) and \mathbf{BCov}_{\omega,n}^2(\mathbf{Y},\mathbf{Y}). The Ball Correlation statistic is then defined as:

\mathbf{BCor}_{\omega,n}^2(\mathbf{X},\mathbf{Y})= \mathbf{BCov}_{\omega,n}^2(\mathbf{X},\mathbf{Y})/\sqrt{\mathbf{BCov}_{\omega,n}^2(\mathbf{X},\mathbf{X})\mathbf{BCov}_{\omega,n}^2(\mathbf{Y},\mathbf{Y})}
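
The definitions above can be transcribed almost literally into base R. The following is a naive O(n^3) sketch that assumes constant weights and precomputed distance matrices; the function names `naive_bcov2` and `naive_bcor2` are hypothetical, and this is not the package's implementation, so its values need not agree exactly with those returned by `bcov` and `bcor`.

```r
# Naive sketch of the sample Ball Covariance (constant weight assumed).
# `dx` and `dy` are n x n distance matrices for the x and y samples.
naive_bcov2 <- function(dx, dy) {
  n <- nrow(dx)
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # delta_{ij,k}: is observation k inside the closed ball centred at
      # observation i with radius rho(i, j)?
      in_x <- dx[i, ] <= dx[i, j]
      in_y <- dy[i, ] <= dy[i, j]
      joint <- mean(in_x & in_y)  # Delta_{ij,n}^{X,Y}
      total <- total + (joint - mean(in_x) * mean(in_y))^2
    }
  }
  total / n^2
}

# Normalized version, following the Ball Correlation formula above.
naive_bcor2 <- function(dx, dy) {
  naive_bcov2(dx, dy) / sqrt(naive_bcov2(dx, dx) * naive_bcov2(dy, dy))
}

set.seed(1)
x <- rnorm(30)
y <- rnorm(30)
dx <- as.matrix(dist(x))
dy <- as.matrix(dist(y))

bcov_xy <- naive_bcov2(dx, dy)  # nonnegative by construction
bcor_xx <- naive_bcor2(dx, dx)  # a sample compared with itself
```

Because every ball membership check scans all n observations for each of the n^2 pairs (i, j), this sketch is only practical for small samples.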

We can extend \mathbf{BCov}_{\omega,n} to measure the mutual independence among K random variables:

\frac{1}{n^{2}}\sum_{i,j=1}^{n}{\left[ (\Delta_{ij,n}^{X_{1}, ..., X_{K}}-\prod_{k=1}^{K}\Delta_{ij,n}^{X_{k}})^{2}\prod_{k=1}^{K}{\hat{\omega}_{k}(X_{ki},X_{kj})} \right]}

where X_{k} (k=1,\ldots,K) are random variables and X_{ki} is the i-th observation of X_{k}.
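
The K-variable statistic can be sketched in the same naive way. The snippet below assumes constant weights, that is, every \hat{\omega}_{k} term is taken to be 1, which is an assumption made for illustration only; the helper name `naive_bcov2_mutual` is hypothetical and the code is not the package's implementation.

```r
# Naive sketch of the K-variable statistic above, assuming all weights are 1.
# `dlist` is a list of K distance matrices, each n x n.
naive_bcov2_mutual <- function(dlist) {
  n <- nrow(dlist[[1]])
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # One logical vector per variable: membership of every observation in
      # the closed ball centred at observation i with radius rho(i, j).
      inside <- lapply(dlist, function(d) d[i, ] <= d[i, j])
      joint <- mean(Reduce(`&`, inside))                   # joint Delta
      marginals <- prod(vapply(inside, mean, numeric(1)))  # product of Deltas
      total <- total + (joint - marginals)^2
    }
  }
  total / n^2
}

set.seed(1)
dlist <- lapply(1:3, function(k) as.matrix(dist(rnorm(25))))
stat3 <- naive_bcov2_mutual(dlist)
```

With K = 2 the expression inside the loop reduces to the two-variable formula given earlier.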

See bcov.test for a test of independence based on the Ball Covariance statistic.

Value

`bcor`	Ball Correlation statistic.
`bcov`	Ball Covariance statistic.

References

Wenliang Pan, Xueqin Wang, Heping Zhang, Hongtu Zhu & Jin Zhu (2019) Ball Covariance: A Generic Measure of Dependence in Banach Space, Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1543600

Wenliang Pan, Xueqin Wang, Weinan Xiao & Hongtu Zhu (2018) A Generic Sure Independence Screening Procedure, Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1462709

Jin Zhu, Wenliang Pan, Wei Zheng & Xueqin Wang (2021) Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces, Journal of Statistical Software, 97(6), DOI: 10.18637/jss.v097.i06

Examples

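library(Ball)
set.seed(1)  # make the simulated data reproducible

############# Ball Correlation #############
num <- 50
x <- 1:num
y <- 1:num
# x and y are identical, so they are perfectly dependent
bcor(x, y)
bcor(x, y, weight = "prob")   # partial match for "probability"
bcor(x, y, weight = "chisq")  # partial match for "chisquare"
############# Ball Covariance #############
num <- 50
x <- rnorm(num)
y <- rnorm(num)
# x and y are drawn independently of each other
bcov(x, y)
bcov(x, y, weight = "prob")
bcov(x, y, weight = "chisq")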

Ball documentation built on Feb. 16, 2023, 7:50 p.m.