bcor | R Documentation |
Computes Ball Covariance and Ball Correlation statistics, which are generic dependence measures in Banach spaces.
bcor(x, y, distance = FALSE, weight = FALSE)
bcov(x, y, distance = FALSE, weight = FALSE)
x |
a numeric vector, matrix, data.frame, or a list containing at least two numeric vectors, matrices, or data.frames. |
y |
a numeric vector, matrix, or data.frame. |
distance |
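if TRUE, x and y are treated as dist objects or distance matrices; otherwise they are treated as data. Default: FALSE. |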
weight |
a logical or character string used to choose the weight form of the Ball Covariance statistic.
If input is a character string, it must be one of |
The sample sizes of the two variables must agree, and the samples must not contain missing or infinite values.
If distance = TRUE, the arguments x and y can be dist objects or symmetric numeric matrices recording the distances between samples; otherwise, these arguments are treated as data.
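As a rough illustration of the distance = TRUE mode described above, the sketch below builds dist objects with base R and passes them to bcor. The simulated data and the variable names (dx, dy) are made up for illustration, and the calls are guarded so the snippet still runs when the Ball package is not installed.

```r
# Hypothetical data: x is bivariate, y depends nonlinearly on the first column of x.
set.seed(1)
n <- 30
x <- matrix(rnorm(n * 2), ncol = 2)
y <- x[, 1]^2 + rnorm(n, sd = 0.1)

# Precompute pairwise Euclidean distances with base R.
dx <- dist(x)
dy <- dist(y)

# Only call into Ball when it is available.
if (requireNamespace("Ball", quietly = TRUE)) {
  # dist objects are passed directly when distance = TRUE.
  print(Ball::bcor(dx, dy, distance = TRUE))
  # A symmetric numeric distance matrix is the other accepted form.
  print(Ball::bcor(as.matrix(dx), as.matrix(dy), distance = TRUE))
}
```

Precomputing distances this way is mostly useful when the samples live in a space where a custom metric is more natural than the Euclidean one.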
bcov and bcor compute the Ball Covariance and Ball Correlation statistics, respectively.
The Ball Covariance statistic is a generic dependence measure in Banach spaces. It enjoys the following properties:
It is nonnegative, and it is equal to zero if and only if the variables are unassociated;
It is highly robust;
It is distribution-free and model-free;
Notably, the HHG is a special case of the Ball Covariance statistic.
The Ball Correlation statistic, a normalized version of the Ball Covariance statistic, generalizes Pearson correlation in two fundamental ways:
It is well-defined for random variables of arbitrary dimension in Banach spaces;
A BCor equal to zero implies that the random variables are unassociated.
The Ball Covariance and Ball Correlation statistics between two random variables are defined as follows. Suppose we are given n pairs of independent observations \{(x_1, y_1), \ldots, (x_n, y_n)\}, where x_i and y_i can be of any dimension and the dimensions of x_i and y_i need not be the same. The sample version of the Ball Covariance is then defined as:
\mathbf{BCov}_{\omega,n}^{2}(X, Y) = \frac{1}{n^{2}} \sum_{i,j=1}^{n} \left( \Delta_{ij,n}^{X,Y} - \Delta_{ij,n}^{X} \Delta_{ij,n}^{Y} \right)^{2}
where:
\Delta_{ij,n}^{X,Y} = \frac{1}{n} \sum_{k=1}^{n} \delta_{ij,k}^{X} \delta_{ij,k}^{Y}, \quad \Delta_{ij,n}^{X} = \frac{1}{n} \sum_{k=1}^{n} \delta_{ij,k}^{X}, \quad \Delta_{ij,n}^{Y} = \frac{1}{n} \sum_{k=1}^{n} \delta_{ij,k}^{Y}
\delta_{ij,k}^{X} = I(x_{k} \in \bar{B}(x_{i}, \rho(x_{i}, x_{j}))), \quad \delta_{ij,k}^{Y} = I(y_{k} \in \bar{B}(y_{i}, \rho(y_{i}, y_{j})))
Here, \bar{B}(x_{i}, \rho(x_{i}, x_{j})) is the closed ball with center x_{i} and radius \rho(x_{i}, x_{j}). \mathbf{BCov}_{\omega,n}^{2}(\mathbf{X}, \mathbf{X}) and \mathbf{BCov}_{\omega,n}^{2}(\mathbf{Y}, \mathbf{Y}) are defined similarly. The Ball Correlation statistic is defined as:
\mathbf{BCor}_{\omega,n}^{2}(\mathbf{X}, \mathbf{Y}) = \mathbf{BCov}_{\omega,n}^{2}(\mathbf{X}, \mathbf{Y}) / \sqrt{\mathbf{BCov}_{\omega,n}^{2}(\mathbf{X}, \mathbf{X}) \, \mathbf{BCov}_{\omega,n}^{2}(\mathbf{Y}, \mathbf{Y})}
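The sample definitions above can be sketched directly in base R. This is a minimal O(n^3) illustration, not the package's implementation: the function names ball_cov_sq and ball_cor_sq are hypothetical, the metric is assumed to be Euclidean (via dist), and no weight is applied, so the values need not match what bcov and bcor return.

```r
# Squared sample Ball Covariance, written term by term from the definition.
# x, y: numeric vectors or matrices with one observation per row.
# Assumption: rho is the Euclidean distance; no weight function is used.
ball_cov_sq <- function(x, y) {
  dx <- as.matrix(dist(x))  # dx[i, j] = rho(x_i, x_j)
  dy <- as.matrix(dist(y))
  n <- nrow(dx)
  stopifnot(nrow(dy) == n)
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # delta_{ij,k}: is observation k inside the closed ball centered at
      # observation i with radius rho(i, j)? Evaluated for all k at once.
      in_x <- dx[i, ] <= dx[i, j]
      in_y <- dy[i, ] <= dy[i, j]
      # (Delta^{X,Y} - Delta^X * Delta^Y)^2
      total <- total + (mean(in_x & in_y) - mean(in_x) * mean(in_y))^2
    }
  }
  total / n^2
}

# Squared sample Ball Correlation: covariance normalized by the two
# self-covariances, as in the formula above.
ball_cor_sq <- function(x, y) {
  ball_cov_sq(x, y) / sqrt(ball_cov_sq(x, x) * ball_cov_sq(y, y))
}

set.seed(2)
x <- rnorm(20)
y <- x^2 + rnorm(20, sd = 0.1)
ball_cov_sq(x, y)
ball_cor_sq(x, y)
```

The triple loop (two explicit, one vectorized over k) mirrors the formula for readability; it is only practical for small n.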
We can extend \mathbf{BCov}_{\omega,n} to measure the mutual independence between K random variables:
\frac{1}{n^{2}} \sum_{i,j=1}^{n} \left[ \left( \Delta_{ij,n}^{X_{1}, \ldots, X_{K}} - \prod_{k=1}^{K} \Delta_{ij,n}^{X_{k}} \right)^{2} \prod_{k=1}^{K} \hat{\omega}_{k}(X_{ki}, X_{kj}) \right]
where X_{k} (k = 1, \ldots, K) are random variables and X_{ki} is the i-th observation of X_{k}.
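The K-variable statistic can be sketched in the same spirit. Several things here are assumptions rather than statements from this page: the joint term \Delta_{ij,n}^{X_1, \ldots, X_K} is taken to be the proportion of observations that fall inside all K balls at once (by analogy with the two-variable case), every weight \hat{\omega}_{k} is set to 1, the metric is Euclidean, and the function name ball_cov_mutual_sq is hypothetical.

```r
# Mutual-independence Ball Covariance for a list of K samples.
# xs: list of numeric vectors or matrices, each with n observations (rows).
# Assumptions: Euclidean metric, constant weights (omega_k = 1), and a joint
# term defined as the share of points lying in all K balls simultaneously.
ball_cov_mutual_sq <- function(xs) {
  ds <- lapply(xs, function(x) as.matrix(dist(x)))
  n <- nrow(ds[[1]])
  stopifnot(all(vapply(ds, nrow, integer(1)) == n))
  total <- 0
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # n x K logical matrix: column k holds delta_{ij,t}^{X_k} for t = 1..n.
      ind <- vapply(ds, function(d) d[i, ] <= d[i, j], logical(n))
      joint <- mean(rowSums(ind) == ncol(ind))  # inside every ball
      marginal <- prod(colMeans(ind))           # product of marginal shares
      total <- total + (joint - marginal)^2
    }
  }
  total / n^2
}

set.seed(3)
z <- rnorm(25)
ball_cov_mutual_sq(list(z, z^2, rnorm(25)))
```

With K = 2 and unit weights, this reduces to the two-variable sample formula given earlier.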
See bcov.test for a test of independence based on the Ball Covariance statistic.
|
Ball Correlation statistic. |
|
Ball Covariance statistic. |
Wenliang Pan, Xueqin Wang, Heping Zhang, Hongtu Zhu & Jin Zhu (2019) Ball Covariance: A Generic Measure of Dependence in Banach Space, Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1543600
Wenliang Pan, Xueqin Wang, Weinan Xiao & Hongtu Zhu (2018) A Generic Sure Independence Screening Procedure, Journal of the American Statistical Association, DOI: 10.1080/01621459.2018.1462709
Jin Zhu, Wenliang Pan, Wei Zheng, and Xueqin Wang (2021). Ball: An R Package for Detecting Distribution Difference and Association in Metric Spaces, Journal of Statistical Software, Vol.97(6), doi: 10.18637/jss.v097.i06.
bcov.test, bcorsis
############# Ball Correlation #############
num <- 50
x <- 1:num
y <- 1:num
bcor(x, y)
bcor(x, y, weight = "prob")
bcor(x, y, weight = "chisq")
############# Ball Covariance #############
num <- 50
x <- rnorm(num)
y <- rnorm(num)
bcov(x, y)
bcov(x, y, weight = "prob")
bcov(x, y, weight = "chisq")