HellCor: The Hellinger Correlation

Description Usage Arguments Details Value Author(s) References Examples

View source: R/HellCor.R

Description

Empirical value of the Hellinger correlation between two continuous random variables X and Y.

Usage

1
2
3
HellCor(x, y, Kmax = 20L, Lmax = 20L, K = 0L, L = 0L,
        alpha = 6.0, pval.comp = FALSE, conf.level = NULL,
        B1 = 200, B2 = 200, C.version = TRUE)

Arguments

x

Sample of X-values.

y

Sample of Y-values.

Kmax

Maximal number of terms to consider in the expansion of the square root of the copula density in the basis of Legendre polynomials.

Lmax

Maximal number of terms to consider in the expansion of the square root of the copula density in the basis of Legendre polynomials.

K

If K and L are not equal to zero, and Kmax = Lmax = 0, the number of terms is set to K in the expansion of the square root of the copula density in the basis of Legendre polynomials.

L

If K and L are not equal to zero, and Kmax = Lmax = 0, the number of terms is set to L in the expansion of the square root of the copula density in the basis of Legendre polynomials.

alpha

Parameter of the Beta(alpha,alpha) distribution used to fix boundary issues through transformation when estimating the Hellinger correlation.

pval.comp

Logical. Should we compute the p-value?

conf.level

If not NULL, a number in (0, 1) that sets the confidence level for the computed confidence interval.

B1

Numeric. Number of Bootstrap replicates for the outer loop in the double bootstrap procedure used to obtain the confidence interval. Only used if conf.level is not NULL. For some examples, one might need to increase this value to 1,000.

B2

Numeric. Number of Bootstrap replicates for the inner loop in the double bootstrap procedure used to obtain the confidence interval. Only used if conf.level is not NULL.

C.version

Logical. If FALSE, the R version is used instead of the C version. Not really useful to change to FALSE unless one wants to understand the algorithm used.

Details

When Kmax = Lmax = K = L = 0, the value returned in Hcor is the unnormalized version of the empirical Hellinger correlation.

Value

List with the following components:

Hcor

Value of the empirical Hellinger correlation.

p.value

The p-value associated to the null hypothesis that the Hellinger correlation is equal to 0 (computed using a Monte-Carlo simulation). Will be NA if pval.comp is not set to TRUE.

conf.int

Confidence interval for the population Hellinger correlation (computed using a double Bootstrap). Will be NA if conf.level is not set to a value in (0,1).

Khat

Value of K selected by cross-validation, if computed, otherwise NA.

Lhat

Value of L selected by cross-validation, if computed, otherwise NA.

Author(s)

Geenens G., Lafaye de Micheaux P.

References

Geenens G., Lafaye de Micheaux P., (2018). The Hellinger correlation. (), –.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
# We illustrate the application of our new measure using data extracted
#  from a study on the coral reef published in
#  [Graham, N.A.J. and Wilson, S.K. and Carr, P. and Hoey, A.S. and
#   Jennings, S. and MacNeil, M.A. (2018), Seabirds enhance coral reef
#   productivity and functioning in the absence of invasive rats, Nature,
#   559, 250--253.]. 
# The two variables we consider are the number of fishes and the nitrogen 
# input by seabirds per hectare recorded on n = 12 islands in the Chagos
# Archipelago. Nitrogen input is an indirect measure of the abundance of
# seabirds. It is worthwhile to notice that nitrogen is absorbed by algae 
# and that herbivorous fishes eat these algae. Fishes and birds living in 
# two different worlds, it might seem odd to suspect a dependence between 
# these two variables. Since our measure is tailored to capture dependence 
# induced by a lurking variable, we can suspect the existence of such a
# hidden variable. This is indeed what researchers in (Graham et al., 2018) 
# were able to show by finding that the presence and abundance of rats on 
# an island had dramatic effects on the number of seabirds (the rats eating 
# their eggs) and consequently on the input of nitrogen in seabirds' guano. 
# In turn, this would diminish the abundance of algae and fishes eating
# these algae.

data(Chagos)
n <- nrow(Chagos)

par.save <- par()$mfrow
par(mfrow = c(1, 2))
plot(Chagos$Seabirds_ha, Chagos$Number_of_fishes, main = "Original data",
     xlab = "Density of seabirds", ylab = "Density of fishes")
plot(rank(Chagos$Seabirds_ha) / (n + 1), rank(Chagos$Number_of_fishes) /
     (n + 1), main = "Rank-Copula transformed data",
     xlab = "Density of seabirds", ylab = "Density of fishes")
par(mfrow = par.save)

set.seed(1)
# Empirical Hellinger correlation
HellCor(Chagos$Seabirds_ha, Chagos$Number_of_fishes, pval.comp = TRUE)
# Pearson correlation
cor.test(Chagos$Seabirds_ha, Chagos$Number_of_fishes)
# Distance correlation
dcor.test(Chagos$Seabirds_ha, Chagos$Number_of_fishes, R = 200)


set.seed(1)
# Empirical Hellinger correlation
HellCor(Chagos$kg_N_ha_yr, Chagos$Seabirds_ha, pval.comp = TRUE)
# Pearson correlation
cor.test(Chagos$kg_N_ha_yr, Chagos$Seabirds_ha)
# Distance correlation
dcor.test(Chagos$kg_N_ha_yr, Chagos$Seabirds_ha, R = 200)

set.seed(1)
# Empirical Hellinger correlation
HellCor(Chagos$kg_N_ha_yr, Chagos$Number_of_fishes, pval.comp = TRUE)
# Pearson correlation
cor.test(Chagos$kg_N_ha_yr, Chagos$Number_of_fishes)
# Distance correlation
dcor.test(Chagos$kg_N_ha_yr, Chagos$Number_of_fishes, R = 200)

t.test(Chagos$kg_N_ha_yr ~ Chagos$Treatment)

######################################################################
# Geenens G., Lafaye de Micheaux P., (2020). The Hellinger correlation
######################################################################
# Figure 5.2

## Not run: 
n <- 500

set.seed(1)
par(mfrow = c(3, 5))

XX <- .datagenW.corrected(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(W~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenDiamond(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Diamond~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenParabola.corrected(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Parabola~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagen2Parabolas.corrected(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat<-HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Two~~parabolae~~-~~hat(eta)==etahat, list(etahat
= round(etahat, 3))))

XX <- .datagenCircle.corrected(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Circle~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagen4indclouds(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(4~~clouds~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenCubic(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Cubic~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenSine(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Sine~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenWedge(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Wedge~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenCross(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Cross~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenSpiral(n, sigma = 0.05)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Spiral~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagen4Circles(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Circles~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenHeavisine(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Heavisine~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagenDoppler(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat<-HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(Doppler~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

XX <- .datagen5Clouds(n)
plot(XX, xlab = expression(X[1]), ylab = expression(X[2]))
etahat <- HellCor(XX[,1], XX[,2])$Hcor
title(main = substitute(5~~clouds~~-~~hat(eta)==etahat, list(etahat =
round(etahat, 3))))

## End(Not run)

############################
# Birth rate vs Death rate #
############################

data(worlddemographics)
x <- wdemographics$Death.Rate.Pop
y <- wdemographics$Birth.Rate.Pop
plot(x, y, xlab = "DEATHS/1,000 POPULATION", ylab =
"BIRTHS/1,000 POPULATION", main =
"Birth rate vs Death rate for 229 countries and territories (in 2020)", col = "orangered", pch = 20)
text(x, y, labels = wdemographics$Country, pos = 3, cex = 0.4, offset = 0.2)
## Not run: 
 HellCor(x, y, pval.comp = TRUE)
 cor.test(x, y) # Pearson
 dcor.test(x, y, R = 200)

## End(Not run)

HellCor documentation built on July 2, 2020, 3:57 a.m.

Related to HellCor in HellCor...