# HellCor: The Hellinger Correlation In HellCor: The Hellinger Correlation

## Description

Empirical value of the Hellinger correlation between two continuous random variables X and Y.

## Usage

 ```1 2 3``` ```HellCor(x, y, Kmax = 20L, Lmax = 20L, K = 0L, L = 0L, alpha = 6.0, pval.comp = FALSE, conf.level = NULL, B1 = 200, B2 = 200, C.version = TRUE) ```

## Arguments

 `x` Sample of X-values. `y` Sample of Y-values. `Kmax` Maximal number of terms to consider in the expansion of the square root of the copula density in the basis of Legendre polynomials. `Lmax` Maximal number of terms to consider in the expansion of the square root of the copula density in the basis of Legendre polynomials. `K` If K and L are not equal to zero, and Kmax = Lmax = 0, the number of terms is set to K in the expansion of the square root of the copula density in the basis of Legendre polynomials. `L` If K and L are not equal to zero, and Kmax = Lmax = 0, the number of terms is set to L in the expansion of the square root of the copula density in the basis of Legendre polynomials. `alpha` Parameter of the Beta(alpha,alpha) distribution used to fix boundary issues through transformation when estimating the Hellinger correlation. `pval.comp` Logical. Should we compute the p-value? `conf.level` If not `NULL`, a number in (0, 1) that sets the confidence level for the computed confidence interval. `B1` Numeric. Number of Bootstrap replicates for the outer loop in the double bootstrap procedure used to obtain the confidence interval. Only used if `conf.level` is not `NULL`. For some examples, one might need to increase this value to 1,000. `B2` Numeric. Number of Bootstrap replicates for the inner loop in the double bootstrap procedure used to obtain the confidence interval. Only used if `conf.level` is not `NULL`. `C.version` Logical. If `FALSE`, the R version is used instead of the C version. Not really useful to change to `FALSE` unless one wants to understand the algorithm used.

## Details

When `Kmax = Lmax = K = L = 0`, the value returned in `Hcor` is the unnormalized version of the empirical Hellinger correlation.

## Value

List with the following components:

 `Hcor` Value of the empirical Hellinger correlation. `p.value` The p-value associated to the null hypothesis that the Hellinger correlation is equal to 0 (computed using a Monte-Carlo simulation). Will be `NA` if `pval.comp` is not set to `TRUE`. `conf.int` Confidence interval for the population Hellinger correlation (computed using a double Bootstrap). Will be `NA` if `conf.level` is not set to a value in (0,1). `Khat` Value of K selected by cross-validation, if computed, otherwise `NA`. `Lhat` Value of L selected by cross-validation, if computed, otherwise `NA`.

## Author(s)

Geenens G., Lafaye de Micheaux P.

## References

Geenens G., Lafaye de Micheaux P., (2018). The Hellinger correlation. (), –.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180``` ```# We illustrate the application of our new measure using data extracted # from a study on the coral reef published in # [Graham, N.A.J. and Wilson, S.K. and Carr, P. and Hoey, A.S. and # Jennings, S. and MacNeil, M.A. (2018), Seabirds enhance coral reef # productivity and functioning in the absence of invasive rats, Nature, # 559, 250--253.]. # The two variables we consider are the number of fishes and the nitrogen # input by seabirds per hectare recorded on n = 12 islands in the Chagos # Archipelago. Nitrogen input is an indirect measure of the abundance of # seabirds. It is worthwhile to notice that nitrogen is absorbed by algae # and that herbivorous fishes eat these algae. Fishes and birds living in # two different worlds, it might seem odd to suspect a dependence between # these two variables. Since our measure is tailored to capture dependence # induced by a lurking variable, we can suspect the existence of such a # hidden variable. This is indeed what researchers in (Graham et al., 2018) # were able to show by finding that the presence and abundance of rats on # an island had dramatic effects on the number of seabirds (the rats eating # their eggs) and consequently on the input of nitrogen in seabirds' guano. # In turn, this would diminish the abundance of algae and fishes eating # these algae. data(Chagos) n <- nrow(Chagos) par.save <- par()\$mfrow par(mfrow = c(1, 2)) plot(Chagos\$Seabirds_ha, Chagos\$Number_of_fishes, main = "Original data", xlab = "Density of seabirds", ylab = "Density of fishes") plot(rank(Chagos\$Seabirds_ha) / (n + 1), rank(Chagos\$Number_of_fishes) / (n + 1), main = "Rank-Copula transformed data", xlab = "Density of seabirds", ylab = "Density of fishes") par(mfrow = par.save) set.seed(1) # Empirical Hellinger correlation HellCor(Chagos\$Seabirds_ha, Chagos\$Number_of_fishes, pval.comp = TRUE) # Pearson correlation cor.test(Chagos\$Seabirds_ha, Chagos\$Number_of_fishes) # Distance correlation dcor.test(Chagos\$Seabirds_ha, Chagos\$Number_of_fishes, R = 200) set.seed(1) # Empirical Hellinger correlation HellCor(Chagos\$kg_N_ha_yr, Chagos\$Seabirds_ha, pval.comp = TRUE) # Pearson correlation cor.test(Chagos\$kg_N_ha_yr, Chagos\$Seabirds_ha) # Distance correlation dcor.test(Chagos\$kg_N_ha_yr, Chagos\$Seabirds_ha, R = 200) set.seed(1) # Empirical Hellinger correlation HellCor(Chagos\$kg_N_ha_yr, Chagos\$Number_of_fishes, pval.comp = TRUE) # Pearson correlation cor.test(Chagos\$kg_N_ha_yr, Chagos\$Number_of_fishes) # Distance correlation dcor.test(Chagos\$kg_N_ha_yr, Chagos\$Number_of_fishes, R = 200) t.test(Chagos\$kg_N_ha_yr ~ Chagos\$Treatment) ###################################################################### # Geenens G., Lafaye de Micheaux P., (2020). The Hellinger correlation ###################################################################### # Figure 5.2 ## Not run: n <- 500 set.seed(1) par(mfrow = c(3, 5)) XX <- .datagenW.corrected(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(W~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenDiamond(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Diamond~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenParabola.corrected(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Parabola~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagen2Parabolas.corrected(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat<-HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Two~~parabolae~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenCircle.corrected(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Circle~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagen4indclouds(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(4~~clouds~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenCubic(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Cubic~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenSine(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Sine~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenWedge(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Wedge~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenCross(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Cross~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenSpiral(n, sigma = 0.05) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Spiral~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagen4Circles(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Circles~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenHeavisine(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Heavisine~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagenDoppler(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat<-HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(Doppler~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) XX <- .datagen5Clouds(n) plot(XX, xlab = expression(X[1]), ylab = expression(X[2])) etahat <- HellCor(XX[,1], XX[,2])\$Hcor title(main = substitute(5~~clouds~~-~~hat(eta)==etahat, list(etahat = round(etahat, 3)))) ## End(Not run) ############################ # Birth rate vs Death rate # ############################ data(worlddemographics) x <- wdemographics\$Death.Rate.Pop y <- wdemographics\$Birth.Rate.Pop plot(x, y, xlab = "DEATHS/1,000 POPULATION", ylab = "BIRTHS/1,000 POPULATION", main = "Birth rate vs Death rate for 229 countries and territories (in 2020)", col = "orangered", pch = 20) text(x, y, labels = wdemographics\$Country, pos = 3, cex = 0.4, offset = 0.2) ## Not run: HellCor(x, y, pval.comp = TRUE) cor.test(x, y) # Pearson dcor.test(x, y, R = 200) ## End(Not run) ```

HellCor documentation built on July 2, 2020, 3:57 a.m.