| histogram_stats | R Documentation |
Functions to compute the mean, variance, covariance, and correlation of histogram-valued data.
hist_mean(x, var_name, method = "BG", ...)
hist_var(x, var_name, method = "BG", ...)
hist_cov(x, var_name1, var_name2, method = "BG", ...)
hist_cor(x, var_name1, var_name2, method = "BG", ...)
| x | histogram-valued data object. |
| var_name | the variable name or the column location. |
| method | method to calculate statistics. One of "BG" (default), "BD", "B", or "L2W". |
| ... | additional parameters. |
| var_name1 | the variable name or the column location of the first variable. |
| var_name2 | the variable name or the column location of the second variable. |
Four functions are provided:
hist_mean: Compute the mean of histogram-valued data.
hist_var: Compute the variance of histogram-valued data.
hist_cov: Compute the covariance between two histogram-valued variables.
hist_cor: Compute the correlation between two histogram-valued variables.
Four methods are supported for all functions:
BG: Bertrand and Goupil (2000) method. Uses histogram bin boundaries and probabilities to compute first and second moments.
BD: Billard and Diday (2006) method. A signed decomposition using the sign of each bin's midpoint deviation from the overall mean and a quadratic form on the bin boundaries.
B: Billard (2008) method. Uses cross-products of deviations of the bin boundaries from the overall mean.
L2W: L2 Wasserstein method. Uses optimal-transport (Wasserstein) distances between the quantile functions of the histogram distributions.
For the mean, BG, BD, and B return the same value because they share the same first-order moment definition; only L2W uses a different (quantile-based) mean. For variance, covariance, and correlation, all four methods generally produce different results.
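As a rough illustration of the moment-based idea behind the BG method, the sketch below computes a mean and a variance from bin boundaries and probabilities in base R. It assumes the commonly cited form in which each bin is treated as uniform, so that a bin [a, b] contributes (a + b)/2 to the first moment and (a^2 + ab + b^2)/3 to the second, and it divides by n. The list layout and the names bg_mean_sketch and bg_var_sketch are hypothetical and are not part of the package, whose implementation may differ (for example in the divisor).

```r
# Hypothetical toy data: each observation is one histogram, stored as
# lower bin bounds `a`, upper bin bounds `b`, and bin probabilities `p`.
obs <- list(
  list(a = c(0, 2), b = c(2, 4), p = c(0.5, 0.5)),
  list(a = c(1, 3), b = c(3, 5), p = c(0.25, 0.75))
)

# First moment: average over observations of the probability-weighted
# bin midpoints (assumes values are uniform within each bin).
bg_mean_sketch <- function(obs) {
  per_obs <- vapply(obs, function(h) sum(h$p * (h$a + h$b) / 2), numeric(1))
  mean(per_obs)
}

# Second moment of a uniform bin [a, b] is (a^2 + a*b + b^2) / 3.
# Variance = average second moment minus the squared overall mean
# (divisor n is an assumption of this sketch).
bg_var_sketch <- function(obs) {
  per_obs <- vapply(
    obs,
    function(h) sum(h$p * (h$a^2 + h$a * h$b + h$b^2) / 3),
    numeric(1)
  )
  mean(per_obs) - bg_mean_sketch(obs)^2
}

bg_mean_sketch(obs)
bg_var_sketch(obs)
```

Because the second moment includes the spread inside each bin, this variance is positive even when all observations share the same mean.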
For hist_cor, the BG, BD, and B correlations all use the
Bertrand-Goupil standard deviation S(Y) in the denominator, following
Irpino and Verde (2015, Eqs. 30–32). Only the L2W method uses its own
Wasserstein-based standard deviation in the denominator.
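The shared-denominator rule for BG, BD, and B can be sketched as a small helper in base R. The function name cor_shared_denominator and the input values are hypothetical and purely illustrative; the sketch only shows that a method-specific covariance is divided by the two Bertrand-Goupil standard deviations, and it is not the package's implementation.

```r
# Hypothetical helper: turn a covariance from method BG, BD, or B into a
# correlation by dividing by the Bertrand-Goupil standard deviations.
#   cov_value : covariance from the chosen method (BG, BD, or B)
#   var_bg_1  : BG variance of the first variable
#   var_bg_2  : BG variance of the second variable
cor_shared_denominator <- function(cov_value, var_bg_1, var_bg_2) {
  stopifnot(var_bg_1 > 0, var_bg_2 > 0)
  cov_value / (sqrt(var_bg_1) * sqrt(var_bg_2))
}

# Made-up numbers for illustration only: the covariance changes with the
# method, while the denominator stays the same.
cor_shared_denominator(cov_value = 1.2, var_bg_1 = 4, var_bg_2 = 9)
cor_shared_denominator(cov_value = 0.9, var_bg_1 = 4, var_bg_2 = 9)
```

With a common denominator, differences among the BG, BD, and B correlations come only from their covariance terms.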
A numeric value or vector for hist_mean and hist_var; a single numeric value for hist_cov and hist_cor.
Po-Wei Chen, Han-Ming Wu
int_mean, int_var, int_cov, int_cor
library(HistDAWass)
x <- HistDAWass::BLOOD
hist_mean(x, var_name = "Cholesterol", method = "BG")
hist_mean(x, var_name = "Cholesterol", method = "BD")
hist_var(x, var_name = "Cholesterol", method = "BG")
hist_var(x, var_name = "Cholesterol", method = "BD")
hist_cov(x, var_name1 = "Cholesterol", var_name2 = "Hemoglobin", method = "BG")
hist_cor(x, var_name1 = "Cholesterol", var_name2 = "Hemoglobin", method = "BG")