Multiple regression analysis for histogram variables based on a two-component model and L2 Wasserstein distance


The function implements multiple regression analysis for histogram variables based on a two-component model and the L2 Wasserstein distance. Taking as input a dependent histogram variable and a set of explanatory histogram variables, the method returns a least squares estimate of a two-component regression model based on the decomposition of the L2 Wasserstein metric for distributional data.


WH.regression.two.components(data, Yvar, Xvars, simplify = FALSE, qua = 20)



data: A MatH object (a matrix of distributionH).


Yvar: An integer, the index of the dependent variable in data.


Xvars: A set of integers, the indices of the explanatory variables in data.


simplify: A logical value (default FALSE). If TRUE, only a few equally spaced quantiles are considered, which speeds up the algorithm.


qua: If simplify = TRUE, the number of quantiles to consider (default 20).


A two-component regression model is implemented. The observed variables are histogram variables, as defined in the framework of Symbolic Data Analysis, and the parameters of the model are estimated using the classic least squares method. The error between the observed and the predicted distributions is measured with the L2 Wasserstein distance. This metric makes it possible to predict the response variable as a direct linear combination of the independent histogram variables.
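The decomposition behind the two components can be sketched as follows. The notation is assumed here for illustration and does not appear in this page: Q_f(t) is the quantile function of a distribution f, \mu_f its mean, and Q^c_f(t) = Q_f(t) - \mu_f the centered quantile function. The model form in the second equation is a reading of the cited paper, not something this page states, so treat it as a sketch.

```latex
% Squared L2 Wasserstein distance between two distributions f and g,
% split into a location part and a part for the centered quantile functions.
% The cross term vanishes because \int_0^1 Q^c_f(t)\,dt = \int_0^1 Q^c_g(t)\,dt = 0.
\[
d_W^2(f, g) = \int_0^1 \bigl(Q_f(t) - Q_g(t)\bigr)^2 \, dt
            = (\mu_f - \mu_g)^2 + \int_0^1 \bigl(Q^c_f(t) - Q^c_g(t)\bigr)^2 \, dt
\]

% Assumed form of the two-component model for observation i with p explanatory
% variables: one set of coefficients acts on the means, the other on the
% centered quantile functions.
\[
\hat{y}_i(t) = \beta_0 + \sum_{j=1}^{p} \beta_j \, \bar{x}_{ij}
             + \sum_{j=1}^{p} \gamma_j \, x^c_{ij}(t),
\qquad \gamma_j \ge 0
\]
```

Under this reading, the non-negativity of the \gamma_j keeps the predicted quantile function non-decreasing, so the prediction remains a valid distribution.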


A named vector with the estimated model parameters.


Irpino A, Verde R (2015, in press). Linear regression for numeric symbolic variables: a least squares approach based on Wasserstein distance. Advances in Data Analysis and Classification, ISSN: 1862-5347, DOI: 10.1007/s11634-015-0197-7
An extended version is available in the arXiv repository: arXiv:1202.1436v2


model.parameters <- WH.regression.two.components(data = BLOOD, Yvar = 1, Xvars = c(2:3))
