We consider histogram-valued data, i.e., data described by univariate histograms. The methods and the basic statistics for histogram-valued data are mainly based on the L2 Wasserstein metric between distributions, i.e., a Euclidean metric between quantile functions. The package contains unsupervised classification techniques, least square regression and tools for histrogram-valued data and for histogram time series.
~~ An overview of how to use the package, including the most important functions ~~
Antonio Irpino <[email protected]>
Irpino, A., Verde, R. (2015) Basic
statistics for distributional symbolic variables: a new metric-based
approach, Advances in Data Analysis and Classification, Volume 9, Issue 2, pp 143–175.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
# Generating a list of distributions a<-vector("list",4) a[]<-distributionH(x=c(80,100,120,135,150,165,180,200,240), p=c(0,0.025,0.1,0.275,0.525,0.725,0.887,0.975,1)) a[]<-distributionH(x=c(80,100,120,135,150,165,180,195,210,240), p=c(0,0.013,0.101,0.255,0.508,0.718,0.895,0.961,0.987,1)) a[]<-distributionH(x=c(95,110,125,140,155,170,185,200,215,230,245), p=c(0,0.012,0.041,0.154,0.36,0.595,0.781,0.929,0.972,0.992,1)) a[]<-distributionH(x=c(105,120,135,150,165,180,195,210,225,240,260), p=c(0,0.009,0.035,0.081,0.186,0.385,0.633,0.832,0.932,0.977,1)) # Generating a list of names of observations namerows<-list( 'u1' , 'u2') # Generating a list of names of variables namevars<-list( 'Var_1' , 'Var_2') # creating the MatH Mat_of_distributions<-MatH(x=a, nrows = 2, ncols = 2, rownames=namerows, varnames=namevars, by.row=FALSE )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.