overlap: Estimate the overlapping measure.

overlap    R Documentation

Description

Returns the estimated overlapping area between two or more kernel density estimates computed from empirical data. The overlapping measure can be computed either as the integral of the minimum of two densities (type = "1") or as the proportion of overlapping area between two densities (type = "2"). In the latter case, the integral of the minimum of the two densities is divided by the integral of their maximum.
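As a compact restatement of the two definitions above, the indices can be written as follows. The symbols f_1, f_2 (the two estimated densities) and \eta_1, \eta_2 (the two indices) are notation assumed here for illustration; they do not appear in the package documentation.

```latex
\eta_1 = \int \min\{ f_1(x),\, f_2(x) \}\, dx
\qquad
\eta_2 = \frac{\int \min\{ f_1(x),\, f_2(x) \}\, dx}
              {\int \max\{ f_1(x),\, f_2(x) \}\, dx}
```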

Usage

overlap( x, nbins = 1024, type = c( "1", "2" ), 
    pairsOverlap = TRUE, plot = FALSE, boundaries = NULL, 
    get_xpoints = FALSE, ... )

Arguments

x

a list of numerical vectors to be compared (each vector is an element of the list).

nbins

number of equally spaced points at which the density estimates are evaluated and compared; see density for details.

type

character, type of index. type = "1" returns the integral of the minimum of the densities; type = "2" returns the proportion of the overlapped area between two or more densities; see Details.

pairsOverlap

logical, if TRUE (default) returns the overlapped area for each pair of distributions.

plot

logical, if TRUE, the final plot of estimated densities and overlapped areas is produced.

boundaries

an optional vector of two numbers giving the minimum and the maximum of a predefined subset of the support of the empirical densities; see Details.

get_xpoints

logical, if TRUE returns the abscissas of the points of intersection among the densities. Note: this works only if pairsOverlap = FALSE.

...

optional arguments to be passed to the function density.

Details

When dealing with two densities: type = "1" corresponds to the integral of the minimum between the two densities; type = "2" corresponds to the proportion of the overlapped area over the total area.
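The two indices can be reproduced by hand with base R, which may help when checking what the function reports. This is a minimal sketch, not the package's own code: the shared grid limits, the rectangle-rule integration, and the names n_grid, lims, ov_type1 and ov_type2 are assumptions made for illustration, so the numbers need not match overlap() exactly.

```r
# Hand-rolled sketch of both indices for two samples (base R only).
set.seed(1)
a <- rnorm(200)
b <- rnorm(200, mean = 1)

n_grid <- 1024                      # plays the role of nbins
lims   <- range(c(a, b))            # shared grid limits (an assumption)

# Evaluate both kernel density estimates on the same equally spaced grid
da <- density(a, n = n_grid, from = lims[1], to = lims[2])
db <- density(b, n = n_grid, from = lims[1], to = lims[2])
dx <- diff(da$x[1:2])               # grid spacing

area_min <- sum(pmin(da$y, db$y)) * dx   # integral of the pointwise minimum
area_max <- sum(pmax(da$y, db$y)) * dx   # integral of the pointwise maximum

ov_type1 <- area_min                # analogue of type = "1"
ov_type2 <- area_min / area_max     # analogue of type = "2"
```

Because the maximum of two densities integrates to more than either density alone whenever they differ, the type = "2" analogue is typically smaller than the type = "1" analogue for the same data.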

If the list x contains more than two elements (i.e. more than two distributions), the function computes both the multiple and the pairwise overlapping among all distributions.
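To make the distinction between pairwise and multiple overlapping concrete, the following base R sketch computes a type = "1" style overlap for every pair of samples and for all samples jointly. It is an illustration under stated assumptions, not the implementation used by the package: the helper pair_overlap, the common grid, and the "X1-X2" naming of pairs are all hypothetical.

```r
# Sketch: pairwise and joint overlaps for a list of samples (base R only).
set.seed(2)
x <- list(X1 = rnorm(100), X2 = rnorm(100, 0.5), X3 = rnorm(100, 2))

n_grid <- 512
lims   <- range(unlist(x))
# One density curve per sample, all evaluated on the same grid
dens <- lapply(x, function(v) density(v, n = n_grid, from = lims[1], to = lims[2])$y)
dx   <- diff(lims) / (n_grid - 1)   # grid spacing

# Hypothetical helper: integral of the pointwise minimum of two density curves
pair_overlap <- function(i, j) sum(pmin(dens[[i]], dens[[j]])) * dx

pairs   <- combn(names(x), 2)       # all unordered pairs, one per column
ov_pair <- apply(pairs, 2, function(p) pair_overlap(p[1], p[2]))
names(ov_pair) <- apply(pairs, 2, paste, collapse = "-")

# Area shared by all densities at once: pointwise minimum across the whole list
ov_all <- sum(do.call(pmin, unname(dens))) * dx
```

The joint value can never exceed the smallest pairwise value, since the minimum across all curves is at most the minimum across any two of them.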

If plot = TRUE, all the overlapped areas are plotted. Plotting requires the ggplot2 package.

The optional vector boundaries has to contain two numbers for the empirical minimum and maximum of the overlapped area. See examples below.

Value

It returns a list containing the following components:

OV

estimate of the overlapped area; if x contains more than two elements then a vector of estimates is returned.

xpoints

a list of the abscissas of the intersection points among the densities (only if get_xpoints = TRUE).

OVpairs

the estimates of overlapped areas for each pair of densities (only if x contains more than two elements).
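For readers wondering what the intersection points in xpoints represent, the sketch below locates crossings of two density curves by hand in base R. It only illustrates the idea: detecting sign changes and interpolating linearly is an assumption made here, not necessarily how the package finds the points, and the names d, idx and xcross are hypothetical.

```r
# Sketch: abscissas where two kernel density estimates cross (base R only).
set.seed(3)
a <- rnorm(300)
b <- rnorm(300, mean = 2)

lims <- range(c(a, b))
da <- density(a, n = 1024, from = lims[1], to = lims[2])
db <- density(b, n = 1024, from = lims[1], to = lims[2])

d   <- da$y - db$y                   # difference between the two curves
idx <- which(diff(sign(d)) != 0)     # grid cells in which the sign of d changes

# Linear interpolation of the crossing abscissa inside each such cell
xcross <- da$x[idx] - d[idx] * (da$x[idx + 1] - da$x[idx]) / (d[idx + 1] - d[idx])
```

With these two samples one crossing is expected between the two means; additional crossings can appear in the tails, where both estimates are close to zero.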

Note

Calls the function ovmult.

Author(s)

Massimiliano Pastore, Pierfrancesco Alaimo Di Loro, Marco Mingione

References

Pastore, M. (2018). Overlapping: a R package for Estimating Overlapping in Empirical Distributions. The Journal of Open Source Software, 3 (32), 1023. doi: 10.21105/joss.01023

Pastore, M., Calcagnì, A. (2019). Measuring Distribution Similarities Between Samples: A Distribution-Free Overlapping Index. Frontiers in Psychology, 10:1089. doi: 10.3389/fpsyg.2019.01089

Examples

library(overlapping)  # provides overlap()
# three samples drawn from different distributions
set.seed(20150605)
x <- list(X1=rnorm(100), X2=rt(50,8), X3=rchisq(80,2))
overlap(x, plot=TRUE)

# including boundaries
x <- list(X1=runif(100), X2=runif(100,.5,1))
overlap(x, plot=TRUE, boundaries=c(.5,1))

x <- list(X1=runif(100), X2=runif(50), X3=runif(30))
overlap(x, plot=TRUE, boundaries=c(.1,.9))

# changing kernel
overlap(x, plot=TRUE, kernel="rectangular")

# normalized overlap: type = "2" divides by the integral of the maximum
N <- 1e5
x <- list(X1=runif(N),X2=runif(N,.5))
overlap(x)
overlap(x, type = "2")



overlapping documentation built on Dec. 28, 2022, 2:13 a.m.