conf2d: Bivariate (Two-Dimensional) Confidence Region
In r2d2: Bivariate (Two-Dimensional) Confidence Region and Frequency Distribution

Description

Calculate an empirical confidence region for two variables, and optionally overlay the smooth polygon on a scatterplot.

Usage

conf2d(x, ...)

## S3 method for class 'formula'
conf2d(formula, data, subset, ...)

## Default S3 method:
conf2d(x, y, level=0.95, n=200, method="wand", shape=1, smooth=50,
       plot=TRUE, add=FALSE, xlab=NULL, ylab=NULL, col.points="gray",
       col="black", lwd=2, ...)

conf2d_int(x, y, surf, level, n)  # internal function

Arguments

`x`	a vector of x values, or a data frame whose first two columns contain the x and y values.
`y`	a vector of y values.
`formula`	a `formula`, such as `y~x`.
`data`	a `data.frame`, `matrix`, or `list` from which the variables in `formula` should be taken.
`subset`	an optional vector specifying a subset of observations to be used.
`level`	the proportion of points that should be inside the region.
`n`	the number of regions to evaluate, before choosing the region that matches `level` best.
`method`	kernel smoothing function to use: `"wand"` or `"mass"`.
`shape`	a bandwidth scaling factor, affecting the polygon shape.
`smooth`	the number of bins (scalar or vector of length 2), affecting the polygon smoothness.
`plot`	whether to plot a scatterplot and overlay the region as a polygon.
`add`	whether to add a polygon to an existing plot.
`xlab`	a label for the x axis.
`ylab`	a label for the y axis.
`col.points`	color of points.
`col`	color of polygon.
`lwd`	line width of polygon.
`...`	further arguments passed to `plot` and `polygon`.
`surf`	a list whose first three elements are x coordinates, y coordinates, and a surface matrix.

Details

This function constructs a large number (n) of smooth polygons, and then chooses the polygon that comes closest to containing a given proportion (level) of the total points.

The default method="wand" calls the bkde2D kernel smoother from the KernSmooth package, while method="mass" calls kde2d from the MASS package.
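To illustrate the two smoothers named above, the sketch below calls each one directly on simulated data and inspects what it returns. The simulated data, the bandwidth choice, and the grid size are assumptions made for illustration only; they are not the settings that conf2d derives from shape and smooth.

```r
library(MASS)        # kde2d, bandwidth.nrd
library(KernSmooth)  # bkde2D

set.seed(42)
x <- rnorm(300)
y <- 0.6 * x + rnorm(300, sd = 0.8)

# MASS: returns a list with components x, y, z
surf_mass <- kde2d(x, y, n = 50)

# KernSmooth: takes a two-column matrix and one bandwidth per dimension,
# and returns a list with components x1, x2, fhat
bw <- c(bandwidth.nrd(x), bandwidth.nrd(y)) / 4   # illustrative choice only
surf_wand <- bkde2D(cbind(x, y), bandwidth = bw, gridsize = c(50, 50))

names(surf_mass)
names(surf_wand)
dim(surf_mass$z)
dim(surf_wand$fhat)
```

In both results the first three elements are x coordinates, y coordinates, and a surface matrix, which is the layout described for the surf argument.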

The conf2d function calls bkde2D or kde2d to compute a smooth surface from x and y. If users already have a smoothed surface to work from, the internal conf2d_int can be used directly to find the empirical confidence region that matches level best.
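The selection step described above can be sketched with base R plus MASS. This is a simplified illustration, not the package's implementation: the in_poly helper, the simulated data, the padded grid limits, and the evenly spaced contour heights are all assumptions, and holes inside a contour are ignored.

```r
library(MASS)  # kde2d

# Hypothetical helper: ray-casting test for points (px, py) against a
# polygon with vertices (vx, vy)
in_poly <- function(px, py, vx, vy) {
  inside <- rep(FALSE, length(px))
  j <- length(vx)
  for (i in seq_along(vx)) {
    crosses <- ((vy[i] > py) != (vy[j] > py)) &
      (px < (vx[j] - vx[i]) * (py - vy[i]) / (vy[j] - vy[i]) + vx[i])
    inside[crosses] <- !inside[crosses]
    j <- i
  }
  inside
}

set.seed(1)
x <- rnorm(500)
y <- 0.5 * x + rnorm(500)
level <- 0.95
n <- 50

# Smooth surface on a grid padded beyond the data, so contours close
surf <- kde2d(x, y, n = 50,
              lims = c(range(x) + c(-3, 3), range(y) + c(-3, 3)))

# n candidate contour heights strictly between the surface minimum and maximum
heights <- seq(min(surf$z), max(surf$z), length.out = n + 2)[-c(1, n + 2)]

# Proportion of points enclosed by the contour(s) at each height
prop <- sapply(heights, function(h) {
  cl <- contourLines(surf$x, surf$y, surf$z, levels = h)
  inside <- rep(FALSE, length(x))
  for (p in cl) inside <- inside | in_poly(x, y, p$x, p$y)
  mean(inside)
})

# Keep the candidate whose proportion is closest to the requested level
best <- which.min(abs(prop - level))
prop[best]
```

Evaluating more candidate heights gives a finer search, at the cost of more point-in-polygon tests.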

Value

List containing five elements:

`x`	x coordinates defining the region.
`y`	y coordinates defining the region.
`inside`	logical vector indicating which of the original data coordinates are inside the region.
`area`	area inside the region.
`prop`	actual proportion of points inside the region.

Note

The area of a bivariate region is analogous to the width of a univariate interval, which allows a quantitative comparison of different confidence regions.
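Because a region is returned as polygon vertices, its area can be cross-checked with the shoelace formula. The helper name poly_area is hypothetical, and this page does not state how conf2d itself computes area; the sketch only shows one standard way to get an area from x and y coordinates.

```r
# Shoelace formula for the area of a simple polygon given its vertices in order
poly_area <- function(x, y) {
  xn <- c(x[-1], x[1])   # next vertex, wrapping around to close the polygon
  yn <- c(y[-1], y[1])
  abs(sum(x * yn - xn * y)) / 2
}

# Sanity checks on shapes with known areas
poly_area(c(0, 1, 1, 0), c(0, 0, 1, 1))   # unit square: 1
poly_area(c(0, 3, 0), c(0, 0, 4))         # right triangle with legs 3 and 4: 6
```

Applied to the x and y elements of two regions, a function like this yields areas that can be compared directly.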

Ellipses are a more restrictive approach to calculating an empirical bivariate confidence region. Smooth polygons make fewer assumptions about how x and y covary.
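For contrast, an elliptical region can be sketched with Mahalanobis distances in base R. This is an illustrative alternative and not something conf2d does; the simulated data and the chi-squared cutoff (which assumes roughly bivariate normal data) are assumptions.

```r
set.seed(7)
x <- rnorm(1000)
y <- 0.5 * x + rnorm(1000)
xy <- cbind(x, y)

# Squared Mahalanobis distance of each point from the sample mean
d2 <- mahalanobis(xy, center = colMeans(xy), cov = cov(xy))

# Points inside the 95% ellipse under a bivariate normal assumption
inside <- d2 <= qchisq(0.95, df = 2)
mean(inside)

# Empirical variant: choose the cutoff from the observed distances instead
inside_emp <- d2 <= quantile(d2, 0.95)
mean(inside_emp)
```

Either way the region is forced to be an ellipse centred on the mean, which is the restriction that a smooth polygon avoids.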

The conf2d and freq2d functions are closely related. The advantage of conf2d is that it returns a region as a smooth polygon. The advantage of freq2d is that it returns a set that is guaranteed to contain the correct proportion of points, even for spatially complex datasets.

Author(s)

Arni Magnusson and Julian Burgos, based on an earlier function by Gregory R. Warnes.

Examples

library(r2d2)

# Actual proportion of points inside the region (level=0.95 by default)
conf2d(Ushape)$prop

# Semi-transparent points with a thicker polygon outline
conf2d(saithe, pch=16, cex=1.2, col.points=rgb(0,0,0,0.1), lwd=3)

# First surface, then region
plot(saithe, col="gray")
surf <- MASS::kde2d(saithe$Bio, saithe$HR, h=0.25, n=100)
region <- conf2d_int(saithe$Bio, saithe$HR, surf, level=0.95, n=200)
polygon(region, lwd=2)

r2d2 documentation built on Oct. 22, 2024, 9:07 a.m.