classify: Classification into groups

View source: R/classify.R

classify    R Documentation

Classification into groups

Description

Classify continuous values into categories with different methods:
- linearly or logarithmically spaced equal intervals,
- intervals based on quantiles (equally filled bins),
- intervals based on distance from the mean in normal distributions,
- user specified class borders (e.g. for legal or critical limits).

Usage

classify(
  x,
  method = "linear",
  breaks = NULL,
  Range = range(x, finite = TRUE),
  col = NULL,
  sdlab = 1,
  logbase = 1,
  quiet = FALSE,
  ...
)

Arguments

x

Vector with numeric values

method

Character string (partial matching is performed). Classification method (type of binning) to compute the class breakpoints. See section Details. DEFAULT: "linear"

breaks

Specification for method, see Details. DEFAULT: NULL (different defaults for each method)

Range

Ends of intervals. DEFAULT: range(x, finite=TRUE)

col

Function that will return a color palette, e.g. seqPal. If given, a vector of colors is returned instead of the regular list. DEFAULT: NULL (ignored)

sdlab

Type of label and breakpoints if method="sd". 1 means -0.5 sd, 0.5 sd; 2 means -1 sd, mean, 1 sd; 3 means actual numbers for type 1; 4 means actual numbers for type 2. DEFAULT: 1

logbase

Base for logSpaced. Used only if not 1 and method="log". DEFAULT: 1

quiet

Suppress warnings, e.g. for values outside Range? DEFAULT: FALSE

...

Further arguments passed to the function col.
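
The partial matching of method can be illustrated with base R's pmatch. This is a minimal sketch, assuming the five method names listed in the table in section Details. The vector `methods` and the helper `match_method` are hypothetical names used only for illustration and are not necessarily how classify handles this internally.

```r
# Hypothetical illustration of partial matching with base R (not package code)
methods <- c("linear", "log", "quantile", "sd", "custom")

match_method <- function(method) {
  # pmatch returns NA for ambiguous or unknown abbreviations
  i <- pmatch(tolower(method), methods)
  if (is.na(i)) stop("method '", method, "' is ambiguous or unknown")
  methods[i]
}

match_method("lin")  # unique prefix of "linear"
match_method("q")    # unique prefix of "quantile"
match_method("log")  # exact match takes precedence
# match_method("l") would stop: "l" fits both "linear" and "log"
```

Under this scheme, an abbreviation has to be long enough to identify exactly one method, which is why the Examples use "lin" rather than "l".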

Details

The binning methods are explained very nicely on the page linked in the section References.
nbins indicates the number of classes (and thus, colors).

method | explanation | meaning of breaks | default
---------- | ----------- | ----------- | -------
linear | nbins equally spaced classes | nbins | 100
log | nbins logarithmically spaced | nbins | 100
quantile | classes have equal number of values | the quantiles (or number of them) | 0:4/4
sd | normal distributions | number of sd in one direction from the mean | 3
custom | user-given breakpoints | breakpoint values (including ends of Range) | none

The default method is "linear" (equal intervals), which makes sense for my original intent of plotting lake depth (bathymetry measured at irregularly distributed points) on a linear color scale.
This is the workhorse for colPoints.
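
How the class borders in the table above could be derived can be sketched in base R. This is an illustration under stated assumptions, not the code of classify: `sketch_breaks` is a hypothetical helper, the "log" branch uses plain geometric spacing instead of logSpaced and ignores logbase, the "quantile" branch only accepts a vector of probabilities, and the "sd" branch places borders at whole multiples of the standard deviation around the mean.

```r
# Sketch of breakpoint computation per method, base R only (assumptions, not package code)
sketch_breaks <- function(x, method = "linear", breaks = NULL,
                          Range = range(x, finite = TRUE)) {
  if (method == "linear") {
    nbins <- if (is.null(breaks)) 100 else breaks
    # nbins equally wide classes need nbins + 1 borders
    seq(Range[1], Range[2], length.out = nbins + 1)
  } else if (method == "log") {
    nbins <- if (is.null(breaks)) 100 else breaks
    # geometric spacing, requires Range[1] > 0
    exp(seq(log(Range[1]), log(Range[2]), length.out = nbins + 1))
  } else if (method == "quantile") {
    probs <- if (is.null(breaks)) 0:4/4 else breaks
    # borders chosen so that the classes hold roughly equal numbers of values
    unname(quantile(x, probs = probs, na.rm = TRUE))
  } else if (method == "sd") {
    nsd <- if (is.null(breaks)) 3 else breaks
    # nsd standard deviations in each direction from the mean
    mean(x, na.rm = TRUE) + (-nsd:nsd) * sd(x, na.rm = TRUE)
  } else if (method == "custom") {
    breaks
  } else stop("unknown method: ", method)
}

x <- c(1:10, 20)
b_lin <- sketch_breaks(x, "linear", breaks = 4)  # 5 borders, 4 classes
b_q   <- sketch_breaks(x, "quantile")            # quartile borders
b_log <- sketch_breaks(x, "log", breaks = 2)     # geometrically spaced borders
# class number per value; the maximum is counted into the last class
index <- findInterval(x, b_lin, rightmost.closed = TRUE)
```

With the linear borders, the single large value 20 sits alone in the last class while class 3 stays empty. The quantile borders spread the values more evenly across classes, which is the difference the page in section References discusses.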

Value

If col=NULL, a list with class numbers (index) and other elements for colPoints. If col is a palette function, a vector of colors.
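
The relation between the class numbers and the returned colors can be sketched as follows. This is an assumed mechanism shown with base R only: heat.colors stands in for a palette function such as seqPal, and the names `nbins`, `borders`, `index`, `pal` and `cols` are chosen for illustration.

```r
# Sketch: turning class numbers into colors (assumed mechanism, base R only)
x <- c(1:10, 20)
nbins <- 5
borders <- seq(min(x), max(x), length.out = nbins + 1)
# class number per value; the maximum is counted into the last class
index <- findInterval(x, borders, rightmost.closed = TRUE)

pal <- grDevices::heat.colors  # stand-in for a palette function such as seqPal
cols <- pal(nbins)[index]      # one color per value, picked by its class number
```

The vector `cols` has one entry per element of x and could be passed on directly, e.g. with plot(x, col=cols, pch=16).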

Author(s)

Berry Boessenkool, berry-b@gmx.de, 2014

References

See this page on the effect of classification (binning) methods:
http://uxblog.idvsolutions.com/2011/10/telling-truth.html

See Also

colPoints

Examples


library(berryFunctions) # provides classify, seqPal, divPal, colPointsLegend, owa, pretty2
classify( c(1:10, 20), "lin", breaks=12)
classify( c(1:10, 20), "q", breaks=0:10/10)
classify( c(1:10, 20), "s", sdlab=2 )
classify( c(1:10, 20), "s", sdlab=1, breaks=2 )
classify( c(1:10, 20), "c", breaks=c(5,27) )
classify( c(1:10, 20), "log")

cols <- classify( c(1:10, 20), col=seqPal) ; cols
plot(c(1:10, 20), col=cols, pch=16, cex=2)

set.seed(42); rz <- rnorm(30, mean=350, sd=120)
plot(1)
classleg <- function(method="linear", breaks=100, sdlab=1, logbase=1, ...)
           do.call(colPointsLegend, owa(
           classify(rz, method=method, breaks=breaks, sdlab=sdlab, logbase=logbase),
           list(z=rz, title="", ...))   )
classleg(br=3, met="s", col=divPal(5),mar=c(0,3,1,0),hor=FALSE,x1=0.1,x2=0.25)
classleg(br=3, met="s", col=divPal(6),mar=c(0,3,1,0),hor=FALSE,x1=0.25,x2=0.4, sdlab=2)
classleg(y1=0.85, y2=1)
classleg(br=20, met="log", y1=0.70, y2=0.85)
classleg(br=20, met="log", y1=0.55, y2=0.70, logbase=1.15)
classleg(br=20, met="log", y1=0.45, y2=0.60, logbase=0.90)
classleg(br= 5, met="q", y1=0.30, y2=0.45) # quantiles: each color is equally often used
classleg(met="q", y1=0.15, y2=0.30, breaks=0:15/15, at=pretty2(rz), labels=pretty2(rz) )


berryFunctions documentation built on May 29, 2024, 4:01 a.m.