nclass: Compute the Number of Classes for a Histogram

 nclass R Documentation

Compute the Number of Classes for a Histogram

Description

Compute the number of classes for a histogram.

Usage

```
nclass.Sturges(x)
nclass.scott(x)
nclass.FD(x)
```

Arguments

 `x` a data vector.

Details

`nclass.Sturges` uses Sturges' formula, implicitly basing bin sizes on the range of the data.
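
As a rough illustration, Sturges' formula is commonly written as `ceiling(log2(n) + 1)` for `n` observations. The sketch below assumes that form; `sturges_sketch` is a hypothetical name used only here and is not part of the package.

```r
# Hypothetical helper: Sturges' formula as commonly stated, ceiling(log2(n) + 1).
sturges_sketch <- function(x) ceiling(log2(length(x)) + 1)

x <- c(2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9)  # n = 8
k <- sturges_sketch(x)                           # log2(8) + 1 = 4
k

# The class count depends only on n; the implied bin width is range / k.
diff(range(x)) / k
```

The data values enter only through the range that is divided into `k` classes, which is the sense in which bin sizes are based on the range.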

`nclass.scott` uses Scott's choice for a normal distribution, based on the estimated standard deviation of the data; if that estimate is zero, it returns `1`.
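
Scott's rule is usually given as a bin width of `3.5 * s * n^(-1/3)`, where `s` is the sample standard deviation; dividing the data range by that width gives a class count. A minimal sketch under that assumption follows. `scott_sketch` is a hypothetical name, and the packaged function may differ in detail.

```r
# Hypothetical helper: class count from Scott's normal-reference bin width.
scott_sketch <- function(x) {
  h <- 3.5 * stats::sd(x) * length(x)^(-1/3)  # suggested bin width
  # Zero spread gives a zero width, so fall back to a single class.
  if (h > 0) max(1, ceiling(diff(range(x)) / h)) else 1
}

set.seed(1)
x <- stats::rnorm(1111)
scott_sketch(x)
scott_sketch(rep(1, 11))  # constant data: returns 1
```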

`nclass.FD` uses the Freedman-Diaconis choice, based on the inter-quartile range (`IQR(signif(x, 5))`). If that is zero, it tries increasingly extreme symmetric quantiles, up to `c(1, 511)/512`; if that difference is still zero, it reverts to Scott's choice.
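
The Freedman-Diaconis rule is usually stated as a bin width of `2 * IQR * n^(-1/3)`. The sketch below covers only the ordinary case of a positive inter-quartile range; it deliberately omits the fallback to more extreme quantiles described above and returns `NA` instead. `fd_sketch` is a hypothetical name, not a function from the package.

```r
# Hypothetical helper: Freedman-Diaconis class count, positive-IQR case only.
fd_sketch <- function(x) {
  iqr <- stats::IQR(signif(x, 5))   # IQR of the data rounded to 5 significant digits
  if (iqr == 0) return(NA_real_)    # the fallback quantile search is not sketched here
  h <- 2 * iqr * length(x)^(-1/3)   # suggested bin width
  max(1, ceiling(diff(range(x)) / h))
}

set.seed(1)
x <- stats::rnorm(1111)
fd_sketch(x)
fd_sketch(rep(1, 11))  # zero IQR: this sketch gives NA rather than a fallback value
```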

Value

The suggested number of classes.
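
The value is a suggestion, and a plotting function may adjust it. As a sketch, assuming `hist` from package graphics, which treats a single-number `breaks` argument as a target and then chooses pretty breakpoints, the realized number of cells can differ from the suggested one:

```r
set.seed(1)
x <- stats::rnorm(1111)

k <- grDevices::nclass.FD(x)                      # suggested number of classes
h <- graphics::hist(x, breaks = k, plot = FALSE)  # compute bins without drawing

# Compare the suggestion with the number of cells hist() actually used.
c(suggested = k, actual = length(h$counts))
```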

References

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S-PLUS. Springer, page 112.

Freedman, D. and Diaconis, P. (1981). On the histogram as a density estimator: L_2 theory. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete, 57, 453–476. doi:10.1007/BF01025868.

Scott, D. W. (1979). On optimal and data-based histograms. Biometrika, 66, 605–610. doi:10.2307/2335182.

Scott, D. W. (1992) Multivariate Density Estimation. Theory, Practice, and Visualization. Wiley.

Sturges, H. A. (1926). The choice of a class interval. Journal of the American Statistical Association, 21, 65–66. doi:10.1080/01621459.1926.10502161.

See Also

`hist` and `truehist` (package MASS); `dpih` (package KernSmooth) for a plug-in bandwidth proposed by Wand (1995).

Examples

```
set.seed(1)
x <- stats::rnorm(1111)
nclass.Sturges(x)

## Compare them:
NC <- function(x) c(Sturges = nclass.Sturges(x),
                    Scott = nclass.scott(x), FD = nclass.FD(x))
NC(x)
onePt <- rep(1, 11)
NC(onePt) # no longer gives NaN
```