iqr: Interquartile Range In EnvStats: Package for Environmental Statistics, Including US EPA Guidance

Description

Compute the interquartile range for a set of data.

Usage

 iqr(x, na.rm = FALSE)

Arguments

 x      numeric vector of observations.

 na.rm  logical scalar indicating whether to remove missing values from x. If na.rm=FALSE (the default) and x contains missing values, then a missing value (NA) is returned. If na.rm=TRUE, missing values are removed from x prior to computing the interquartile range.
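The missing-value behavior described for na.rm can be sketched in base R. The helper name iqr_na_sketch is hypothetical and is not part of EnvStats; it only mirrors the documented semantics and delegates the calculation to stats::IQR, which is an assumption about how the result would be obtained.

```r
# Hypothetical helper (not the EnvStats source): mirrors the documented
# handling of missing values.
iqr_na_sketch <- function(x, na.rm = FALSE) {
  if (anyNA(x)) {
    if (!na.rm) return(NA_real_)   # documented default: NA is returned
    x <- x[!is.na(x)]              # otherwise drop missing values first
  }
  stats::IQR(x)
}

x <- c(2, 4, NA, 6, 8)
iqr_na_sketch(x)                 # NA
iqr_na_sketch(x, na.rm = TRUE)   # 3 (quartiles 3.5 and 6.5 of 2, 4, 6, 8)
```

Note that base R's IQR() stops with an error when x contains NA and na.rm=FALSE, whereas the behavior documented here is to return NA.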

Details

Let $\underline{x}$ denote a random sample of $n$ observations from some distribution associated with a random variable $X$. The sample interquartile range is defined as:

$IQR = \hat{X}_{0.75} - \hat{X}_{0.25} \qquad (1)$

where $X_p$ denotes the $p$'th quantile of the distribution and $\hat{X}_p$ denotes the estimate of this quantile (i.e., the sample $p$'th quantile).

See the R help file for quantile for information on how sample quantiles are computed.
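As an illustration of equation (1), the sketch below computes the difference between the sample quartiles with quantile() in base R. It assumes the default quantile algorithm (type = 7); the data and variable names are illustrative, and nothing here is taken from the EnvStats source.

```r
x <- c(1, 3, 5, 7, 9, 11, 13, 15)

# Sample 0.25 and 0.75 quantiles (default type = 7 in stats::quantile)
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)

# Equation (1): upper quartile estimate minus lower quartile estimate
iqr_manual <- q[2] - q[1]

# Base R's IQR() uses the same default quantile definition
all.equal(iqr_manual, IQR(x))
```

Other values of the type argument to quantile() can give slightly different quartile estimates, and therefore a slightly different interquartile range, for small samples.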

Value

A numeric scalar – the interquartile range.

Note

The interquartile range is a robust estimate of the spread of the distribution. It is approximately the distance between the two ends of the box in a boxplot (see the R help file for boxplot). For a normal distribution with standard deviation σ it can be shown that:

$IQR = 1.34898 \, \sigma \qquad (2)$
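The constant in equation (2) follows from the normal quantile function: the quartiles of a normal distribution lie at the mean plus or minus qnorm(0.75) standard deviations, so the interquartile range is 2 * qnorm(0.75) times σ. A quick check in base R (the mean and sigma values below are arbitrary choices for illustration):

```r
# Multiplier for a standard normal: upper quartile minus lower quartile
k <- qnorm(0.75) - qnorm(0.25)
round(k, 5)   # 1.34898

# Same relation for a normal distribution with an arbitrary mean and sd
sigma <- 2
iqr_theory <- qnorm(0.75, mean = 10, sd = sigma) -
  qnorm(0.25, mean = 10, sd = sigma)
all.equal(iqr_theory, k * sigma)
```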

Author(s)

Steven P. Millard

References

Chambers, J.M., W.S. Cleveland, B. Kleiner, and P.A. Tukey. (1983). Graphical Methods for Data Analysis. Duxbury Press, Boston, MA.

Cleveland, W.S. (1993). Visualizing Data. Hobart Press, Summit, New Jersey.

Helsel, D.R., and R.M. Hirsch. (1992). Statistical Methods in Water Resources Research. Elsevier, New York, NY.

Hirsch, R.M., D.R. Helsel, T.A. Cohn, and E.J. Gilroy. (1993). Statistical Analysis of Hydrologic Data. In: Maidment, D.R., ed. Handbook of Hydrology. McGraw-Hill, New York, Chapter 17, pp.5–7.

Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ.

See Also

Summary Statistics, summaryFull, var, sd.

Examples

  # Generate 20 observations from a normal distribution with parameters
  # mean=10 and sd=2, and compute the standard deviation and
  # interquartile range.
  # (Note: the call to set.seed simply allows you to reproduce this example.)

  set.seed(250)
  dat <- rnorm(20, mean=10, sd=2)

  sd(dat)
  #[1] 1.180226

  iqr(dat)
  #[1] 1.489932

  #----------

  # Repeat the last example, but add a couple of large "outliers" to the
  # data. Note that the estimated standard deviation is greatly affected
  # by the outliers, while the interquartile range is not.

  summaryStats(dat, quartiles = TRUE)
  #     N   Mean     SD Median    Min     Max 1st Qu. 3rd Qu.
  #dat 20 9.8612 1.1802 9.6978 7.6042 11.8756  9.1618 10.6517

  new.dat <- c(dat, 20, 50)

  sd(dat)
  #[1] 1.180226

  sd(new.dat)
  #[1] 8.79796

  iqr(dat)
  #[1] 1.489932

  iqr(new.dat)
  #[1] 1.851472

  #----------

  # Clean up
  rm(dat, new.dat)

Example output

Attaching package: 'EnvStats'

The following objects are masked from 'package:stats':

predict, predict.lm

The following object is masked from 'package:base':

print.default

[1] 1.180226
[1] 1.489932
     N   Mean     SD Median    Min     Max 1st Qu. 3rd Qu.
dat 20 9.8612 1.1802 9.6978 7.6042 11.8756  9.1618 10.6517
[1] 1.180226
[1] 8.79796
[1] 1.489932
[1] 1.851472


EnvStats documentation built on Oct. 10, 2017, 1:05 a.m.