epdfPlot: Plot Empirical Probability Density Function
In EnvStats: Package for Environmental Statistics, Including US EPA Guidance

epdfPlot

R Documentation

Description

Produces an empirical probability density function plot.

Usage

  epdfPlot(x, discrete = FALSE, density.arg.list = NULL, plot.it = TRUE, 
    add = FALSE, epdf.col = "black", epdf.lwd = 3 * par("cex"), epdf.lty = 1, 
    curve.fill = FALSE, curve.fill.col = "cyan", ..., 
    type = ifelse(discrete, "h", "l"), main = NULL, xlab = NULL, ylab = NULL, 
    xlim = NULL, ylim = NULL)

Arguments

`x`	numeric vector of observations. Missing (`NA`), undefined (`NaN`), and infinite (`Inf`, `-Inf`) values are allowed but will be removed.
`discrete`	logical scalar indicating whether the assumed parent distribution of `x` is discrete (`discrete=TRUE`) or continuous (`discrete=FALSE`; the default).
`density.arg.list`	list with arguments to the `density` function. The default value is `density.arg.list=NULL`. This argument is ignored if `discrete=TRUE`.
`plot.it`	logical scalar indicating whether to draw on the current graphics device, either as a new plot or as an addition to the current plot (see `add`). The default value is `plot.it=TRUE`.
`add`	logical scalar indicating whether to add the empirical pdf to the current plot (`add=TRUE`) or generate a new plot (`add=FALSE`; the default). This argument is ignored if `plot.it=FALSE`.
`epdf.col`	a numeric scalar or character string determining the color of the empirical pdf line or points. The default value is `epdf.col="black"`. See the entry for `col` in the help file for `par` for more information.
`epdf.lwd`	a numeric scalar determining the width of the empirical pdf line. The default value is `epdf.lwd=3*par("cex")`. See the entry for `lwd` in the help file for `par` for more information.
`epdf.lty`	a numeric scalar determining the line type of the empirical pdf line. The default value is `epdf.lty=1`. See the entry for `lty` in the help file for `par` for more information.
`curve.fill`	a logical scalar indicating whether to fill in the area below the empirical pdf curve with the color specified by `curve.fill.col`. The default value is `curve.fill=FALSE`.
`curve.fill.col`	a numeric scalar or character string indicating what color to use to fill in the area below the empirical pdf curve. The default value is `curve.fill.col="cyan"`. This argument is ignored if `curve.fill=FALSE`.
`type`, `main`, `xlab`, `ylab`, `xlim`, `ylim`, `...`	additional graphical parameters (see `lines` and `par`). In particular, the argument `type` specifies the kind of plot to draw. By default, `epdfPlot` draws histogram-like vertical lines (`type="h"`) when `discrete=TRUE` and connects successive points with straight line segments (`type="l"`) when `discrete=FALSE`. The user may override these defaults by supplying the graphics parameter `type` (`type="h"` for histogram-like vertical lines, `type="l"` for lines, `type="p"` for points only, etc.).
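
As an illustration of how several of these arguments combine, the following sketch passes options through to `density` via `density.arg.list` and shades the area under the curve. The simulated data, the seed, the colors, and the particular `density` options (`bw = "SJ"`, `n = 256`) are assumptions chosen for the example, not recommendations from the package.

```r
library(EnvStats)   # assumes the EnvStats package is installed

set.seed(47)        # arbitrary seed, for reproducibility only
x <- rnorm(100)     # simulated continuous data (illustrative)

# density.arg.list is handed to density(); bw and n are density() arguments.
# curve.fill = TRUE shades the area under the curve with curve.fill.col.
res <- epdfPlot(x,
  density.arg.list = list(bw = "SJ", n = 256),
  curve.fill = TRUE, curve.fill.col = "lightblue",
  epdf.col = "blue", epdf.lwd = 2,
  xlab = "x", main = "Empirical PDF with Sheather-Jones bandwidth")
```

Because `discrete` is left at its default of `FALSE`, `density.arg.list` is used rather than ignored.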

Details

When a distribution is discrete and can only take on a finite number of values, the empirical pdf plot is the same as the standard relative frequency histogram; that is, each bar of the histogram represents the proportion of the sample equal to that particular value (or category). When a distribution is continuous, the function `epdfPlot` calls the R function `density` to compute the estimated probability density at a number of evenly spaced points between the minimum and maximum values.
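
The two cases can be sketched with base R alone. This is an illustration of the computations described above, not the source code of `epdfPlot`; the variable names (`x.disc`, `rel.freq`, `dens`, and so on) are made up for the example, and `density` is called with its default arguments.

```r
set.seed(875)

# Discrete case: relative frequency of each distinct observed value.
x.disc <- rpois(20, lambda = 10)
x.disc <- x.disc[is.finite(x.disc)]          # drop NA, NaN, Inf, -Inf
rel.freq <- table(x.disc) / length(x.disc)   # proportions of the sample
disc.x <- as.numeric(names(rel.freq))        # distinct values, ascending
disc.f <- as.numeric(rel.freq)               # matching proportions

# Continuous case: kernel density estimate on an evenly spaced grid.
x.cont <- rnorm(50)
dens <- density(x.cont)                      # default kernel and bandwidth
cont.x <- dens$x                             # grid points
cont.f <- dens$y                             # estimated density at each point

# Draw each with the line type epdfPlot uses by default for that case.
plot(disc.x, disc.f, type = "h", xlab = "x", ylab = "Relative Frequency")
plot(cont.x, cont.f, type = "l", xlab = "x", ylab = "Estimated Density")
```

In the discrete sketch the heights are proportions and sum to 1; in the continuous sketch the heights are density values, so it is the area under the curve, not the sum of the heights, that is approximately 1.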

Value

epdfPlot invisibly returns a list with the following components:

`x`	numeric vector of ordered quantiles.
`f.x`	numeric vector of the associated estimated values of the pdf.
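
Because the list is returned invisibly, it has to be assigned to be inspected. A minimal sketch, assuming the EnvStats package is installed; the object name `epdf.list` and the simulated data are illustrative only.

```r
library(EnvStats)   # assumes the EnvStats package is installed

set.seed(875)
x <- rpois(20, lambda = 10)

# plot.it = FALSE skips the plot; the list is still returned.
epdf.list <- epdfPlot(x, discrete = TRUE, plot.it = FALSE)

names(epdf.list)    # component names of the returned list
str(epdf.list)      # structure of the x and f.x components
```

Setting `plot.it = FALSE` is convenient when only the estimated values are needed, for example to draw them later with a different plotting function.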

Note

An empirical probability density function (epdf) plot is a graphical tool that can be used in conjunction with other graphical tools such as histograms and boxplots to assess the characteristics of a set of data.

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

Chambers, J.M., W.S. Cleveland, B. Kleiner, and P.A. Tukey. (1983). Graphical Methods for Data Analysis. Duxbury Press, Boston, MA.

See the REFERENCES section in the help file for density.

Examples

  # Using Reference Area TcCB data in EPA.94b.tccb.df, 
  # create a histogram of the log-transformed observations, 
  # then superimpose the empirical pdf plot.

  dev.new()
  log.TcCB <- with(EPA.94b.tccb.df, log(TcCB[Area == "Reference"]))

  hist(log.TcCB, freq = FALSE, xlim = c(-2, 1),
    col = "cyan", xlab = "log [ TcCB (ppb) ]",
    ylab = "Relative Frequency", 
    main = "Reference Area TcCB with Empirical PDF")

  epdfPlot(log.TcCB, add = TRUE)

  #==========

  # Generate 20 observations from a Poisson distribution with 
  # parameter lambda = 10, and plot the empirical PDF.

  set.seed(875)
  x <- rpois(20, lambda = 10)
  dev.new()
  epdfPlot(x, discrete = TRUE)

  #==========

  # Clean up
  #---------
  rm(log.TcCB, x)
  graphics.off()

EnvStats documentation built on June 8, 2025, 11:37 a.m.