In EnvStats: Package for Environmental Statistics, Including US EPA Guidance

plotPredIntNormDesign

R Documentation

Plots for a Sampling Design Based on a Prediction Interval for the Next `k` Observations from a Normal Distribution

Description

Create plots involving sample size, number of future observations, half-width, estimated standard deviation, and confidence level for a prediction interval for the next k observations from a normal distribution.

Usage

  plotPredIntNormDesign(x.var = "n", y.var = "half.width", range.x.var = NULL, 
    n = 25, k = 1, n.mean = 1, half.width = 4 * sigma.hat, sigma.hat = 1, 
    method = "Bonferroni", conf.level = 0.95, round.up = FALSE, n.max = 5000, 
    tol = 1e-07, maxiter = 1000, plot.it = TRUE, add = FALSE, n.points = 100, 
    plot.col = "black", plot.lwd = 3 * par("cex"), plot.lty = 1, 
    digits = .Options$digits, cex.main = par("cex"), ..., main = NULL, 
    xlab = NULL, ylab = NULL, type = "l")

Arguments

`x.var`	character string indicating what variable to use for the x-axis. Possible values are `"n"` (sample size; the default), `"half.width"` (the half-width of the prediction interval), `"k"` (number of future observations or averages), `"sigma.hat"` (the estimated standard deviation), and `"conf.level"` (the confidence level).
`y.var`	character string indicating what variable to use for the y-axis. Possible values are `"half.width"` (the half-width of the prediction interval; the default) and `"n"` (sample size).
`range.x.var`	numeric vector of length 2 indicating the range of the x-variable to use for the plot. The default value depends on the value of `x.var`. When `x.var="n"` the default value is `c(2,50)`. When `x.var="half.width"` the default value is `c(2.5 * sigma.hat, 4 * sigma.hat)`. When `x.var="k"` the default value is `c(1, 20)`. When `x.var="sigma.hat"`, the default value is `c(0.1, 2)`. When `x.var="conf.level"`, the default value is `c(0.5, 0.99)`.
`n`	positive integer greater than 1 indicating the sample size upon which the prediction interval is based. The default value is `n=25`. Missing (`NA`), undefined (`NaN`), and infinite (`Inf`, `-Inf`) values are not allowed.
`k`	positive integer specifying the number of future observations or averages the prediction interval should contain with confidence level `conf.level`. The default value is `k=1`. This argument is ignored if `x.var="k"`.
`n.mean`	positive integer specifying the sample size associated with the `k` future averages. The default value is `n.mean=1` (i.e., individual observations). Note that all future averages must be based on the same sample size.
`half.width`	positive scalar indicating the half-width of the prediction interval. The default value is `half.width=4*sigma.hat`. This argument is ignored if either `x.var="half.width"` or `y.var="half.width"`.
`sigma.hat`	numeric scalar specifying the value of the estimated standard deviation. The default value is `sigma.hat=1`. This argument is ignored if `x.var="sigma.hat"`.
`method`	character string specifying the method to use if the number of future observations (`k`) is greater than 1. The possible values are `method="Bonferroni"` (approximate method based on the Bonferroni inequality; the default) and `method="exact"` (exact method due to Dunnett, 1955). This argument is ignored if `k=1`.
`conf.level`	numeric scalar between 0 and 1 indicating the confidence level of the prediction interval. The default value is `conf.level=0.95`.
`round.up`	for the case when `y.var="n"`, logical scalar indicating whether to round the computed sample sizes up to the next largest integer. The default value is `round.up=FALSE`.
`n.max`	for the case when `y.var="n"`, the maximum possible sample size. The default value is `n.max=5000`.
`tol`	numeric scalar indicating the tolerance to use in the `uniroot` search algorithm. The default value is `tol=1e-7`.
`maxiter`	positive integer indicating the maximum number of iterations to use in the `uniroot` search algorithm. The default value is `maxiter=1000`.
`plot.it`	a logical scalar indicating whether to create a plot or add to the existing plot (see the explanation of the argument `add` below) on the current graphics device. If `plot.it=FALSE`, no plot is produced, but a list of (x,y) values is returned (see the Value section). The default value is `plot.it=TRUE`.
`add`	a logical scalar indicating whether to add the design plot to the existing plot (`add=TRUE`), or to create a plot from scratch (`add=FALSE`). The default value is `add=FALSE`. This argument is ignored if `plot.it=FALSE`.
`n.points`	a numeric scalar specifying how many (x,y) pairs to use to produce the plot. There are `n.points` x-values evenly spaced between `range.x.var[1]` and `range.x.var[2]`. The default value is `n.points=100`.
`plot.col`	a numeric scalar or character string determining the color of the plotted line or points. The default value is `plot.col="black"`. See the entry for `col` in the help file for `par` for more information.
`plot.lwd`	a numeric scalar determining the width of the plotted line. The default value is `3*par("cex")`. See the entry for `lwd` in the help file for `par` for more information.
`plot.lty`	a numeric scalar determining the line type of the plotted line. The default value is `plot.lty=1`. See the entry for `lty` in the help file for `par` for more information.
`digits`	a scalar indicating how many significant digits to print out on the plot. The default value is the current setting of `options("digits")`.
`cex.main`, `main`, `xlab`, `ylab`, `type`, `...`	additional graphical parameters (see `par`).
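
As a rough illustration of how these arguments combine for the default axes (`x.var="n"`, `y.var="half.width"`), the base-R sketch below builds the (x,y) pairs by hand. It assumes the two-sided Bonferroni half-width, that is, the `1 - (alpha/k)/2` quantile of Student's t with `n - 1` degrees of freedom multiplied by `sigma.hat * sqrt(1/n.mean + 1/n)`. The helper `half_width_bonferroni` is a hypothetical name, not part of EnvStats, and the package's own computation may differ in detail.

```r
# Hypothetical helper (not part of EnvStats): half-width of a two-sided
# prediction interval for the next k observations (or averages of size
# n.mean), using the Bonferroni adjustment.
half_width_bonferroni <- function(n, k = 1, n.mean = 1, sigma.hat = 1,
                                  conf.level = 0.95) {
  alpha <- 1 - conf.level
  t.crit <- qt(1 - (alpha / k) / 2, df = n - 1)
  t.crit * sigma.hat * sqrt(1 / n.mean + 1 / n)
}

# Mimic the documented defaults: n.points evenly spaced x-values
# between range.x.var[1] and range.x.var[2]
range.x.var <- c(2, 50)
n.points <- 100
x <- seq(range.x.var[1], range.x.var[2], length.out = n.points)
y <- half_width_bonferroni(x)

plot(x, y, type = "l", xlab = "Sample Size (n)", ylab = "Half-Width")
```

The curve falls steeply for small `n` and then flattens, which is why sample size is a natural choice for the x-axis when exploring a design.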

Details

See the help files for predIntNorm, predIntNormK, predIntNormHalfWidth, and predIntNormN for information on how to compute a prediction interval for the next k observations or averages from a normal distribution, how the half-width is computed when other quantities are fixed, and how the sample size is computed when other quantities are fixed.
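
When `y.var="n"`, the sample size has to be found numerically, because the half-width depends on `n` through both the t quantile and the square-root term. The sketch below shows one way this could be done in base R with `uniroot`, again assuming the two-sided Bonferroni half-width. The names `hw` and `n_for_half_width` are hypothetical, and this is not necessarily how predIntNormN is implemented.

```r
# Hypothetical helpers (not part of EnvStats)
hw <- function(n, k, n.mean, sigma.hat, conf.level) {
  qt(1 - ((1 - conf.level) / k) / 2, df = n - 1) *
    sigma.hat * sqrt(1 / n.mean + 1 / n)
}

n_for_half_width <- function(half.width, k = 1, n.mean = 1, sigma.hat = 1,
                             conf.level = 0.95, round.up = TRUE,
                             n.max = 5000, tol = 1e-7, maxiter = 1000) {
  # Difference between the achieved and the target half-width
  f <- function(n) hw(n, k, n.mean, sigma.hat, conf.level) - half.width

  # The half-width decreases as n grows, so if n.max is not enough
  # the target cannot be reached within the allowed range.
  if (f(n.max) > 0) stop("half.width not achievable with n <= n.max")
  if (f(2) <= 0) return(2)

  n <- uniroot(f, interval = c(2, n.max), tol = tol, maxiter = maxiter)$root
  if (round.up) ceiling(n) else n
}

n.req <- n_for_half_width(half.width = 4, sigma.hat = 1.5)
n.req
```

Under this formula with `n.mean=1`, the half-width cannot fall below the corresponding normal quantile times `sigma.hat`, however large `n` becomes, so sufficiently small target half-widths are unreachable for any sample size.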

Value

plotPredIntNormDesign invisibly returns a list with components:

`x.var`	x-coordinates of points that have been or would have been plotted.
`y.var`	y-coordinates of points that have been or would have been plotted.
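
The returned list can be used to build a custom plot or table. The following sketch assumes the EnvStats package is installed; it requests the coordinates without drawing anything (`plot.it=FALSE`) and then plots them with base graphics. The axis labels are illustrative choices, not the ones the function itself would produce.

```r
# Assumes the EnvStats package is installed
if (requireNamespace("EnvStats", quietly = TRUE)) {
  design <- EnvStats::plotPredIntNormDesign(
    x.var = "n", y.var = "half.width",
    range.x.var = c(4, 50), k = 4, conf.level = 0.9,
    plot.it = FALSE
  )

  # Inspect the returned components
  str(design)

  # Reuse the returned coordinates in a custom base-graphics plot
  plot(design$x.var, design$y.var, type = "l",
       xlab = "Sample Size (n)", ylab = "Half-Width")
}
```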

Note

See the help file for predIntNorm.

In the course of designing a sampling program, an environmental scientist may wish to determine the relationship between sample size, confidence level, and half-width if one of the objectives of the sampling program is to produce prediction intervals. The functions predIntNormHalfWidth, predIntNormN, and plotPredIntNormDesign can be used to investigate these relationships for the case of normally distributed observations.
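
For single values rather than a whole curve, the two companion functions can be called directly. The sketch below assumes EnvStats is installed and that both functions accept the named arguments shown (`n`, `half.width`, `k`, `sigma.hat`, `conf.level`); check their help files for the full argument lists and defaults.

```r
# Assumes the EnvStats package is installed
if (requireNamespace("EnvStats", quietly = TRUE)) {
  # Half-widths for three candidate sample sizes, k = 4 future observations
  half.widths <- EnvStats::predIntNormHalfWidth(
    n = c(10, 25, 50), k = 4, sigma.hat = 1, conf.level = 0.9
  )
  print(half.widths)

  # Sample size needed to achieve a half-width of 3 under the same settings
  n.needed <- EnvStats::predIntNormN(
    half.width = 3, k = 4, sigma.hat = 1, conf.level = 0.9
  )
  print(n.needed)
}
```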

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

See the help file for predIntNorm.

Examples

  # Look at the relationship between half-width and sample size for a 
  # prediction interval for k=1 future observation, assuming an estimated 
  # standard deviation of 1 and a confidence level of 95%:

  dev.new()
  plotPredIntNormDesign()

  #==========

  # Plot sample size vs. the estimated standard deviation for various levels 
  # of confidence, using a half-width of 4:

  dev.new()
  plotPredIntNormDesign(x.var = "sigma.hat", y.var = "n", range.x.var = c(1, 2), 
    ylim = c(0, 90), main = "") 

  plotPredIntNormDesign(x.var = "sigma.hat", y.var = "n", range.x.var = c(1, 2), 
    conf.level = 0.9, add = TRUE, plot.col = "red") 

  plotPredIntNormDesign(x.var = "sigma.hat", y.var = "n", range.x.var = c(1, 2), 
    conf.level = 0.8, add = TRUE, plot.col = "blue") 

  legend("topleft", c("95%", "90%", "80%"), lty = 1, lwd = 3 * par("cex"), 
    col = c("black", "red", "blue"), bty = "n") 

  title(main = paste("Sample Size vs. Sigma Hat for Prediction Interval for", 
    "k=1 Future Obs, Half-Width=4, and Various Confidence Levels", 
    sep = "\n"))

  #==========

  # The data frame EPA.92c.arsenic3.df contains arsenic concentrations (ppb) 
  # collected quarterly for 3 years at a background well and quarterly for 
  # 2 years at a compliance well.  Using the data from the background well, 
  # plot the relationship between half-width and sample size for a two-sided 
  # 90% prediction interval for k=4 future observations.

  EPA.92c.arsenic3.df
  #   Arsenic Year  Well.type
  #1     12.6    1 Background
  #2     30.8    1 Background
  #3     52.0    1 Background
  #...
  #18     3.8    5 Compliance
  #19     2.6    5 Compliance
  #20    51.9    5 Compliance

  mu.hat <- with(EPA.92c.arsenic3.df, 
    mean(Arsenic[Well.type=="Background"])) 

  mu.hat 
  #[1] 27.51667 

  sigma.hat <- with(EPA.92c.arsenic3.df, 
    sd(Arsenic[Well.type=="Background"]))

  sigma.hat 
  #[1] 17.10119 

  dev.new()
  plotPredIntNormDesign(x.var = "n", y.var = "half.width", range.x.var = c(4, 50), 
    k = 4, sigma.hat = sigma.hat, conf.level = 0.9) 

  #==========

  # Clean up
  #---------
  rm(mu.hat, sigma.hat)
  graphics.off()

EnvStats documentation built on June 8, 2025, 11:37 a.m.