calibrate: Fit a Calibration Line or Curve In EnvStats: Package for Environmental Statistics, Including US EPA Guidance

 calibrate R Documentation

Fit a Calibration Line or Curve

Description

Fit a calibration line or curve based on linear regression.

Usage

calibrate(formula, data, test.higher.orders = TRUE, max.order = 4, p.crit = 0.05,
F.test = "partial", weights, subset, na.action, method = "qr", model = FALSE,
x = FALSE, y = FALSE, contrasts = NULL, warn = TRUE, ...)

Details

A simple and frequently used calibration model is a straight line where the response variable S denotes the signal of the machine and the predictor variable C denotes the true concentration in the physical sample. The error term is assumed to follow a normal distribution with mean 0. Note that the average value of the signal for a blank (C = 0) is the intercept. Other possible calibration models include higher order polynomial models such as a quadratic or cubic model.

In a typical setup, a small number of samples (e.g., n = 6) with known concentrations are measured and the signal is recorded. A sample with no chemical in it, called a blank, is also measured. (You have to be careful to define exactly what you mean by a “blank.” A blank could mean a container from the lab that has nothing in it but is prepared in a similar fashion to containers with actual samples in them. Or it could mean a field blank: the container was taken out to the field and subjected to the same process that all other containers were subjected to, except a physical sample of soil or water was not placed in the container.) Usually, replicate measures at the same known concentrations are taken. (The term “replicate” must be well defined to distinguish between for example the same physical samples that are measured more than once vs. two different physical samples of the same known concentration.)

The function calibrate initially fits a linear calibration model. If the argument max.order is greater than 1, calibrate then performs forward stepwise linear regression to determine the “best” polynomial model.

In the case where replicates are not availble, calibrate uses standard stepwise ANOVA to compare models (Draper and Smith, 1998, p.335). In this case, if the p-value for the partial F-test to compare models is greater than or equal to p.crit, then the model with fewer terms is used as the final model.

In the case where replicates are available, if F.test="lof", then for each model calibrate computes the p-value of the ANOVA for lack-of-fit vs. pure error (Draper and Smith, 1998, Chapters 2; see anovaPE). If the p-value is greater than or equal to p.crit, then this is the final model; otherwise the next higher-order term is added to the polynomial and the model is re-fit. If, during the stepwise procedure, the degrees of freedom associated with the residual sums of squares of a model to be tested is less than or equal to the number of observations minus the number of unique observations, calibrate uses the partial F-test instead of the lack-of-fit F-test.

The stepwise algorithm terminates when either the p-value is greater than or equal to p.crit, or the currently selected model in the algorithm is of order max.order. The algorithm will terminate earlier than this if the next model to be fit includes singularities so that not all coefficients can be estimted.

Value

An object of class "calibrate" that inherits from class "lm" and includes a component called x that stores the model matrix (the values of the predictor variables for the final calibration model).

Note

Almost always the process of determining the concentration of a chemical in a soil, water, or air sample involves using some kind of machine that produces a signal, and this signal is related to the concentration of the chemical in the physical sample. The process of relating the machine signal to the concentration of the chemical is called calibration. Once calibration has been performed, estimated concentrations in physical samples with unknown concentrations are computed using inverse regression (see inversePredictCalibrate). The uncertainty in the process used to estimate the concentration may be quantified with decision, detection, and quantitation limits.

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

Draper, N., and H. Smith. (1998). Applied Regression Analysis. Third Edition. John Wiley and Sons, New York, Chapter 3 and p.335.

Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring. Second Edition. John Wiley & Sons, Hoboken. Chapter 6, p. 111.

Helsel, D.R. (2012). Statistics for Censored Environmental Data Using Minitab and R, Second Edition. John Wiley & Sons, Hoboken, New Jersey. Chapter 3, p. 22.

Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with S-PLUS. CRC Press, Boca Raton, FL, pp.562-575.

calibrate.object, anovaPE, inversePredictCalibrate, detectionLimitCalibrate, lm.

Examples

# The data frame EPA.97.cadmium.111.df contains calibration data for
# cadmium at mass 111 (ng/L) that appeared in Gibbons et al. (1997b)
# and were provided to them by the U.S. EPA.
# Display a plot of these data along with the fitted calibration line
# and 99% non-simultaneous prediction limits.  See
# Millard and Neerchal (2001, pp.566-569) for more details on this
# example.

newdata <- data.frame(Spike = seq(min(Spike), max(Spike), len = 100))

pred.list <- predict(calibrate.list, newdata = newdata, se.fit = TRUE)

pointwise.list <- pointwise(pred.list, coverage = 0.99, individual = TRUE)

dev.new()
max(pointwise.list\$upper)), xlab = "True Concentration (ng/L)",
ylab = "Observed Concentration (ng/L)")

abline(calibrate.list, lwd = 2)

lines(newdata\$Spike, pointwise.list\$lower, lty = 8, lwd = 2)

lines(newdata\$Spike, pointwise.list\$upper, lty = 8, lwd = 2)

title(paste("Calibration Line and 99% Prediction Limits",
"for US EPA Cadmium 111 Data", sep = "\n"))

#----------

# Clean up
#---------
rm(Cadmium, Spike, newdata, calibrate.list, pred.list, pointwise.list)
graphics.off()

EnvStats documentation built on Aug. 22, 2023, 5:09 p.m.