# epois: Estimate Parameter of a Poisson Distribution

In EnvStats: Package for Environmental Statistics, Including US EPA Guidance

## Description

Estimate the mean of a Poisson distribution, and optionally construct a confidence interval for the mean.

## Usage

    epois(x, method = "mle/mme/mvue", ci = FALSE, ci.type = "two-sided",
        ci.method = "exact", conf.level = 0.95)

## Arguments

| Argument | Description |
| --- | --- |
| x | numeric vector of observations. |
| method | character string specifying the method of estimation. Currently the only possible value is "mle/mme/mvue" (maximum likelihood/method of moments/minimum variance unbiased; the default). See the Details section for more information. |
| ci | logical scalar indicating whether to compute a confidence interval for the mean. The default value is FALSE. |
| ci.type | character string indicating what kind of confidence interval to compute. The possible values are "two-sided" (the default), "lower", and "upper". This argument is ignored if ci=FALSE. |
| ci.method | character string indicating what method to use to construct the confidence interval for the mean. Possible values are "exact" (the default), "pearson.hartley.approx" (Pearson-Hartley approximation), and "normal.approx" (normal approximation). See the Details section for more information. This argument is ignored if ci=FALSE. |
| conf.level | a scalar between 0 and 1 indicating the confidence level of the confidence interval. The default value is conf.level=0.95. This argument is ignored if ci=FALSE. |

## Details

If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.

Let $\underline{x} = (x_1, x_2, \ldots, x_n)$ be a vector of $n$ observations from a Poisson distribution with parameter lambda = $\lambda$. It can be shown (e.g., Forbes et al., 2011) that if $y$ is defined as:

$$ y = \sum_{i=1}^n x_i \qquad (1) $$

then $y$ is an observation from a Poisson distribution with parameter lambda = $n\lambda$.
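One way to see this result, assuming the n observations are independent (an assumption made explicit here for the derivation), is through the probability generating function of a Poisson variable with mean λ, which is exp(λ(s - 1)). A short sketch of the argument:

```latex
G_y(s) = E\left[ s^{\sum_{i=1}^n x_i} \right]
       = \prod_{i=1}^n E\left[ s^{x_i} \right]
       = \prod_{i=1}^n e^{\lambda (s - 1)}
       = e^{n \lambda (s - 1)}
```

The final expression is the probability generating function of a Poisson distribution with mean nλ.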

### Estimation

The maximum likelihood, method of moments, and minimum variance unbiased estimator (mle/mme/mvue) of $\lambda$ is given by:

$$ \hat{\lambda} = \bar{x} \qquad (2) $$

where

$$ \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i = \frac{y}{n} \qquad (3) $$
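As a small illustration of equations (1) to (3), the sketch below uses base R only and a made-up data vector. The values and the names `x_raw`, `x`, `y`, and `lambda_hat` are hypothetical. It also mimics the removal of missing and non-finite values described at the start of this section; it is not the package's internal code.

```r
# Hypothetical counts, including values that would be dropped before estimation
x_raw <- c(2, 0, 3, NA, 1, 4, Inf, 2)

# Keep only finite observations (drops NA, NaN, Inf, and -Inf)
x <- x_raw[is.finite(x_raw)]

n <- length(x)         # number of usable observations
y <- sum(x)            # equation (1)
lambda_hat <- y / n    # equations (2) and (3); the same value as mean(x)

lambda_hat
```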

### Confidence Intervals

There are three ways to construct a confidence interval for $\lambda$: based on the exact distribution of the estimator of $\lambda$ (ci.method="exact"), based on an approximation of Pearson and Hartley (ci.method="pearson.hartley.approx"), or based on the normal approximation (ci.method="normal.approx").

#### Exact Confidence Interval (ci.method="exact")

If ci.type="two-sided", an exact $(1-\alpha)100\%$ confidence interval for $\lambda$ can be constructed as [LCL, UCL], where the confidence limits are computed such that:

$$ \Pr[Y \ge y \mid \lambda = LCL] = \frac{\alpha}{2} \qquad (4) $$

$$ \Pr[Y \le y \mid \lambda = UCL] = \frac{\alpha}{2} \qquad (5) $$

where $y$ is defined in equation (1) and $Y$ denotes a Poisson random variable with parameter lambda = $n\lambda$.

If ci.type="lower", $\alpha/2$ is replaced with $\alpha$ in equation (4) and UCL is set to $\infty$.

If ci.type="upper", $\alpha/2$ is replaced with $\alpha$ in equation (5) and LCL is set to 0.

Note that an exact upper confidence bound can be computed even when all observations are 0.
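Equations (4) and (5) can be solved numerically. The following is a minimal base-R sketch, not the package's own implementation: it takes the summary statistics from the Examples section (20 observations with mean 1.8, so y = 36) and finds each limit with `uniroot()`. The helper names `eq4` and `eq5` are hypothetical, and the lower limit assumes y > 0. The last lines illustrate the all-zero case, where equation (5) with α in place of α/2 reduces to exp(-n UCL) = α.

```r
# Summary statistics from the Examples section: 20 observations with mean 1.8
n <- 20
y <- 36          # sum of the observations, equation (1)
alpha <- 0.10    # 90% confidence level

# Equation (4): Pr[Y >= y | lambda = LCL] - alpha/2, with Y ~ Poisson(n * lambda)
eq4 <- function(lambda) {
  ppois(y - 1, lambda = n * lambda, lower.tail = FALSE) - alpha / 2
}

# Equation (5): Pr[Y <= y | lambda = UCL] - alpha/2
eq5 <- function(lambda) {
  ppois(y, lambda = n * lambda) - alpha / 2
}

lambda_hat <- y / n

# Search on each side of the point estimate (the lower limit needs y > 0)
lcl <- uniroot(eq4, interval = c(1e-8, lambda_hat), tol = 1e-10)$root
ucl <- uniroot(eq5, interval = c(lambda_hat, lambda_hat + 10), tol = 1e-10)$root
c(LCL = lcl, UCL = ucl)

# All observations equal to 0: the one-sided upper bound has a closed form
alpha_upper <- 0.05
ucl_zero <- -log(alpha_upper) / n
ppois(0, lambda = n * ucl_zero)   # equals alpha_upper by construction
```

With these inputs the two-sided limits should agree, to the printed precision, with the exact limits shown in the Examples section (LCL = 1.336558, UCL = 2.377037).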

#### Pearson-Hartley Approximation (ci.method="pearson.hartley.approx")

For a two-sided $(1-\alpha)100\%$ confidence interval for $\lambda$, the Pearson and Hartley approximation (Zar, 2010, p.587; Pearson and Hartley, 1970, p.81) is given by:

$$ \left[ \frac{\chi^2_{2n\bar{x},\, \alpha/2}}{2n},\ \frac{\chi^2_{2n\bar{x} + 2,\, 1 - \alpha/2}}{2n} \right] \qquad (6) $$

where $\chi^2_{\nu, p}$ denotes the $p$'th quantile of the chi-square distribution with $\nu$ degrees of freedom. One-sided confidence intervals are computed in a similar fashion.
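Equation (6) maps directly onto `qchisq()` from base R. The sketch below is an illustration rather than the package's code; it reuses the summary statistics from the Examples section (n = 20, sample mean 1.8) and a 90% confidence level.

```r
# Summary statistics from the Examples section
n <- 20
x_bar <- 1.8
alpha <- 0.10

# Equation (6): chi-square quantiles with 2*n*x_bar and 2*n*x_bar + 2
# degrees of freedom, each divided by 2*n
lcl <- qchisq(alpha / 2, df = 2 * n * x_bar) / (2 * n)
ucl <- qchisq(1 - alpha / 2, df = 2 * n * x_bar + 2) / (2 * n)
c(LCL = lcl, UCL = ucl)
```

For these inputs the limits should match the values reported for ci.method = "pearson" in the Examples section, which coincide with the exact limits there.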

#### Normal Approximation (ci.method="normal.approx")

An approximate $(1-\alpha)100\%$ confidence interval for $\lambda$ can be constructed by treating the estimator of $\lambda$ as approximately normally distributed. A two-sided confidence interval is constructed as:

$$ \left[ \hat{\lambda} - z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\lambda}},\ \hat{\lambda} + z_{1-\alpha/2}\, \hat{\sigma}_{\hat{\lambda}} \right] \qquad (7) $$

where $z_p$ is the $p$'th quantile of the standard normal distribution, and the quantity

$$ \hat{\sigma}_{\hat{\lambda}} = \sqrt{\hat{\lambda} / n} \qquad (8) $$

denotes the estimated asymptotic standard deviation of the estimator of $\lambda$.

One-sided confidence intervals are constructed in a similar manner.
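Equations (7) and (8) can be evaluated by hand in base R. This is a sketch under the same inputs as the Examples section (n = 20, estimated mean 1.8, 90% confidence level), not the package's internal code.

```r
# Summary statistics from the Examples section
n <- 20
lambda_hat <- 1.8
alpha <- 0.10

se <- sqrt(lambda_hat / n)    # equation (8)
z <- qnorm(1 - alpha / 2)     # standard normal quantile z_{1 - alpha/2}

# Equation (7): symmetric interval around the point estimate
lcl <- lambda_hat - z * se
ucl <- lambda_hat + z * se
c(LCL = lcl, UCL = ucl)
```

These limits should agree with the ci.method = "normal.approx" result in the Examples section (LCL = 1.306544, UCL = 2.293456). Note that nothing in equation (7) prevents the lower limit from falling below 0 when the estimated mean is small.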

## Value

A list of class "estimate" containing the estimated parameters and other information. See estimate.object for details.

## Note

The Poisson distribution is named after Poisson, who derived this distribution as the limiting distribution of the binomial distribution with parameters size=N and prob=p, where N tends to infinity, p tends to 0, and Np stays constant.

In this context, the Poisson distribution was used by Bortkiewicz (1898) to model the number of deaths (per annum) from kicks by horses in Prussian Army Corps. In this case, p, the probability of death from this cause, was small, but the number of soldiers exposed to this risk, N, was large.

The Poisson distribution has been applied in a variety of fields, including quality control (modeling number of defects produced in a process), ecology (number of organisms per unit area), and queueing theory. Gibbons (1987b) used the Poisson distribution to model the number of detected compounds per scan of the 32 volatile organic priority pollutants (VOC), and also to model the distribution of chemical concentration (in ppb).

## Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

## References

Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.

Gibbons, R.D. (1987b). Statistical Models for the Analysis of Volatile Organic Compounds in Waste Disposal Sites. Ground Water 25, 572-580.

Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.

Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 4.

Pearson, E.S., and H.O. Hartley, eds. (1970). Biometrika Tables for Statisticians, Volume 1. Cambridge University Press, New York, p.81.

Zar, J.H. (2010). Biostatistical Analysis. Fifth Edition. Prentice-Hall, Upper Saddle River, NJ, pp. 585–586.

## See Also

Poisson.

## Examples

    library(EnvStats)

    # Generate 20 observations from a Poisson distribution with parameter
    # lambda=2, then estimate the parameter and construct a 90% confidence
    # interval.
    # (Note: the call to set.seed simply allows you to reproduce this example.)

    set.seed(250)
    dat <- rpois(20, lambda = 2)
    epois(dat, ci = TRUE, conf.level = 0.9)

    #Results of Distribution Parameter Estimation
    #--------------------------------------------
    #
    #Assumed Distribution:            Poisson
    #
    #Estimated Parameter(s):          lambda = 1.8
    #
    #Estimation Method:               mle/mme/mvue
    #
    #Data:                            dat
    #
    #Sample Size:                     20
    #
    #Confidence Interval for:         lambda
    #
    #Confidence Interval Method:      exact
    #
    #Confidence Interval Type:        two-sided
    #
    #Confidence Level:                90%
    #
    #Confidence Interval:             LCL = 1.336558
    #                                 UCL = 2.377037

    #----------

    # Compare the different ways of constructing confidence intervals for
    # lambda using the same data as in the previous example:

    epois(dat, ci = TRUE, ci.method = "pearson",
        conf.level = 0.9)$interval$limits
    #     LCL      UCL
    #1.336558 2.377037

    epois(dat, ci = TRUE, ci.method = "normal.approx",
        conf.level = 0.9)$interval$limits
    #     LCL      UCL
    #1.306544 2.293456

    #----------

    # Clean up
    #---------
    rm(dat)

### Example output

    Attaching package: 'EnvStats'

    The following objects are masked from 'package:stats':

        predict, predict.lm

    The following object is masked from 'package:base':

        print.default

    Results of Distribution Parameter Estimation
    --------------------------------------------

    Assumed Distribution:            Poisson

    Estimated Parameter(s):          lambda = 1.8

    Estimation Method:               mle/mme/mvue

    Data:                            dat

    Sample Size:                     20

    Confidence Interval for:         lambda

    Confidence Interval Method:      exact

    Confidence Interval Type:        two-sided

    Confidence Level:                90%

    Confidence Interval:             LCL = 1.336558
                                     UCL = 2.377037

         LCL      UCL
    1.336558 2.377037
         LCL      UCL
    1.306544 2.293456


EnvStats documentation built on Oct. 23, 2020, 6:41 p.m.