# eqgamma: Estimate Quantiles of a Gamma Distribution In EnvStats: Package for Environmental Statistics, Including US EPA Guidance

## Description

Estimate quantiles of a gamma distribution, and optionally construct a confidence interval for a quantile.

## Usage

 ```1 2 3 4 5 6 7``` ``` eqgamma(x, p = 0.5, method = "mle", ci = FALSE, ci.type = "two-sided", conf.level = 0.95, normal.approx.transform = "kulkarni.powar", digits = 0) eqgammaAlt(x, p = 0.5, method = "mle", ci = FALSE, ci.type = "two-sided", conf.level = 0.95, normal.approx.transform = "kulkarni.powar", digits = 0) ```

## Arguments

 `x` a numeric vector of observations, or an object resulting from a call to an estimating function that assumes a gamma distribution (e.g., `egamma` or `egammaAlt`). If `ci=TRUE` then `x` must be a numeric vector of observations. If `x` is a numeric vector, missing (`NA`), undefined (`NaN`), and infinite (`Inf`, `-Inf`) values are allowed but will be removed. `p` numeric vector of probabilities for which quantiles will be estimated. All values of `p` must be between 0 and 1. When `ci=TRUE`, `p` must be a scalar. The default value is `p=0.5`. `method` character string specifying the method to use to estimate the shape and scale parameters of the distribution. The possible values are `"mle"` (maximum likelihood; the default), `"bcmle"` (bias-corrected mle), `"mme"` (method of moments), and `"mmue"` (method of moments based on the unbiased estimator of variance). See the DETAILS section of the help file for `egamma` for more information. `ci` logical scalar indicating whether to compute a confidence interval for the quantile. The default value is `ci=FALSE`. `ci.type` character string indicating what kind of confidence interval for the quantile to compute. The possible values are `"two-sided"` (the default), `"lower"`, and `"upper"`. This argument is ignored if `ci=FALSE`. `conf.level` a scalar between 0 and 1 indicating the confidence level of the confidence interval. The default value is `conf.level=0.95`. This argument is ignored if `ci=FALSE`. `normal.approx.transform` character string indicating which power transformation to use. Possible values are `"kulkarni.powar"` (the default), `"cube.root"`, and `"fourth.root"`. See the DETAILS section for more informaiton. This argument is ignored if `ci=FALSE`. `digits` an integer indicating the number of decimal places to round to when printing out the value of `100*p`. The default value is `digits=0`.

## Details

The function `eqgamma` returns estimated quantiles as well as estimates of the shape and scale parameters. The function `eqgammaAlt` returns estimated quantiles as well as estimates of the mean and coefficient of variation.

Quantiles are estimated by 1) estimating the shape and scale parameters by calling `egamma`, and then 2) calling the function `qgamma` and using the estimated values for shape and scale.

The confidence interval for a quantile is computed by:

1. using a power transformation on the original data to induce approximate normality,

2. using `eqnorm` to compute the confidence interval, and then

3. back-transforming the interval to create a confidence interval on the original scale.

This is similar to what is done to create tolerance intervals for a gamma distribuiton (Krishnamoorthy et al., 2008), and there is a one-to-one relationship between confidence intervals for a quantile and tolerance intervals (see the DETAILS section of the help file for `eqnorm`). The value `normal.approx.transform="cube.root"` uses the cube root transformation suggested by Wilson and Hilferty (1931) and used by Krishnamoorthy et al. (2008) and Singh et al. (2010b), and the value `normal.approx.transform="fourth.root"` uses the fourth root transformation suggested by Hawkins and Wixley (1986) and used by Singh et al. (2010b). The default value `normal.approx.transform="kulkarni.powar"` uses the “Optimum Power Normal Approximation Method” of Kulkarni and Powar (2010). The “optimum” power r is determined by:

 r = -0.0705 - 0.178 \, shape + 0.475 \, √{shape} if shape ≤ 1.5 r = 0.246 if shape > 1.5

where shape denotes the estimate of the shape parameter. Although Kulkarni and Powar (2010) use the maximum likelihood estimate of shape to determine the power r, for the functions `eqgamma` and `eqgammaAlt` the power r is based on whatever estimate of shape is used
(e.g., `method="mle"`, `method="bcmle"`, etc.).

## Value

If `x` is a numeric vector, `eqgamma` and `eqgammaAlt` return a list of class `"estimate"` containing the estimated quantile(s) and other information. See `estimate.object` for details.

If `x` is the result of calling an estimation function, `eqgamma` and `eqgammaAlt` return a list whose class is the same as `x`. The list contains the same components as `x`, as well as components called `quantiles` and `quantile.method`. In addition, if `ci=TRUE`, the returned list contains a component called `interval` containing the confidence interval information. If `x` already has a component called `interval`, this component is replaced with the confidence interval information.

## Note

The gamma distribution takes values on the positive real line. Special cases of the gamma are the exponential distribution and the chi-square distributions. Applications of the gamma include life testing, statistical ecology, queuing theory, inventory control, and precipitation processes. A gamma distribution starts to resemble a normal distribution as the shape parameter a tends to infinity.

Some EPA guidance documents (e.g., Singh et al., 2002; Singh et al., 2010a,b) strongly recommend against using a lognormal model for environmental data and recommend trying a gamma distribuiton instead.

Percentiles are sometimes used in environmental standards and regulations. For example, Berthouex and Brown (2002, p.71) note that England has water quality limits based on the 90th and 95th percentiles of monitoring data not exceeding specified levels. They also note that the U.S. EPA has specifications for air quality monitoring, aquatic standards on toxic chemicals, and maximum daily limits for industrial effluents that are all based on percentiles. Given the importance of these quantities, it is essential to characterize the amount of uncertainty associated with the estimates of these quantities. This is done with confidence intervals.

## Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

## References

Berthouex, P.M., and L.C. Brown. (2002). Statistics for Environmental Engineers. Lewis Publishers, Boca Raton.

Conover, W.J. (1980). Practical Nonparametric Statistics. Second Edition. John Wiley and Sons, New York.

Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.

Gibbons, R.D., D.K. Bhaumik, and S. Aryal. (2009). Statistical Methods for Groundwater Monitoring, Second Edition. John Wiley & Sons, Hoboken.

Hawkins, D. M., and R.A.J. Wixley. (1986). A Note on the Transformation of Chi-Squared Variables to Normality. The American Statistician, 40, 296–298.

Johnson, N.L., S. Kotz, and N. Balakrishnan. (1994). Continuous Univariate Distributions, Volume 1. Second Edition. John Wiley and Sons, New York, Chapter 17.

Krishnamoorthy K., T. Mathew, and S. Mukherjee. (2008). Normal-Based Methods for a Gamma Distribution: Prediction and Tolerance Intervals and Stress-Strength Reliability. Technometrics, 50(1), 69–78.

Krishnamoorthy K., and T. Mathew. (2009). Statistical Tolerance Regions: Theory, Applications, and Computation. John Wiley and Sons, Hoboken.

Kulkarni, H.V., and S.K. Powar. (2010). A New Method for Interval Estimation of the Mean of the Gamma Distribution. Lifetime Data Analysis, 16, 431–447.

Singh, A., A.K. Singh, and R.J. Iaci. (2002). Estimation of the Exposure Point Concentration Term Using a Gamma Distribution. EPA/600/R-02/084. October 2002. Technology Support Center for Monitoring and Site Characterization, Office of Research and Development, Office of Solid Waste and Emergency Response, U.S. Environmental Protection Agency, Washington, D.C.

Singh, A., R. Maichle, and N. Armbya. (2010a). ProUCL Version 4.1.00 User Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

Singh, A., N. Armbya, and A. Singh. (2010b). ProUCL Version 4.1.00 Technical Guide (Draft). EPA/600/R-07/041, May 2010. Office of Research and Development, U.S. Environmental Protection Agency, Washington, D.C.

Wilson, E.B., and M.M. Hilferty. (1931). The Distribution of Chi-Squares. Proceedings of the National Academy of Sciences, 17, 684–688.

USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R-09-007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.

USEPA. (2010). Errata Sheet - March 2009 Unified Guidance. EPA 530/R-09-007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.

## See Also

`egamma`, `GammaDist`, `estimate.object`, `eqnorm`, `tolIntGamma`.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150``` ``` # Generate 20 observations from a gamma distribution with parameters # shape=3 and scale=2, then estimate the 90th percentile and create # a one-sided upper 95% confidence interval for that percentile. # (Note: the call to set.seed simply allows you to reproduce this # example.) set.seed(250) dat <- rgamma(20, shape = 3, scale = 2) eqgamma(dat, p = 0.9, ci = TRUE, ci.type = "upper") #Results of Distribution Parameter Estimation #-------------------------------------------- # #Assumed Distribution: Gamma # #Estimated Parameter(s): shape = 2.203862 # scale = 2.174928 # #Estimation Method: mle # #Estimated Quantile(s): 90'th %ile = 9.113446 # #Quantile Estimation Method: Quantile(s) Based on # mle Estimators # #Data: dat # #Sample Size: 20 # #Confidence Interval for: 90'th %ile # #Confidence Interval Method: Exact using # Kulkarni & Powar (2010) # transformation to Normality # based on mle of 'shape' # #Confidence Interval Type: upper # #Confidence Level: 95% # #Confidence Interval: LCL = 0.00000 # UCL = 13.79733 #---------- # Compare these results with the true 90'th percentile: qgamma(p = 0.9, shape = 3, scale = 2) # 10.64464 #---------- # Using the same data as in the previous example, use egammaAlt # to estimate the mean and cv based on the bias-corrected # estimate of shape, and use the cube-root transformation to # normality. eqgammaAlt(dat, p = 0.9, method = "bcmle", ci = TRUE, ci.type = "upper", normal.approx.transform = "cube.root") #Results of Distribution Parameter Estimation #-------------------------------------------- # #Assumed Distribution: Gamma # #Estimated Parameter(s): mean = 4.7932408 # cv = 0.7242165 # #Estimation Method: bcmle of 'shape' # #Estimated Quantile(s): 90'th %ile = 9.428 # #Quantile Estimation Method: Quantile(s) Based on # bcmle of 'shape' # #Data: dat # #Sample Size: 20 # #Confidence Interval for: 90'th %ile # #Confidence Interval Method: Exact using # Wilson & Hilferty (1931) cube-root # transformation to Normality # #Confidence Interval Type: upper # #Confidence Level: 95% # #Confidence Interval: LCL = 0.00000 # UCL = 12.89643 #---------- # Clean up rm(dat) #-------------------------------------------------------------------- # Example 17-3 of USEPA (2009, p. 17-17) shows how to construct a # beta-content upper tolerance limit with 95% coverage and # 95% confidence using chrysene data and assuming a lognormal # distribution. Here we will use the same chrysene data but assume a # gamma distribution. # A beta-content upper tolerance limit with 95% coverage and # 95% confidence is equivalent to the 95% upper confidence limit for # the 95th percentile. attach(EPA.09.Ex.17.3.chrysene.df) Chrysene <- Chrysene.ppb[Well.type == "Background"] eqgamma(Chrysene, p = 0.95, ci = TRUE, ci.type = "upper") #Results of Distribution Parameter Estimation #-------------------------------------------- # #Assumed Distribution: Gamma # #Estimated Parameter(s): shape = 2.806929 # scale = 5.286026 # #Estimation Method: mle # #Estimated Quantile(s): 95'th %ile = 31.74348 # #Quantile Estimation Method: Quantile(s) Based on # mle Estimators # #Data: Chrysene # #Sample Size: 8 # #Confidence Interval for: 95'th %ile # #Confidence Interval Method: Exact using # Kulkarni & Powar (2010) # transformation to Normality # based on mle of 'shape' # #Confidence Interval Type: upper # #Confidence Level: 95% # #Confidence Interval: LCL = 0.00000 # UCL = 69.32425 #---------- # Clean up rm(Chrysene) detach("EPA.09.Ex.17.3.chrysene.df") ```

### Example output

```Attaching package: 'EnvStats'

The following objects are masked from 'package:stats':

predict, predict.lm

The following object is masked from 'package:base':

print.default

Results of Distribution Parameter Estimation
--------------------------------------------

Assumed Distribution:            Gamma

Estimated Parameter(s):          shape = 2.203862
scale = 2.174928

Estimation Method:               MLE

Estimated Quantile(s):           90'th %ile = 9.113446

Quantile Estimation Method:      Quantile(s) Based on
MLE Estimators

Data:                            dat

Sample Size:                     20

Confidence Interval for:         90'th %ile

Confidence Interval Method:      Exact using
Kulkarni & Powar (2010)
transformation to Normality
based on mle of 'shape'

Confidence Interval Type:        upper

Confidence Level:                95%

Confidence Interval:             LCL =  0.00000
UCL = 13.79733

 10.64464

Results of Distribution Parameter Estimation
--------------------------------------------

Assumed Distribution:            Gamma

Estimated Parameter(s):          mean = 4.7932408
cv   = 0.7242165

Estimation Method:               Bias-Corrected MLE

Estimated Quantile(s):           90'th %ile = 9.427999

Quantile Estimation Method:      Quantile(s) Based on
Bias-Corrected MLE

Data:                            dat

Sample Size:                     20

Confidence Interval for:         90'th %ile

Confidence Interval Method:      Exact using
Wilson & Hilferty (1931) cube-root
transformation to Normality

Confidence Interval Type:        upper

Confidence Level:                95%

Confidence Interval:             LCL =  0.00000
UCL = 12.89643

Results of Distribution Parameter Estimation
--------------------------------------------

Assumed Distribution:            Gamma

Estimated Parameter(s):          shape = 2.806929
scale = 5.286026

Estimation Method:               MLE

Estimated Quantile(s):           95'th %ile = 31.74348

Quantile Estimation Method:      Quantile(s) Based on
MLE Estimators

Data:                            Chrysene

Sample Size:                     8

Confidence Interval for:         95'th %ile

Confidence Interval Method:      Exact using
Kulkarni & Powar (2010)
transformation to Normality
based on mle of 'shape'

Confidence Interval Type:        upper

Confidence Level:                95%

Confidence Interval:             LCL =  0.00000
UCL = 69.32425
```

EnvStats documentation built on Oct. 23, 2020, 6:41 p.m.