Estimate Quantiles of a Hypergeometric Distribution

Share:

Description

Estimate quantiles of a hypergeometric distribution.

Usage

1
  eqhyper(x, m = NULL, total = NULL, k = NULL, p = 0.5, method = "mle", digits = 0)

Arguments

x

non-negative integer indicating the number of white balls out of a sample of size k drawn without replacement from the urn, or an object resulting from a call to an estimating function that assumes a hypergeometric distribution (e.g., ehyper). Missing (NA), undefined (NaN), and infinite (Inf, -Inf) values are not allowed.

m

non-negative integer indicating the number of white balls in the urn. You must supply m or total, but not both. Missing values (NAs) are not allowed.

total

positive integer indicating the total number of balls in the urn (i.e., m+n). You must supply m or total, but not both. Missing values (NAs) are not allowed.

k

positive integer indicating the number of balls drawn without replacement from the urn. Missing values (NAs) are not allowed.

p

numeric vector of probabilities for which quantiles will be estimated. All values of p must be between 0 and 1. The default value is p=0.5.

method

character string specifying the method of estimating the parameters of the hypergeometric distribution. Possible values are "mle" (maximum likelihood; the default) and "mvue" (minimum variance unbiased). The mvue method is only available when you are estimating m (i.e., when you supply the argument total). See the DETAILS section of the help file for ehyper for more information on these estimation methods.

digits

an integer indicating the number of decimal places to round to when printing out the value of 100*p. The default value is digits=0.

Details

The function eqhyper returns estimated quantiles as well as estimates of the hypergeometric distribution parameters.

Quantiles are estimated by 1) estimating the distribution parameters by calling ehyper, and then 2) calling the function qhyper and using the estimated values for the distribution parameters.

Value

If x is a numeric vector, eqhyper returns a list of class "estimate" containing the estimated quantile(s) and other information. See estimate.object for details.

If x is the result of calling an estimation function, eqhyper returns a list whose class is the same as x. The list contains the same components as x, as well as components called quantiles and quantile.method.

Note

The hypergeometric distribution can be described by an urn model with M white balls and N black balls. If K balls are drawn with replacement, then the number of white balls in the sample of size K follows a binomial distribution with parameters size=K and prob=M/(M+N). If K balls are drawn without replacement, then the number of white balls in the sample of size K follows a hypergeometric distribution with parameters m=M, n=N, and k=K.

The name “hypergeometric” comes from the fact that the probabilities associated with this distribution can be written as successive terms in the expansion of a function of a Gaussian hypergeometric series.

The hypergeometric distribution is applied in a variety of fields, including quality control and estimation of animal population size. It is also the distribution used to compute probabilities for Fishers's exact test for a 2x2 contingency table.

Author(s)

Steven P. Millard (EnvStats@ProbStatInfo.com)

References

Forbes, C., M. Evans, N. Hastings, and B. Peacock. (2011). Statistical Distributions. Fourth Edition. John Wiley and Sons, Hoboken, NJ.

Johnson, N. L., S. Kotz, and A. Kemp. (1992). Univariate Discrete Distributions. Second Edition. John Wiley and Sons, New York, Chapter 6.

See Also

ehyper, Hypergeometric, estimate.object.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
  # Generate an observation from a hypergeometric distribution with 
  # parameters m=10, n=30, and k=5, then estimate the parameter m, and
  # the 80'th percentile. 
  # Note: the call to set.seed simply allows you to reproduce this example. 
  # Also, the only parameter actually estimated is m; once m is estimated, 
  # n is computed by subtracting the estimated value of m (8 in this example) 
  # from the given of value of m+n (40 in this example).  The parameters 
  # n and k are shown in the output in order to provide information on 
  # all of the parameters associated with the hypergeometric distribution.

  set.seed(250) 
  dat <- rhyper(nn = 1, m = 10, n = 30, k = 5) 
  dat 
  #[1] 1   

  eqhyper(dat, total = 40, k = 5, p = 0.8) 

  #Results of Distribution Parameter Estimation
  #--------------------------------------------
  #
  #Assumed Distribution:            Hypergeometric
  #
  #Estimated Parameter(s):          m =  8
  #                                 n = 32
  #                                 k =  5
  #
  #Estimation Method:               mle for 'm'
  #
  #Estimated Quantile(s):           80'th %ile = 2
  #
  #Quantile Estimation Method:      Quantile(s) Based on
  #                                 mle for 'm' Estimators
  #
  #Data:                            dat
  #
  #Sample Size:                     1

  #----------

  # Clean up
  #---------
  rm(dat)

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.