In samplingVarEst: Sampling Variance Estimation

Pkl.Hajek.U

R Documentation

The Hajek approximation for the 2nd order (joint) inclusion probabilities (population based)

Description

Computes the Hajek (1964) approximation for the 2nd order (joint) inclusion probabilities utilising population-based quantities.

Usage

Pkl.Hajek.U(VecPk.U)

Arguments

VecPk.U

vector of the first-order inclusion probabilities; its length is equal to the population size. Values in VecPk.U must be greater than zero and less than or equal to one. There must not be missing values.

Details

Let π_k denote the inclusion probability of the k-th element in the sample s, and let π_{kl} denote the joint-inclusion probability of the k-th and l-th elements in the sample s. If the joint-inclusion probabilities π_{kl} are not available, the Hajek (1964) approximation can be used. Note that this approximation is designed for large-entropy sampling designs, large samples, and large populations, so care should be taken with highly stratified samples; see, e.g., Berger (2005).

The population-based version of the Hajek (1964) approximation for the joint-inclusion probabilities π_{kl} (implemented by the current function) is:

π_{kl} ≐ π_k π_l {1 - d^{-1}(1 - π_k)(1 - π_l)}

where d = ∑_{k ∈ U} π_k(1 - π_k).
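The formula above can be sketched in base R as follows. This is only an illustration of the arithmetic, not the package's implementation: `hajek_pkl_sketch` and `pk_toy` are hypothetical names, the toy probabilities are made up, and setting the diagonal to π_k is an assumption of this sketch (the formula itself describes the pairs with k different from l).

```r
# Hypothetical helper (not part of samplingVarEst): base-R sketch of the
# population-based Hajek approximation for joint inclusion probabilities.
hajek_pkl_sketch <- function(pk) {
  # Same input constraints as stated for VecPk.U: no NAs, 0 < pk <= 1
  stopifnot(is.numeric(pk), !anyNA(pk), all(pk > 0), all(pk <= 1))
  d <- sum(pk * (1 - pk))            # d = sum over U of pi_k (1 - pi_k)
  stopifnot(d > 0)                   # d = 0 would mean every pi_k equals 1
  # pi_kl ~ pi_k pi_l {1 - (1 - pi_k)(1 - pi_l) / d} for all pairs (k, l)
  pkl <- outer(pk, pk) * (1 - outer(1 - pk, 1 - pk) / d)
  diag(pkl) <- pk                    # assumption of this sketch: pi_kk = pi_k
  pkl
}

# Made-up first-order probabilities for a population of 4 (they sum to n = 2)
pk_toy  <- c(0.2, 0.4, 0.6, 0.8)
pkl_toy <- hajek_pkl_sketch(pk_toy)
pkl_toy
```

With only four units d is tiny (0.8 here), so the toy values show the computation rather than a setting in which the approximation is expected to be accurate.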

The approximation was originally developed for d → ∞ under the maximum-entropy sampling design, i.e. the rejective sampling design (see Hajek 1981, Theorem 3.3, Ch. 3 and 6). It requires the sampling design in use to be of large entropy. An overview can be found in Berger and Tille (2009). An account of different sampling designs, π_{kl} approximations, and approximate variances under large-entropy designs can be found in Tille (2006), Brewer and Donadio (2003), and Haziza, Mecatti, and Rao (2008). More recently, Berger (2011) gave sufficient conditions under which Hajek's results still hold for large-entropy sampling designs that are not the maximum-entropy one.
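One way to see the role of d is through the fixed-size identity: for a design of fixed sample size n, the exact joint probabilities satisfy ∑_{l ≠ k} π_{kl} = (n - 1)π_k. Substituting the approximation into the left-hand side gives π_k{(n - 1) + π_k(1 - π_k)^2/d}, so the discrepancy vanishes as d grows. The sketch below checks this numerically; `hajek_rowsum_gap` is a hypothetical helper written for this illustration, and the equal-probability populations are made up.

```r
# Hypothetical helper: largest absolute gap between the off-diagonal row sums
# of the Hajek approximation and the fixed-size target (n - 1) * pi_k.
hajek_rowsum_gap <- function(pk) {
  n <- sum(pk)                       # fixed sample size implied by pk
  d <- sum(pk * (1 - pk))            # d = sum over U of pi_k (1 - pi_k)
  pkl <- outer(pk, pk) * (1 - outer(1 - pk, 1 - pk) / d)
  diag(pkl) <- 0                     # keep only the k != l terms
  max(abs(rowSums(pkl) - (n - 1) * pk))
}

gap_small <- hajek_rowsum_gap(rep(0.5, 10))    # N = 10,   d = 2.5
gap_large <- hajek_rowsum_gap(rep(0.5, 1000))  # N = 1000, d = 250
c(small = gap_small, large = gap_large)
```

In this equal-probability case the gap is 0.0625/d for every unit, so it drops by a factor of 100 when d goes from 2.5 to 250.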

Value

The function returns an N by N square matrix of approximate joint inclusion probabilities, where N is the population size.

Author(s)

Emilio Lopez Escobar.

References

Berger, Y. G. (2005) Variance estimation with highly stratified sampling designs with unequal probabilities. Australian & New Zealand Journal of Statistics, 47, 365–373.

Berger, Y. G. (2011) Asymptotic consistency under large entropy sampling designs with unequal probabilities. Pakistan Journal of Statistics, 27, 407–426.

Berger, Y. G. and Tille, Y. (2009) Sampling with unequal probabilities. In Sample Surveys: Design, Methods and Applications (eds. D. Pfeffermann and C. R. Rao), 39–54. Elsevier, Amsterdam.

Brewer, K. R. W. and Donadio, M. E. (2003) The large entropy variance of the Horvitz-Thompson estimator. Survey Methodology, 29, 189–196.

Hajek, J. (1964) Asymptotic theory of rejective sampling with varying probabilities from a finite population. The Annals of Mathematical Statistics, 35, 4, 1491–1523.

Hajek, J. (1981) Sampling From a Finite Population. Dekker, New York.

Haziza, D., Mecatti, F. and Rao, J. N. K. (2008) Evaluation of some approximate variance estimators under the Rao-Sampford unequal probability sampling design. Metron, LXVI, 91–108.

Tille, Y. (2006) Sampling Algorithms. Springer, New York.

Examples

library(samplingVarEst)                      # Provides oaxaca, Pk.PropNorm.U and Pkl.Hajek.U
data(oaxaca)                                 # Loads the Oaxaca municipalities dataset
# Reconstructs the 1st order incl. probs. for a sample of size 373,
# proportional to the number of homes (HOMES00)
pik.U  <- Pk.PropNorm.U(373, oaxaca$HOMES00)
# (This approximation is only suitable for large-entropy sampling designs)
pikl.U <- Pkl.Hajek.U(pik.U)                 # Approximates 2nd order incl. probs. from U
# First 5 rows/cols of the (population-based) 2nd order incl. probs. matrix
pikl.U[1:5, 1:5]

samplingVarEst documentation built on Jan. 14, 2023, 5:08 p.m.