View source: R/quadratic_forms.R
make_ppswor_approx_matrix | R Documentation |
For designs that use unequal probability sampling without replacement (i.e., PPSWOR), variance estimation tends to be more accurate when using an approximation estimator that uses the first-order inclusion probabilities (i.e., the basic sampling weights) and ignores the joint inclusion probabilities. This function returns the matrix of the quadratic form used to represent such variance estimators.
make_ppswor_approx_matrix(probs, method = "Deville-1")
probs |
A vector of first-order inclusion probabilities |
method |
A string specifying the approximation method to use. See the "Details" section below. Options include "Deville-1" and "Deville-2". |
A symmetric matrix whose dimension matches the length of probs.
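A minimal usage sketch follows. It assumes the function is provided by the svrep package (the package name is not stated on this page) and uses made-up inclusion probabilities and weighted values; the call is skipped if the package is not installed.

```r
# Illustrative first-order inclusion probabilities (not from the source)
probs <- c(0.2, 0.5, 0.4, 0.9)

# Assumption: the function is exported by the 'svrep' package
if (requireNamespace("svrep", quietly = TRUE)) {
  Sigma <- svrep::make_ppswor_approx_matrix(probs = probs, method = "Deville-1")

  # Illustrative weighted values y_i / pi_i
  y_breve <- c(10, 14, 8, 20)

  # The variance estimate is the quadratic form y' Sigma y
  v_hat <- as.numeric(t(y_breve) %*% Sigma %*% y_breve)
  print(v_hat)
}
```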
The "Deville-1" and "Deville-2" approximations have been shown to be effective for designs that use a fixed sample size with a high-entropy sampling method. This includes most PPSWOR sampling methods, but unequal-probability systematic sampling is an important exception.
Deville's variance estimators generally take the following form:
\hat{v}(\hat{Y}) = \sum_{i=1}^{n} c_i \left(\breve{y}_i - \frac{1}{\sum_{k=1}^{n}c_k}\sum_{k=1}^{n}c_k \breve{y}_k\right)^2
where \breve{y}_i = y_i/\pi_i is the weighted value of the variable of interest, and c_i are constants that depend on the approximation method used.
The matrix of the quadratic form, denoted \Sigma, has its ij-th entry defined as follows:
\sigma_{ii} = c_i (1 - \frac{c_i}{\sum_{k=1}^{n}c_k}) \textit{ when } i = j \\
\sigma_{ij}=\frac{-c_i c_j}{\sum_{k=1}^{n}c_k} \textit{ when } i \neq j \\
When \pi_{i} = 1 for every unit, \sigma_{ij}=0 for all i,j.
If there is only one sampling unit, then \sigma_{11}=0; that is, the unit is treated as if it were sampled with certainty.
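The entries of \Sigma and the two special cases can be sketched in base R as follows. The helper name sigma_from_constants and all numeric values are hypothetical, chosen only for illustration; this is not the package's own implementation. The final lines check that the quadratic form reproduces the summation form of Deville's estimator.

```r
# Hypothetical helper: build the quadratic form matrix from constants c_i
sigma_from_constants <- function(c_vals) {
  n <- length(c_vals)
  total <- sum(c_vals)
  # Special cases: a single unit, or all constants zero (every pi_i = 1)
  if (n == 1 || total == 0) {
    return(matrix(0, nrow = n, ncol = n))
  }
  # Off-diagonal entries: -c_i c_j / sum(c)
  Sigma <- -outer(c_vals, c_vals) / total
  # Diagonal entries: c_i (1 - c_i / sum(c))
  diag(Sigma) <- c_vals * (1 - c_vals / total)
  Sigma
}

# Illustrative values (not from the source)
c_vals  <- c(0.9, 0.6, 0.75, 0.3)
y_breve <- c(10, 14, 8, 20)   # weighted values y_i / pi_i

Sigma <- sigma_from_constants(c_vals)

# Quadratic form versus the summation form of Deville's estimator
v_quad <- drop(t(y_breve) %*% Sigma %*% y_breve)
v_sum  <- sum(c_vals * (y_breve - sum(c_vals * y_breve) / sum(c_vals))^2)
all.equal(v_quad, v_sum)
```

Each row of the resulting matrix sums to zero, which is why adding a constant to every weighted value leaves the variance estimate unchanged.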
The constants c_i are defined for each approximation method as follows, with the names taken directly from Matei and Tillé (2005).
"Deville-1":
c_i=\left(1-\pi_i\right) \frac{n}{n-1}
"Deville-2":
c_i = (1-\pi_i) \left[1 - \sum_{k=1}^{n} \left(\frac{1-\pi_k}{\sum_{l=1}^{n}(1-\pi_l)}\right)^2 \right]^{-1}
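As a rough illustration, the two sets of constants could be computed in base R as below. The helper deville_constants is hypothetical, the probabilities are made up, and the sketch assumes that n is the number of sampled units (the length of the probability vector). It does not handle the special cases of a single unit or of every probability equal to 1.

```r
# Hypothetical helper: constants c_i for the two Deville approximations.
# Assumes n is the number of sampled units, i.e. length(probs).
deville_constants <- function(probs, method = "Deville-1") {
  n <- length(probs)
  one_minus_pi <- 1 - probs
  if (method == "Deville-1") {
    # c_i = (1 - pi_i) * n / (n - 1)
    one_minus_pi * n / (n - 1)
  } else if (method == "Deville-2") {
    # c_i = (1 - pi_i) / (1 - sum of squared shares of (1 - pi_k))
    shares <- one_minus_pi / sum(one_minus_pi)
    one_minus_pi / (1 - sum(shares^2))
  } else {
    stop("Unknown method: ", method)
  }
}

probs <- c(0.2, 0.5, 0.4, 0.9)  # illustrative inclusion probabilities
c1 <- deville_constants(probs, "Deville-1")
c2 <- deville_constants(probs, "Deville-2")
rbind(c1, c2)
```

Both methods scale (1 - \pi_i) by a single factor shared across units; they differ only in how that factor is computed.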
In the simulation studies of Matei and Tillé (2005), both the "Deville-1" and "Deville-2" approximations performed much better in terms of mean squared error (MSE) than the strictly unbiased Horvitz-Thompson and Yates-Grundy variance estimators. In the case of simple random sampling without replacement (SRSWOR), these estimators are identical to the usual Horvitz-Thompson variance estimator.
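The SRSWOR equivalence can be checked numerically. The sketch below uses made-up data with a hypothetical population size N and compares both Deville estimators with the standard SRSWOR variance formula N^2 (1 - n/N) s^2 / n for an estimated total.

```r
# Illustrative SRSWOR setup (values are made up for this sketch)
N <- 50
y <- c(3, 7, 4, 9, 6)
n <- length(y)
probs <- rep(n / N, n)      # equal inclusion probabilities under SRSWOR
y_breve <- y / probs        # weighted values

# Summation form of Deville's estimator for given constants c_i
deville_variance <- function(c_vals, y_breve) {
  sum(c_vals * (y_breve - sum(c_vals * y_breve) / sum(c_vals))^2)
}

# Constants for each approximation
c1 <- (1 - probs) * n / (n - 1)
shares <- (1 - probs) / sum(1 - probs)
c2 <- (1 - probs) / (1 - sum(shares^2))

v1 <- deville_variance(c1, y_breve)
v2 <- deville_variance(c2, y_breve)

# Standard SRSWOR variance estimator for an estimated total
v_srs <- N^2 * (1 - n / N) * var(y) / n

c(deville_1 = v1, deville_2 = v2, srswor = v_srs)
```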
Beaumont and Emond (2022) proposed a variance estimator for unequal probability sampling without replacement. This estimator is simply the Horvitz-Thompson variance estimator with the following approximation for the joint inclusion probabilities:
\pi_{kl} \approx \pi_k \pi_l \frac{n - 1}{(n-1) + \sqrt{(1-\pi_k)(1-\pi_l)}}
In the case of cluster sampling, this approximation should be applied to the clusters rather than the units within clusters.
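A base R sketch of this approximation is given below, assuming n is the number of sampled units (at least two) and using made-up probabilities and weighted values. It plugs the approximate joint probabilities into the Horvitz-Thompson quadratic form, whose entries are 1 - \pi_k \pi_l / \pi_{kl} with \pi_{kk} = \pi_k. This is an illustration of the formula above, not the package's implementation.

```r
# Illustrative inclusion probabilities (not from the source)
probs <- c(0.2, 0.5, 0.4, 0.9)
n <- length(probs)  # assumption: n is the number of sampled units

# Approximate joint inclusion probabilities pi_kl
root  <- sqrt(outer(1 - probs, 1 - probs))
joint <- outer(probs, probs) * (n - 1) / ((n - 1) + root)
diag(joint) <- probs  # pi_kk = pi_k

# Horvitz-Thompson quadratic form: entries 1 - pi_k pi_l / pi_kl
Sigma_be <- 1 - outer(probs, probs) / joint

# Variance estimate for illustrative weighted values y_k / pi_k
y_breve <- c(10, 14, 8, 20)
v_be <- drop(t(y_breve) %*% Sigma_be %*% y_breve)
v_be
```

With this approximation the off-diagonal entries simplify to -\sqrt{(1-\pi_k)(1-\pi_l)}/(n-1) and the diagonal entries to 1-\pi_k, so the matrix can also be written down without forming the joint probabilities.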
Matei, Alina, and Yves Tillé. 2005. “Evaluation of Variance Approximations and Estimators in Maximum Entropy Sampling with Unequal Probability and Fixed Sample Size.” Journal of Official Statistics 21(4):543–70.