make_kernel_var_matrix: Make a quadratic form matrix for the kernel-based variance...

View source: R/quadratic_forms.R

make_kernel_var_matrixR Documentation

Make a quadratic form matrix for the kernel-based variance estimator of Breidt, Opsomer, and Sanchez-Borrego (2016)

Description

Constructs the quadratic form matrix for the kernel-based variance estimator of Breidt, Opsomer, and Sanchez-Borrego (2016). The bandwidth is automatically chosen to result in the smallest possible nonempty kernel window.

Usage

make_kernel_var_matrix(x, kernel = "Epanechnikov", bandwidth = "auto")

Arguments

x

A numeric vector, giving the values of an auxiliary variable.

kernel

The name of a kernel function. Currently only "Epanechnikov" is supported.

bandwidth

The bandwidth to use for the kernel. The default value is "auto", which means that the bandwidth will be chosen automatically to produce the smallest window size while ensuring that every unit has a nonempty window, as suggested by Breidt, Opsomer, and Sanchez-Borrego (2016). Otherwise, the user can supply their own value, which can be a single positive number.

Details

This kernel-based variance estimator was proposed by Breidt, Opsomer, and Sanchez-Borrego (2016), for use with samples selected using systematic sampling or where only a single sampling unit is selected from each stratum (sometimes referred to as "fine stratification").

Suppose there are n sampled units, and for each unit i there is a numeric population characteristic x_i and there is a weighted total \hat{Y}_i, where \hat{Y}_i is only observed in the selected sample but x_i is known prior to sampling.

The variance estimator has the following form:

\hat{V}_{ker}=\frac{1}{C_d} \sum_{i=1}^n (\hat{Y}_i-\sum_{j=1}^n d_j(i) \hat{Y}_j)^2

The terms d_j(i) are kernel weights given by

d_j(i)=\frac{K(\frac{x_i-x_j}{h})}{\sum_{j=1}^n K(\frac{x_i-x_j}{h})}

where K(\cdot) is a symmetric, bounded kernel function and h is a bandwidth parameter. The normalizing constant C_d is computed as:

C_d=\frac{1}{n} \sum_{i=1}^n(1-2 d_i(i)+\sum_{j=1}^H d_j^2(i))

If n=2, then the estimator is simply the estimator used for simple random sampling without replacement.

If n=1, then the matrix simply has an entry equal to 0.

Value

The quadratic form matrix for the variance estimator, with dimension equal to the length of x. The resulting object has an attribute bandwidth that can be retrieved using attr(Q, 'bandwidth')

References

Breidt, F. J., Opsomer, J. D., & Sanchez-Borrego, I. (2016). "Nonparametric Variance Estimation Under Fine Stratification: An Alternative to Collapsed Strata." Journal of the American Statistical Association, 111(514), 822–833. https://doi.org/10.1080/01621459.2015.1058264

Examples

# The auxiliary variable has the same value for all units
make_kernel_var_matrix(c(1, 1, 1))

# The auxiliary variable differs across units
make_kernel_var_matrix(c(1, 2, 3))

# View the bandwidth that was automatically selected
Q <- make_kernel_var_matrix(c(1, 2, 4))
attr(Q, 'bandwidth')


bschneidr/svrep documentation built on Feb. 11, 2025, 4:24 a.m.