vargreg: Design variance of the general regression estimator.

View source: R/vargreg.R

vargreg    R Documentation

Description

Compute the (approximated) design variance of the general regression estimator of the total of a study variable under different sampling designs.

Usage

vargreg(formula, design = NULL, n, stratum = NULL, 
        x_des = NULL, inc.p = NULL, ...)

Arguments

formula

an object of class formula: a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.

design

a character string giving the sampling design. It must be one of 'srs' (simple random sampling without replacement), 'poi' (Poisson sampling), 'stsi' (stratified simple random sampling), 'pips' (Pareto πps sampling) or NULL (see ‘Details’).

n

either a positive number indicating the (expected) sample size (when design is one of 'srs', 'poi', 'pips' or NULL) or a numeric vector indicating the sample size of the strata to which each element belongs (when design is 'stsi') (see ‘Examples’).

stratum

a vector indicating the stratum to which every unit belongs. Only used if design is 'stsi'.

x_des

a positive numeric vector giving the values of the auxiliary variable that is used for defining the inclusion probabilities. Only used if design is 'poi' or 'pips'.

inc.p

a matrix giving the first and second order inclusion probabilities. Only used if design is NULL.

...

other arguments passed to lm (see ‘Details’).

Details

The formula should be of the form y~x, where y is the study variable and x are the auxiliary variables used by the general regression (GREG) estimator, \hat{t}. See formula for more details and ‘Examples’ for typical expressions for some well-known estimators (e.g. the Horvitz-Thompson, ratio, regression and poststratification estimators).

The variance of the GREG estimator is approximated by

AV\left(\hat{t}\right) = \sum_{k=1}^{N}\sum_{l=1}^{N}\pi_{kl}\frac{E_{k}}{\pi_{k}}\frac{E_{l}}{\pi_{l}} - \left(\sum_{k=1}^{N}E_{k}\right)^{2}

where

E_{k} = y_{k}-\hat{y}_{k} \textrm{ and } \hat{y}_{k} = x_{k}B \textrm{ with } B = \left(\sum_{k=1}^{N}w_{k}x_{k}^{'}x_{k}\right)^{-1}\sum_{k=1}^{N}w_{k}x_{k}^{'}y_{k}

N is the population size, and \pi_{k} and \pi_{kl} are, respectively, the first and second order inclusion probabilities. w_{k} is a weight associated with each element; it represents the inverse of the conditional variance (up to a scalar) of the underlying superpopulation model (see ‘Examples’).
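The formula above can be evaluated directly when the full matrix of inclusion probabilities is available. The following is a minimal base R sketch of that computation, not the package's implementation: the function name av_greg and its arguments are hypothetical, and the inclusion probabilities are assumed to be the matrix from ‘Examples’ divided by 10 (whether vargreg applies the same scaling is not confirmed here).

```r
# Approximate design variance of the GREG estimator, written directly
# from the double-sum formula (illustrative helper, not part of optimStrat).
# y  : study variable (length N)
# X  : N x p matrix of auxiliary variables (p = 0 gives Horvitz-Thompson)
# Pi : N x N matrix with pi_k on the diagonal and pi_kl elsewhere
# w  : model weights w_k
av_greg <- function(y, X, Pi, w = rep(1, length(y))) {
  if (ncol(X) == 0) {
    E <- y                              # no auxiliaries: E_k = y_k
  } else {
    XtWX <- crossprod(X, w * X)         # sum_k w_k x_k' x_k
    XtWy <- crossprod(X, w * y)         # sum_k w_k x_k' y_k
    B <- solve(XtWX, XtWy)              # weighted least squares coefficients
    E <- y - drop(X %*% B)              # residuals E_k = y_k - x_k B
  }
  z <- E / diag(Pi)                     # E_k / pi_k
  sum(Pi * outer(z, z)) - sum(E)^2
}

y  <- c(16, 21, 18)
Pi <- matrix(c(8, 5, 4, 5, 7, 3, 4, 3, 6), 3, 3) / 10  # assumed scaling

av_ht    <- av_greg(y, matrix(numeric(0), 3, 0), Pi)    # Horvitz-Thompson: 85
av_ratio <- av_greg(y, cbind(y), Pi, w = 1 / y)         # x = y: residuals vanish
```

With x equal to y, the ratio-type fit reproduces y exactly, so all residuals and hence the approximate variance are zero (up to rounding).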

If design=NULL, the matrix of inclusion probabilities is taken to be proportional to the matrix inc.p. If design is other than NULL, the formula for the variance is simplified in such a way that the inclusion probabilities matrix is no longer necessary. In particular:

  • if design='srs', only the sample size n is required;

  • if design='stsi', both the stratum ID stratum and the sample size per stratum n are required;

  • if design is either 'pips' or 'poi', the inclusion probabilities are obtained proportional to the values of x_des, corrected if necessary.
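As an illustration of why the full matrix becomes unnecessary, the sketch below checks on a toy population that the general double-sum formula agrees with the textbook closed forms for simple random sampling and Poisson sampling. It uses base R only; the residuals, the size measure x_size and the helper av_general are made-up for the example, and the closed forms shown are the standard ones, which are assumed (not confirmed) to match what vargreg computes internally.

```r
E <- c(1, 2, 3, 4, 10)      # residuals E_k (made-up values)
N <- length(E)
n <- 2

# General formula, given a full inclusion-probability matrix.
av_general <- function(E, Pi) {
  z <- E / diag(Pi)
  sum(Pi * outer(z, z)) - sum(E)^2
}

# SRS without replacement: pi_k = n/N, pi_kl = n(n-1) / (N(N-1)).
Pi_srs <- matrix(n * (n - 1) / (N * (N - 1)), N, N)
diag(Pi_srs) <- n / N
av_srs <- N^2 * (1 - n / N) / n * var(E)   # var() uses the N - 1 denominator

# Poisson sampling: pi_k proportional to a size measure, pi_kl = pi_k * pi_l.
x_size <- c(1, 2, 3, 4, 5)                 # hypothetical size measure
pik <- n * x_size / sum(x_size)            # all below 1 here, no correction needed
Pi_poi <- outer(pik, pik)
diag(Pi_poi) <- pik
av_poi <- sum((1 / pik - 1) * E^2)

all.equal(av_general(E, Pi_srs), av_srs)
all.equal(av_general(E, Pi_poi), av_poi)
```

Under SRS only n and N enter the closed form, and under Poisson sampling only the first order probabilities do, which is consistent with the arguments each design requires.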

Value

A numeric value giving the variance of the general regression estimator under the desired design.

References

Särndal, C.E., Swensson, B. and Wretman, J. (1992). Model Assisted Survey Sampling. Springer.

Rosén, B. (1997). On Sampling with Probability Proportional to Size. Journal of Statistical Planning and Inference 62, 159-191.

See Also

desvar for the simultaneous calculation of the variance of six sampling strategies; expgreg for the expected variance of the GREG estimator under a superpopulation model; expvar for the simultaneous calculation of the expected variance of five sampling strategies under a superpopulation model; optimApp for an interactive application of expgreg.

Examples

library(optimStrat)  # provides vargreg, simulatey and optiallo

f<- function(x,b0,b1,b2,...) {b0+b1*x^b2}
g<- function(x,b3,...) {x^b3}
x<- 1 + sort( rgamma(5000, shape=4/9, scale=108) )
y<- simulatey(x,f,g,dist="gamma",b0=10,b1=1,b2=1,b3=1,rho=0.95)

st1<- optiallo(n=100,x=x,H=6)
vargreg("y~0",design="srs",n=100)                         #SRS-HT
vargreg("y~0",design="poi",n=100,x_des=x)                 #Poi-HT
vargreg("y~0",design="stsi",n=st1$nh,stratum=st1$stratum) #STSI-HT
vargreg("y~0",design="pips",n=100,x_des=x)                #PIPS-HT

vargreg("y~x-1",design="srs",n=100,weights=1/x)          #SRS-ratio
vargreg("y~x-1",design="poi",n=100,x_des=x,weights=1/x)  #Poi-ratio
vargreg("y~x-1",design="stsi",n=st1$nh,
        stratum=st1$stratum,weights=1/x)                 #STSI-ratio
vargreg("y~x-1",design="pips",n=100,x_des=x,weights=1/x) #PIPS-ratio

vargreg("y~x",design="srs",n=100)                         #SRS-reg
vargreg("y~x",design="poi",n=100,x_des=x)                 #Poi-reg
vargreg("y~x",design="stsi",n=st1$nh,stratum=st1$stratum) #STSI-reg
vargreg("y~x",design="pips",n=100,x_des=x)                #PIPS-reg

x2<- as.factor(st1$stratum)
vargreg("y~x2",design="srs",n=100)                          #SRS-pos
vargreg("y~x2",design="poi",n=100,x_des=x)                  #Poi-pos
vargreg("y~x2",design="stsi",n=st1$nh,stratum=st1$stratum)  #STSI-pos
vargreg("y~x2",design="pips",n=100,x_des=x)                 #PIPS-pos

y2<- c(16,21,18)
x2<- y2
inc.probs<- matrix(c(8,5,4,5,7,3,4,3,6),3,3)
vargreg("y2~0",n=2.1,inc.p=inc.probs)                 #HT
vargreg("y2~x2-1",n=2.1,inc.p=inc.probs,weights=1/x2) #Ratio
vargreg("y2~x2",n=2.1,inc.p=inc.probs)                #Regression
x3<- as.factor(c(1,2,2))
vargreg("y2~x3",n=2.1,inc.p=inc.probs)                #Post.

optimStrat documentation built on Aug. 24, 2023, 9:09 a.m.