# expvar: Expected variance (in optimStrat: Choosing the Sample Strategy)

## Description

Compute the expected variance of five sampling strategies.

## Usage

```r
expvar(b, d, x, n, H, Rxy, stratum1 = NULL, stratum2 = NULL,
       st = 1:5, short = FALSE)
```

## Arguments

| Argument | Description |
|---|---|
| `b` | a numeric vector of length two giving the true shapes of the trend and spread terms. |
| `d` | a numeric vector of length two giving the assumed shapes of the trend and spread terms. |
| `x` | a positive numeric vector giving the values of the auxiliary variable. |
| `n` | a positive integer indicating the desired sample size. |
| `H` | a positive integer giving the desired number of strata/poststrata. Ignored if `stratum1` and `stratum2` are given. |
| `Rxy` | a number giving the correlation between the auxiliary variable and the study variable. |
| `stratum1` | a list giving stratum and sample sizes per stratum (see ‘Details’). |
| `stratum2` | a list giving stratum and sample sizes per stratum (see ‘Details’). |
| `st` | a numeric vector indicating the strategies for which the expected variance is to be calculated (see ‘Details’). |
| `short` | logical. If `FALSE` (the default) a vector of length five is returned. If `TRUE` only the strategies given by `st` are returned. |

## Details

The expected variance of a sample of size `n` is computed for five sampling strategies (πps–reg, STSI–reg, STSI–HT, πps–pos and STSI–pos).

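The strategies are defined assuming that the underlying superpopulation model is of the form

$$Y_k = \delta_0 + \delta_1 x_k^{\delta_2} + \epsilon_k$$

with $E\epsilon_k = 0$, $V\epsilon_k = \delta_3^2 x_k^{2\delta_4}$ and $\mathrm{Cov}(\epsilon_k, \epsilon_l) = 0$ for $k \neq l$. The true generating model, however, is of the form

$$Y_k = \beta_0 + \beta_1 x_k^{\beta_2} + \epsilon_k$$

with $E\epsilon_k = 0$, $V\epsilon_k = \beta_3^2 x_k^{2\beta_4}$ and $\mathrm{Cov}(\epsilon_k, \epsilon_l) = 0$ for $k \neq l$.

The parameters $\beta_2$ and $\beta_4$ (the true shapes of the trend and spread terms) are given by `b`. The parameters $\delta_2$ and $\delta_4$ (the assumed shapes) are given by `d`.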
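To make the generating model concrete, the following base R sketch draws one finite population from it. It does not use `expvar`. The intercept, slope and spread scale (`beta0`, `beta1`, `beta3`) are hypothetical values chosen for illustration, and normal errors are an extra assumption, since the model only fixes the first two moments.

```r
set.seed(1)  # reproducible draws

# Auxiliary variable, generated as in the package example
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))

# True shapes of the trend and spread terms, in the order used by `b`
b <- c(1, 1)

# Hypothetical values for the remaining coefficients (not arguments of expvar)
beta0 <- 0
beta1 <- 1
beta3 <- 0.5

# Errors with mean 0 and standard deviation beta3 * x^b[2]
# (normality is an assumption of this sketch, not of the model)
eps <- rnorm(length(x), mean = 0, sd = beta3 * x^b[2])

# Study variable: trend term plus heteroscedastic error
y <- beta0 + beta1 * x^b[1] + eps

# Realized correlation between x and y in this population
cor(x, y)
```

In `expvar` the strength of the relation is controlled through `Rxy` rather than through explicit coefficients, so the values above only illustrate the shape of the model.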

`stratum1` and `stratum2` are lists with two components (each with length `length(x)`): `stratum` indicates the stratum to which each element belongs and `nh` indicates the sample sizes to be selected in each stratum. They can be created via `optiallo`. `stratum1` gives the stratification for STSI–HT and the poststrata for πps–pos and STSI–pos; whereas `stratum2` gives the stratification for STSI–reg and STSI–pos. If `NULL`, `optiallo` is used for defining `H` strata/poststrata.
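The structure described above (two components, each of length `length(x)`) can be illustrated with a hand-built list in base R. This is only a sketch of the shape of the object: the equal-size strata and proportional allocation used here are hypothetical choices, not the rule applied by `optiallo`, and the name `hypothetical_strata` is introduced for illustration.

```r
# Auxiliary variable, generated as in the package example
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))
n <- 500  # total sample size
H <- 5    # number of strata

# Hypothetical rule: H equal-size strata by rank of x (not optiallo's method)
stratum <- ceiling(H * rank(x, ties.method = "first") / length(x))

# Proportional allocation of n over the strata (also hypothetical)
Nh <- table(stratum)
nh_by_stratum <- as.vector(round(n * Nh / length(x)))

# Expand to one entry per element, so both components have length(x)
nh <- nh_by_stratum[stratum]

hypothetical_strata <- list(stratum = stratum, nh = nh)
str(hypothetical_strata)
```

Whether `expvar` accepts such a hand-built list without further checks is not confirmed here; `optiallo` remains the documented way to create `stratum1` and `stratum2`.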

`st` indicates which variances are to be calculated. If `st` contains 1, the expected variance of πps–reg is calculated; if it contains 2, that of STSI–reg; and so on, following the order in which the five strategies are listed above.

## Value

If `short=FALSE`, a vector of length five is returned giving the expected variance of the strategies given in `st`; `NA` is returned for those strategies not given in `st`. If `short=TRUE`, the `NA`s are omitted.
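The difference between the two return shapes can be mimicked in base R. The numbers below are placeholders, not expected variances computed by `expvar`, and the object names are hypothetical.

```r
st <- 1:3  # strategies requested

# Shape of the result with short = FALSE: length five, NA where not requested
full <- rep(NA_real_, 5)
full[st] <- c(10, 20, 30)  # placeholder values, not real output of expvar
full

# Shape of the result with short = TRUE: the NAs are dropped
short_result <- full[!is.na(full)]
short_result
```

Keeping the full-length vector makes results from different calls line up by position, while the short form is convenient when only the requested strategies matter.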

## References

Bueno, E. (2018). A Comparison of Stratified Simple Random Sampling and Probability Proportional-to-size Sampling. Research Report, Department of Statistics, Stockholm University 2018:6. http://gauss.stat.su.se/rr/RR2018_6.pdf.

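## See Also

`optiallo` for how to stratify an auxiliary variable and allocate the sample size; `desvar` for calculating the variance of the five strategies.

## Examples

```r
library(optimStrat)

# Auxiliary variable: shifted, sorted gamma draws
x <- 1 + sort(rgamma(5000, shape = 4/9, scale = 108))

# Expected variance of all five strategies
expvar(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9)
# Strategies 1 to 3 only; the other entries are NA
expvar(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9, st = 1:3)
# Same strategies, with the NAs omitted
expvar(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9, st = 1:3, short = TRUE)

# Strata and poststrata supplied by the user, created with optiallo
st1 <- optiallo(n = 500, x, H = 6)
post1 <- optiallo(n = 500, x^1.5, H = 10)
expvar(b = c(1, 1), d = c(1, 1), x, n = 500, H = 6, Rxy = 0.9,
       stratum1 = post1, stratum2 = st1)
```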