Compute the expected design variance of the general regression estimator of the total of a study variable under different sampling designs.
x
design matrix with the variables to be used in the GREG estimator.
b11
a numeric vector of length equal to the number of variables in
b12
a numeric vector of length equal to the number of variables in
b21
a numeric vector of length equal to the number of variables in
b22
a numeric vector of length equal to the number of variables in
d12
a numeric vector of length equal to the number of variables in
Rfy
a number giving the square root of the coefficient of determination between the auxiliary variables and the study variable.
n
either a positive number indicating the (expected) sample size (when
design
a character string giving the sampling design. It must be one of 'srs' (simple random sampling without replacement), 'poi' (Poisson sampling), 'stsi' (stratified simple random sampling), 'pips' (Pareto πps sampling) or
stratum
a vector indicating the stratum to which every unit belongs. Only used if
x_des
a positive numeric vector giving the values of the auxiliary variable that is used for defining the inclusion probabilities. Only used if
inc.p
a matrix giving the first and second order inclusion probabilities. Only used if
...
other arguments passed to
The expected variance of the general regression estimator under different sampling designs is computed.
It is assumed that the underlying superpopulation model is of the form
Y_k = f(x_k|δ_1) + ε_k
with Eε_k = 0, Vε_k = σ_0^2 g(x_k|δ_2)^2 and Cov(ε_k , ε_l) = 0.
But the true generating model is in fact of the form
Y_k = f(x_k|β_1) + ε_k
with Eε_k = 0, Vε_k = σ^2 g(x_k|β_2)^2 and Cov(ε_k , ε_l) = 0.
where
f(x_k|δ_1) = Σ_[j=1]^J δ_[1,j] x_[j,k]^δ_[1,J+j],
g(x_k|δ_2) = Σ_[j=1]^J δ_[2,j] x_[j,k]^δ_[2,J+j],
f(x_k|β_1) = Σ_[j=1]^J β_[1,j] x_[j,k]^β_[1,J+j],
g(x_k|β_2) = Σ_[j=1]^J β_[2,j] x_[j,k]^β_[2,J+j].
the coefficients β_[1,j] (j = 1,...,J) are given by b11;
the exponents β_[1,j] (j = J+1,...,2J) are given by b12;
the coefficients β_[2,j] (j = 1,...,J) are given by b21;
the exponents β_[2,j] (j = J+1,...,2J) are given by b22;
the exponents δ_[1,j] (j = J+1,...,2J) are given by d12.
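As an illustration of how the coefficient and exponent vectors enter f and g, the following is a minimal base-R sketch. The helper `power_sum` and the toy matrix are hypothetical and are not part of the package; the vectors mirror the values used for b11, b12, b21 and b22 in the examples.

```r
# Hypothetical helper (not a package function): for every row k of x,
# evaluate sum_j coef[j] * x[k, j]^expo[j].
power_sum <- function(x, coef, expo) {
  x <- as.matrix(x)
  stopifnot(ncol(x) == length(coef), ncol(x) == length(expo))
  # Raise column j to the power expo[j], then combine the columns linearly
  as.vector(sweep(x, 2, expo, "^") %*% coef)
}

# Toy population with J = 3 auxiliary variables (illustrative values only)
x <- cbind(x1 = c(1, 2, 3), x2 = c(4, 5, 6), x3 = c(9, 16, 25))

# b11 = c(1,-1,0), b12 = c(1,1,0): f(x_k|beta_1) = x1 - x2
f_true <- power_sum(x, coef = c(1, -1, 0), expo = c(1, 1, 0))

# b21 = c(0,0,1), b22 = c(0,0,0.5): g(x_k|beta_2) = sqrt(x3)
g_true <- power_sum(x, coef = c(0, 0, 1), expo = c(0, 0, 0.5))
```

A zero coefficient removes the corresponding variable from the sum, whatever its exponent is.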
The expected variance of the GREG estimator is approximated by
E(V(t_hat)) = V(t*_hat) + σ*^2 Σ_[k=1]^N (1/π_k - 1)g(x_k|β_2)^2
where
V(t*_hat) = Σ_[k=1]^N Σ_[l=1]^N π_kl (z_k*z_l)/(π_k*π_l) - (Σ_[k=1]^N z_k)^2
and
σ*^2 = (S^2_f / (g^2)_bar) * (1/R_fy^2 - 1),
z_k = (x_k^β - x_k^δ*A)*β**_1,
S^2_f = Σ_[k=1]^N (f(x_k|β_1) - f_bar)^2 / N,
(g^2)_bar = Σ_[k=1]^N g(x_k|β_2)^2 / N,
x_k^β = (x_[1k]^(β_[1,J+1]),…,x_[Jk]^(β_[1,2J])),
x_k^δ = (x_[1k]^(δ_[1,J+1]),…,x_[Jk]^(δ_[1,2J])),
β**_1 = (β_[1,1],…,β_[1,J])',
A = (Σ_[k=1]^N w_k*x_k^δ'*x_k^δ)^-1 Σ_[k=1]^N w_k*x_k^δ'*x_k^β.
N is the population size, and π_k and π_kl are, respectively, the first and second order inclusion probabilities. w_k is a weight associated with each element; it represents the inverse of the conditional variance (up to a scalar) of the underlying superpopulation model (see ‘Examples’).
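The formulas above can be put together in a few lines of base R. This is only a sketch under stated assumptions, not the package's implementation: `power_sum`, `expvar_sketch`, the argument `pkl` (a full N x N matrix with π_k on the diagonal and π_kl elsewhere) and the argument `w` are hypothetical names, and the matrix Σ w_k x_k^δ' x_k^δ is assumed to be invertible.

```r
# Hypothetical helper: sum_j coef[j] * x[, j]^expo[j] for every row of x
power_sum <- function(x, coef, expo) as.vector(sweep(x, 2, expo, "^") %*% coef)

# Sketch of E(V(t_hat)) = V(t*_hat) + sigma*^2 * sum((1/pi_k - 1) * g_k^2)
expvar_sketch <- function(x, b11, b12, b21, b22, d12, Rfy, pkl,
                          w = rep(1, nrow(x))) {
  x  <- as.matrix(x)
  pk <- diag(pkl)                          # first order probabilities
  f  <- power_sum(x, b11, b12)             # f(x_k | beta_1)
  g2 <- power_sum(x, b21, b22)^2           # g(x_k | beta_2)^2
  S2f    <- mean((f - mean(f))^2)          # S^2_f, divisor N
  sigma2 <- S2f / mean(g2) * (1 / Rfy^2 - 1)
  xb <- sweep(x, 2, b12, "^")              # rows are x_k^beta
  xd <- sweep(x, 2, d12, "^")              # rows are x_k^delta
  # A = (sum w_k xd_k' xd_k)^-1 sum w_k xd_k' xb_k
  A <- solve(crossprod(xd, w * xd), crossprod(xd, w * xb))
  z <- as.vector((xb - xd %*% A) %*% b11)  # z_k
  Vstar <- sum(pkl * outer(z / pk, z / pk)) - sum(z)^2
  Vstar + sigma2 * sum((1 / pk - 1) * g2)
}

# Toy population (N = 6, J = 2) under simple random sampling with n = 3
x <- cbind(c(1, 2, 3, 4, 5, 6), c(2, 1, 4, 3, 6, 5))
N <- nrow(x); n <- 3
pkl <- matrix(n * (n - 1) / (N * (N - 1)), N, N)
diag(pkl) <- n / N

# Working-model exponents equal to the true ones (d12 = b12)
ev_correct <- expvar_sketch(x, b11 = c(1, -1), b12 = c(1, 1), b21 = c(0, 1),
                            b22 = c(0, 0.5), d12 = c(1, 1), Rfy = 0.8, pkl = pkl)
# Misspecified working-model exponents (d12 differs from b12)
ev_misspec <- expvar_sketch(x, b11 = c(1, -1), b12 = c(1, 1), b21 = c(0, 1),
                            b22 = c(0, 0.5), d12 = c(1, 0), Rfy = 0.8, pkl = pkl)
```

When d12 equals b12, the matrix A is the identity, every z_k is zero, and only the second term of the expected variance remains; a misspecified d12 adds the non-negative term V(t*_hat).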
If design=NULL, the matrix of inclusion probabilities is obtained proportional to the matrix inc.p. If design is other than NULL, the formula for the variance is simplified in such a way that the inclusion probabilities matrix is no longer necessary. In particular:
if design='srs', only the sample size n is required;
if design='stsi', both the stratum ID stratum and the sample size per stratum n are required;
if design is either 'pips' or 'poi', the inclusion probabilities are obtained proportional to the values of x_des, corrected if necessary.
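To illustrate why the full matrix of inclusion probabilities is not needed for some designs, the sketch below compares the general double sum for V(t*_hat) with the standard closed forms for simple random sampling and Poisson sampling. It also shows one common way of correcting probabilities proportional to size so that none exceeds 1; that this is the correction the function applies is an assumption. All function names here (`pips_probs`, `ht_var`, `srs_var`, `poi_var`) are hypothetical.

```r
# Probabilities proportional to x_des. Assumed correction: units whose
# probability would exceed 1 are selected with certainty and the rest of
# the sample size is shared among the remaining units.
pips_probs <- function(x_des, n) {
  stopifnot(all(x_des > 0), n < length(x_des))
  pk <- rep(NA_real_, length(x_des))
  certain <- rep(FALSE, length(x_des))
  repeat {
    pk[!certain] <- (n - sum(certain)) * x_des[!certain] / sum(x_des[!certain])
    over <- !certain & pk > 1
    if (!any(over)) break
    certain <- certain | over
  }
  pk[certain] <- 1
  pk
}

# General form: needs the N x N matrix pkl (pi_k on the diagonal)
ht_var <- function(z, pkl) {
  pk <- diag(pkl)
  sum(pkl * outer(z / pk, z / pk)) - sum(z)^2
}

# Simple random sampling without replacement: only n is needed
srs_var <- function(z, n) {
  N <- length(z)
  N^2 * (1 - n / N) / n * var(z)
}

# Poisson sampling: pi_kl = pi_k * pi_l for k != l, so one sum remains
poi_var <- function(z, pk) sum((1 / pk - 1) * z^2)

z <- c(1, 2, 3, 4)

# Full matrices, built only to check the simplified formulas
pkl_srs <- matrix(2 * 1 / (4 * 3), 4, 4)
diag(pkl_srs) <- 2 / 4
pk_poi  <- c(0.2, 0.4, 0.6, 0.8)
pkl_poi <- outer(pk_poi, pk_poi)
diag(pkl_poi) <- pk_poi

# One very large unit: it is taken with certainty after the correction
pk_pips <- pips_probs(c(1, 1, 1, 1, 100), n = 2)
```

The corrected probabilities still sum to the sample size n, which is the property the correction is meant to preserve.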
A numeric value giving the expected variance of the general regression estimator for the desired design under the working and true models.
Bueno, E. (2018). A Comparison of Stratified Simple Random Sampling and Probability Proportional-to-size Sampling. Research Report, Department of Statistics, Stockholm University 2018:6. http://gauss.stat.su.se/rr/RR2018_6.pdf.
expvar for the simultaneous calculation of the expected variance of five sampling strategies under a superpopulation model; vargreg for the variance of the GREG estimator; desvar for the simultaneous calculation of the variance of six sampling strategies; optimApp for an interactive application of expgreg.
# Three skewed auxiliary variables for a population of 5000 units
x1<- 1 + sort( rgamma(5000, shape=4/9, scale=108) )
x2<- 1 + sort( rgamma(5000, shape=4/9, scale=108) )
x3<- 1 + sort( rgamma(5000, shape=4/9, scale=108) )
x<- cbind(x1,x2,x3)
# Pareto pips sampling with inclusion probabilities based on x3
expgreg(x,b11=c(1,-1,0),b12=c(1,1,0),b21=c(0,0,1),b22=c(0,0,0.5),
d12=c(1,1,0),Rfy=0.8,n=150,design="pips",x_des=x3)
# Same design, with inclusion probabilities based on x2
expgreg(x,b11=c(1,-1,0),b12=c(1,1,0),b21=c(0,0,1),b22=c(0,0,0.5),
d12=c(1,1,0),Rfy=0.8,n=150,design="pips",x_des=x2)
# As above, with weights 1/x1 supplied through the ... argument
expgreg(x,b11=c(1,-1,0),b12=c(1,1,0),b21=c(0,0,1),b22=c(0,0,0.5),
d12=c(1,1,0),Rfy=0.8,n=150,design="pips",x_des=x2,weights=1/x1)
# Stratified simple random sampling: strata and sample sizes from optiallo
st1<- optiallo(n=150,x=x3,H=6)
expgreg(x,b11=c(1,-1,0),b12=c(1,1,0),b21=c(0,0,1),b22=c(0,0,0.5),
d12=c(1,1,0),Rfy=0.8,n=st1$nh,design="stsi",stratum=st1$stratum)
# Same strata, with working-model exponents d12 that differ from b12
expgreg(x,b11=c(1,-1,0),b12=c(1,1,0),b21=c(0,0,1),b22=c(0,0,0.5),
d12=c(1,0,1),Rfy=0.8,n=st1$nh,design="stsi",stratum=st1$stratum)
# As above, with weights 1/x1
expgreg(x,b11=c(1,-1,0),b12=c(1,1,0),b21=c(0,0,1),b22=c(0,0,0.5),
d12=c(1,0,1),Rfy=0.8,n=st1$nh,design="stsi",stratum=st1$stratum,weights=1/x1)