BW3stagePPSe  R Documentation 
Estimate components of relvariance for a sample design where primary sampling units (PSUs) are selected with probability proportional to size with replacement (ppswr) and secondary sampling units (SSUs) and elements within SSUs are selected via simple random sampling (srs). The input is a sample selected in this way.
BW3stagePPSe(dat, v, Ni, Qi, Qij, m, lonely.SSU = "mean", lonely.TSU = "mean")
dat 
data frame for sample elements with PSU and SSU identifiers, weights, and analysis variable(s). The data frame should be sorted in hierarchical order: by PSU and SSU within PSU.
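Required names for columns:
psuID = PSU identifier
ssuID = SSU identifier
w1i = PSU weight
w2ij = SSU weight (PSU weight times the SSU weight within the PSU)
w = full sample weight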
v 
Name or number of the column in dat that holds the variable to be analyzed 
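Ni 
vector of numbers of SSUs in the population in the sample PSUs 
Qi 
vector of numbers of elements in the population in the sample PSUs 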
Qij 
vector of numbers of elements in the population in the sample SSUs 
m 
number of sample PSUs 
lonely.SSU 
indicator for how singleton SSUs should be handled when computing the within PSU unit relvariance. Allowable values are "mean" and "zero". 
lonely.TSU 
indicator for how singleton third-stage units (TSUs) should be handled when computing the within SSU unit relvariance. Allowable values are "mean" and "zero". 
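The sorting requirement on dat can be checked before calling the function. The following is a minimal sketch on a made-up toy data frame; the column names psuID, ssuID, w1i, w2ij, and w are taken from the example on this page, and all data values are hypothetical.

```r
# Toy sample file; every value here is made up for illustration.
dat <- data.frame(
  psuID = c(2, 1, 2, 1, 1, 2),
  ssuID = c(21, 12, 22, 11, 12, 21),
  w1i   = c(4, 5, 4, 5, 5, 4),
  w2ij  = c(8, 15, 8, 15, 15, 8),
  w     = c(80, 150, 80, 150, 150, 80),
  y1    = c(5, 3, 8, 1, 4, 7)
)

# Confirm that the identifier and weight columns are present.
required <- c("psuID", "ssuID", "w1i", "w2ij", "w")
missing_cols <- setdiff(required, names(dat))
stopifnot(length(missing_cols) == 0)

# Sort hierarchically: by PSU, then by SSU within PSU.
dat <- dat[order(dat$psuID, dat$ssuID), ]
```

Sorting with order() on both identifiers keeps all elements of an SSU together and all SSUs of a PSU together, which is the layout the dat argument describes.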
BW3stagePPSe computes the between and within population relvariance components appropriate for a three-stage sample in which PSUs are selected with varying probabilities and with replacement. SSUs and elements within SSUs are selected by simple random sampling. The estimated components are appropriate for approximating the relvariance of the pwr-estimator of a total when the same number of SSUs are selected within each PSU, and the same number of elements are selected within each sample SSU.
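To illustrate how such components are typically used, the sketch below plugs hypothetical values into the usual textbook approximation for this design, in which each component is divided by the number of units sampled at its stage. The formula is assumed from the standard three-stage theory rather than taken from this function's source, and the names n (SSUs per sample PSU) and q (elements per sample SSU) and all numeric values are illustrative only.

```r
# Hypothetical relvariance components, standing in for the B, W2, and W3
# elements of the returned list.
B  <- 0.05   # between-PSU unit relvariance
W2 <- 0.40   # unit relvariance among SSU totals
W3 <- 2.50   # unit relvariance among elements within SSUs

m <- 30      # sample PSUs
n <- 2       # sample SSUs per PSU (assumed equal across PSUs)
q <- 50      # sample elements per SSU (assumed equal across SSUs)

# Assumed approximation: relvar = B/m + W2/(m*n) + W3/(m*n*q)
relvar <- B / m + W2 / (m * n) + W3 / (m * n * q)
cv <- sqrt(relvar)   # anticipated coefficient of variation of the total
```

Under this form the first term depends only on m, so adding PSUs reduces every term, while raising n or q reduces only the later ones.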
If a PSU contains multiple SSUs, some of which have missing data, or contains only one SSU, a value is imputed. If lonely.SSU = "mean", the mean of the nonmissing PSU contributions is imputed. If lonely.SSU = "zero", a 0 is imputed. The former would be appropriate if a PSU contains multiple SSUs but one or more of them has missing data, in which case R will normally calculate an NA. The latter would be appropriate if the PSU contains only one SSU, which would be selected with certainty in any sample. lonely.TSU has a similar purpose for third-stage units.
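The two imputation rules can be sketched in base R. The helper impute_lonely below is hypothetical and is not the package's internal code; it only illustrates what "mean" and "zero" do to contributions that come out as NA.

```r
# Hypothetical helper: replace NA contributions by the mean of the
# nonmissing contributions ("mean") or by 0 ("zero").
impute_lonely <- function(contrib, method = c("mean", "zero")) {
  method <- match.arg(method)
  fill <- if (method == "mean") mean(contrib, na.rm = TRUE) else 0
  contrib[is.na(contrib)] <- fill
  contrib
}

# var() of a single value is NA, which is how a singleton unit shows up.
contrib <- c(var(c(4, 6)), var(7), var(c(1, 5)))

filled_mean <- impute_lonely(contrib, "mean")
filled_zero <- impute_lonely(contrib, "zero")
```

Here the second contribution is NA because it is based on one value; "mean" replaces it with the average of the other two contributions and "zero" replaces it with 0.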
List with values:
Vpsu 
estimated between PSU unit variance 
Vssu 
estimated second-stage unit variance among SSU totals 
Vtsu 
estimated third-stage unit variance 
B 
estimated between PSU unit relvariance 
W 
estimated within PSU unit relvariance computed as if the sample were two-stage 
k1 
estimated ratio of 
W2 
estimated unit relvariance among SSU totals 
W3 
estimated third-stage unit relvariance among elements within PSU/SSUs 
k2 
estimated ratio of 
delta1 
homogeneity measure among elements within PSUs estimated as 
delta2 
homogeneity measure among elements within SSUs estimated as 
Richard Valliant, Jill A. Dever, Frauke Kreuter
Hansen, M.H., Hurwitz, W.N., and Madow, W.G. (1953, chap. 9, sect. 10). Sample Survey Methods and Theory, Vol. II. New York: John Wiley & Sons.
Valliant, R., Dever, J., Kreuter, F. (2018, sect. 9.4.2). Practical Tools for Designing and Weighting Survey Samples, 2nd edition. New York: Springer.
BW2stagePPS, BW2stagePPSe, BW2stageSRS, BW3stagePPS
## Not run:
# select 3-stage sample from Maryland population
set.seed(780087528)
data(MDarea.pop)
MDpop <- MDarea.pop
require(sampling)
require(plyr) # has function that allows renaming variables
# make counts of SSUs and elements per PSU
xx <- do.call("rbind",list(by(1:nrow(MDpop),MDpop$SSU,head,1)))
pop.tmp <- MDpop[xx,]
Ni <- table(pop.tmp$PSU)
Qi <- table(MDarea.pop$PSU)
Qij <- table(MDpop$SSU)
m <- 30 # no. of PSUs to select
probi <- m*Qi / sum(Qi)
# select sample of clusters
sam <- cluster(data=MDpop, clustername="PSU", size=m, method="systematic",
               pik=probi, description=TRUE)
# extract data for the sample clusters
samclus <- getdata(MDpop, sam)
samclus <- rename(samclus, c(Prob = "p1i"))
samclus <- samclus[order(samclus$PSU),]
# treat sample clusters as strata and select srswor of block groups from each
# identify psu IDs for 1st instance of each ssuID
xx <- do.call("rbind",list(by(1:nrow(samclus),samclus$SSU,head,1)))
SSUs <- cbind(PSU=samclus$PSU[xx], SSU=samclus$SSU[xx])
# select 2 SSUs per tract
n <- 2
s <- strata(data = as.data.frame(SSUs), stratanames = "PSU",
            size = rep(n,m), method="srswor")
s <- rename(s, c(Prob = "p2i"))
# extract the SSU data
# s contains selection probs of SSUs, need to get those onto data file
SSUsam <- SSUs[s$ID_unit, ]
SSUsam <- cbind(SSUsam, s[, 2:3])
# identify rows in PSU sample that correspond to sample SSUs
tmp <- samclus$SSU %in% SSUsam$SSU
SSUdat <- samclus[tmp,]
SSUdat <- merge(SSUdat, SSUsam[, c("p2i","SSU")], by="SSU")
# select srswor from each sample SSU
n.SSU <- m*n
s <- strata(data = as.data.frame(SSUdat), stratanames = "SSU",
            size = rep(50,n.SSU), method="srswor")
s <- rename(s, c(Prob = "p3i"))
samclus <- getdata(SSUdat, s)
# drop the bookkeeping columns added by strata()
del <- (1:ncol(samclus))[dimnames(samclus)[[2]] %in% c("ID_unit","Stratum")]
samclus <- samclus[, -del]
# extract pop counts for PSUs in sample
pick <- names(Qi) %in% sort(unique(samclus$PSU))
Qi.sam <- Qi[pick]
# extract pop counts of SSUs for PSUs in sample
pick <- names(Ni) %in% sort(unique(samclus$PSU))
Ni.sam <- Ni[pick]
# extract pop counts for SSUs in sample
pick <- names(Qij) %in% sort(unique(samclus$SSU))
Qij.sam <- Qij[pick]
# compute full sample weight and wts for PSUs and SSUs
wt <- 1 / samclus$p1i / samclus$p2i / samclus$p3i
w1i <- 1 / samclus$p1i
w2ij <- 1 / samclus$p1i / samclus$p2i
samdat <- data.frame(psuID = samclus$PSU, ssuID = samclus$SSU,
                     w1i = w1i, w2ij = w2ij, w = wt,
                     samclus[, c("y1","y2","y3","ins.cov", "hosp.stay")])
BW3stagePPSe(dat=samdat, v="y1", Ni=Ni.sam, Qi=Qi.sam, Qij=Qij.sam, m=m,
             lonely.SSU = "mean", lonely.TSU = "mean")
## End(Not run)