BW2stagePPSe: Estimated relvariance components for 2-stage sample


BW2stagePPSe    R Documentation



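Description

Estimate the components of relvariance for a two-stage sample design in which primary sampling units (PSUs) are selected with probability proportional to size (pps) and elements within PSUs are selected by simple random sampling (srs). The input is a sample selected in this way.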


Usage

BW2stagePPSe(Ni, ni, X, psuID, w, m, pp, lonely.SSU = "mean")



Arguments

Ni
vector of the number of elements in the population of each sample PSU; length is the number of PSUs in the sample.

ni
vector of the number of sample elements in each sample PSU; length is the number of PSUs in the sample. PSUs must be in the same order in ni and in X.

X
data vector for sample elements; length is the number of elements in the sample. These must be in PSU order. PSUs must be in the same order in ni and in X.

psuID
vector of PSU identification numbers. This vector must be as long as X. Every element in a given PSU must have the same value in psuID.

w
vector of full sample weights. This vector must be as long as X and in the same order as X.

m
number of sample PSUs

pp
vector of 1-draw probabilities for the PSUs. The length of this vector is the number of PSUs in the sample. The vector must be in the same order as Ni and ni.

lonely.SSU
indicator for how singleton SSUs should be handled when computing the within-PSU unit relvariance. Allowable values are "mean" and "zero".


Details

BW2stagePPSe computes the between- and within-PSU variance and relvariance components appropriate for a two-stage sample in which PSUs are selected with varying probabilities and with replacement. Elements within PSUs are selected by simple random sampling. The number of elements selected within each sample PSU can vary but must be at least two. The estimated components are appropriate for approximating the relvariance of the pwr-estimator of a total when the same number of elements is selected within each sample PSU. The function can also be used when PSUs are selected by simple random sampling with replacement (srswr), provided pp is defined accordingly.
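As a minimal sketch of the srswr case: under simple random sampling with replacement every PSU has the same 1-draw probability, 1/M, where M is the number of PSUs in the population. The values of M and m and the name pp_srswr below are hypothetical and chosen only for illustration.

```r
# Hypothetical frame and sample sizes (assumptions for illustration only)
M <- 95   # number of PSUs in the population
m <- 20   # number of sample PSUs

# Under srswr each PSU is drawn with probability 1/M on every draw,
# so the 1-draw probability vector for the m sample PSUs is constant.
pp_srswr <- rep(1 / M, m)
```

A vector built this way would be passed as the pp argument, in the same order as Ni and ni.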

If a PSU contains multiple SSUs, some of which have missing data, or contains only one SSU, a value is imputed for that PSU's contribution. If lonely.SSU = "mean", the mean of the non-missing PSU contributions is imputed. If lonely.SSU = "zero", a 0 is imputed. The former is appropriate if a PSU contains multiple SSUs but one or more of them has missing data, in which case R will normally return NA. The latter is appropriate if the PSU contains only one SSU, which would be selected with certainty in any sample.
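The two options can be illustrated with a small sketch. This is not the package's internal code: the data are made up, impute_lonely is a hypothetical helper, and the plain within-PSU sample variance stands in for whatever per-PSU contribution the function actually computes.

```r
# Made-up data: PSU 2 contains a single SSU
X     <- c(4, 6, 5, 9, 7, 3, 8)
psuID <- c(1, 1, 2, 3, 3, 3, 3)

# Within-PSU sample variances; var() of a single value is NA
S2 <- tapply(X, psuID, var)

# Hypothetical helper mimicking the two lonely.SSU options
impute_lonely <- function(v, lonely.SSU = c("mean", "zero")) {
  lonely.SSU <- match.arg(lonely.SSU)
  # "mean": average of the non-missing contributions; "zero": 0
  fill <- if (lonely.SSU == "mean") mean(v, na.rm = TRUE) else 0
  v[is.na(v)] <- fill
  v
}

S2_mean <- impute_lonely(S2, "mean")
S2_zero <- impute_lonely(S2, "zero")
```

With "mean", the singleton PSU borrows the average of the other PSUs' values; with "zero", it contributes no within-PSU variability.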


Value

List with values:


Vpsu
estimated between-PSU unit variance

Vssu
estimated within-PSU unit variance

B
estimated between-PSU unit relvariance

W
estimated within-PSU unit relvariance

k
estimated ratio of B+W to the estimated unit relvariance of the analysis variable

delta
intraclass correlation estimated as B/(B+W)
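A short arithmetic sketch of how the between and within relvariance components might be used at the design stage. It assumes the standard two-stage approximation in which the relvariance of the estimated total is B/m + W/(m*nbar) for m sample PSUs and nbar elements per PSU. All numbers and variable names below are hypothetical and are not output from any data set.

```r
# Hypothetical relvariance components (assumptions for illustration only)
B <- 0.02    # between-PSU unit relvariance
W <- 1.50    # within-PSU unit relvariance

m    <- 20   # number of sample PSUs
nbar <- 50   # sample elements per PSU

# Intraclass correlation, as defined in the Value list: B / (B + W)
delta <- B / (B + W)

# Approximate relvariance of the estimated total, written two ways
relvar_components <- B / m + W / (m * nbar)
relvar_delta      <- (B + W) / (m * nbar) * (1 + delta * (nbar - 1))

# Approximate coefficient of variation of the estimated total
cv_approx <- sqrt(relvar_components)
```

The two expressions are algebraically identical; the second makes the design effect of clustering, 1 + delta*(nbar - 1), explicit.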


Author(s)

Richard Valliant, Jill A. Dever, Frauke Kreuter


References

Cochran, W.G. (1977, pp. 308-310). Sampling Techniques. New York: John Wiley & Sons.

Valliant, R., Dever, J., Kreuter, F. (2018, sect. 9.4.1). Practical Tools for Designing and Weighting Survey Samples, 2nd edition. New York: Springer.

See Also

BW2stagePPS, BW2stageSRS, BW3stagePPS, BW3stagePPSe


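Examples

## Not run: 
require(PracTools) # provides BW2stagePPSe and the MDarea.pop data
require(sampling)  # provides cluster(), strata(), and getdata()
require(plyr)      # has function that allows renaming variables
data(MDarea.pop)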
Ni <- table(MDarea.pop$TRACT)
m <- 20
probi <- m*Ni / sum(Ni)
    # select sample of clusters
sam <- cluster(data=MDarea.pop, clustername="TRACT", size=m, method="systematic",
                pik=probi, description=TRUE)
    # extract data for the sample clusters
samclus <- getdata(MDarea.pop, sam)
samclus <- rename(samclus, c(Prob = "pi1"))

    # treat sample clusters as strata and select srswor from each
s <- strata(data = as.data.frame(samclus), stratanames = "TRACT",
            size = rep(50,m), method="srswor")
    # extract the observed data
samdat <- getdata(samclus,s)
samdat <- rename(samdat, c(Prob = "pi2"))

    # extract pop counts for PSUs in sample
pick <- names(Ni) %in% sort(unique(samdat$TRACT))
Ni.sam <- Ni[pick]
pp <- Ni.sam / sum(Ni)
wt <- 1/samdat$pi1/samdat$pi2

BW2stagePPSe(Ni = Ni.sam, ni = rep(50,20), X = samdat$y1,
            psuID = samdat$TRACT, w = wt,
            m = 20, pp = pp, lonely.SSU="mean")

## End(Not run)

PracTools documentation built on Aug. 17, 2022, 5:06 p.m.