BW2stagePPSe: Estimated relvariance components for 2-stage sample
In PracTools: Designing and Weighting Survey Samples

BW2stagePPSe

R Documentation

Estimated relvariance components for 2-stage sample

Description

Estimate the components of relvariance for a sample design in which primary sampling units (PSUs) are selected with probability proportional to size (pps) and elements within PSUs are selected by simple random sampling (srs). The input is a sample selected in this way.

Usage

BW2stagePPSe(Ni, ni, X, psuID, w, m, pp, lonely.SSU = "mean")

Arguments

`Ni`	vector of number of elements in the population of each sample PSU; length is the number of PSUs in the sample.
`ni`	vector of number of sample elements in each sample PSU; length is the number of PSUs in the sample. PSUs must be in the same order in `ni` and in `X`.
`X`	data vector for sample elements; length is the number of elements in the sample. Elements must be sorted by PSU, with PSUs in the same order as in `ni`.
`psuID`	vector of PSU identification numbers. This vector must be as long as `X`. Each element in a given PSU should have the same value in `psuID`.
`w`	vector of full sample weights. This vector must be as long as `X`. Vector must be in the same order as `X`.
`m`	number of sample PSUs
`pp`	vector of 1-draw probabilities for the PSUs. The length of this vector is the number of PSUs in the sample. Vector must be in the same order as `Ni` and `ni`.
`lonely.SSU`	indicator for how singleton SSUs should be handled when computing the within-PSU unit relvariance. Allowable values are `"mean"` (the default) and `"zero"`.

Details

BW2stagePPSe estimates the between and within population variance and relvariance components appropriate for a two-stage sample in which PSUs are selected with varying probabilities and with replacement, and elements within PSUs are selected by simple random sampling. The number of elements selected within each sample PSU can vary but must be at least two. The estimated components are appropriate for approximating the relvariance of the probability-with-replacement (pwr) estimator of a total when the same number of elements is selected within each sample PSU. The function can also be used when PSUs are selected by simple random sampling with replacement (srswr), provided pp is defined appropriately.
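For the srswr case mentioned above, every PSU has the same 1-draw probability, 1/M, where M is the number of PSUs in the population. A minimal sketch of that definition of pp follows; the values of M and m are hypothetical and chosen only for illustration.

```r
# Hypothetical design: M PSUs in the population, m of them drawn by srswr
M <- 95
m <- 20

# Under srswr each PSU has the same 1-draw probability, 1/M.
# pp needs one entry per sample PSU, in the same order as Ni and ni.
pp <- rep(1 / M, m)
```

The resulting vector would be passed as the pp argument, with the other arguments prepared as for a pps sample.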

If a PSU contains multiple SSUs, some of which have missing data, or contains only one SSU, a value is imputed for that PSU's contribution. If lonely.SSU = "mean", the mean of the non-missing PSU contributions is imputed; if lonely.SSU = "zero", a 0 is imputed. The former is appropriate when a PSU contains multiple SSUs but one or more of them have missing data, in which case R will normally return an NA. The latter is appropriate when the PSU contains only one SSU, which would be selected with certainty in any sample.
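The NA mentioned above arises because var() in R returns NA for a single value. The sketch below mimics the two imputation rules on per-PSU sample variances. The data are hypothetical, and the code illustrates the idea only; it is not the internal computation of BW2stagePPSe, whose PSU contributions involve more than the raw variances.

```r
# Hypothetical sample values; PSU "c" contains a single SSU
x   <- c(4, 6, 5, 9, 7)
psu <- c("a", "a", "b", "b", "c")

# Per-PSU sample variances; var() of a length-1 vector is NA
S2 <- tapply(x, psu, var)

# "mean" rule: replace NA with the mean of the non-missing values
S2.mean <- ifelse(is.na(S2), mean(S2, na.rm = TRUE), S2)

# "zero" rule: replace NA with 0
S2.zero <- ifelse(is.na(S2), 0, S2)
```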

If any PSU has a one-draw probability of 1 (pp=1), the function halts. Remove any such PSUs before calling the function.
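One way to drop such PSUs is to filter the PSU-level and element-level inputs together, as in the sketch below. All values are hypothetical placeholders, the names ending in .keep are introduced here for illustration, and reducing m to the number of retained PSUs is an assumption rather than something this page specifies.

```r
# Hypothetical inputs for three sample PSUs; the second has pp = 1
pp    <- c(0.20, 1.00, 0.05)
Ni    <- c(100, 400, 30)
ni    <- c(2, 3, 2)
psuID <- rep(1:3, times = ni)
X     <- c(10, 12, 7, 9, 8, 3, 5)
w     <- c(50, 50, 1, 1, 1, 150, 150)   # placeholder weights

# PSU-level and element-level keep flags
keep.psu <- pp < 1
keep.el  <- rep(keep.psu, times = ni)   # relies on elements being in PSU order

# Filtered inputs (hypothetical names)
pp.keep    <- pp[keep.psu]
Ni.keep    <- Ni[keep.psu]
ni.keep    <- ni[keep.psu]
X.keep     <- X[keep.el]
psuID.keep <- psuID[keep.el]
w.keep     <- w[keep.el]
m.keep     <- sum(keep.psu)             # assumed: m counts the retained PSUs
```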

Value

List with values:

`Vpsu`	estimated between PSU unit variance
`Vssu`	estimated within PSU unit variance
`B`	estimated between PSU unit relvariance
`W`	estimated within PSU unit relvariance
`k`	estimated ratio of `B+W` to estimated unit relvariance of the analysis variable
`delta`	intraclass correlation estimated as `B/(B+W)`
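As a worked illustration of how the returned components relate, the sketch below recomputes `delta` and `k` from `B` and `W` and plugs them into the usual two-stage approximation for the relvariance of the pwr-estimator with m sample PSUs and nbar elements per PSU, B/m + W/(m*nbar). The component values and the unit relvariance V are hypothetical, not output from the function, and the approximation is stated as an assumption about how the components are typically used (see Valliant, Dever, Kreuter 2018).

```r
# Hypothetical component values, as might be returned in B and W
B <- 0.02    # between-PSU unit relvariance
W <- 1.50    # within-PSU unit relvariance
V <- 1.60    # hypothetical unit relvariance of the analysis variable

# Derived quantities as defined in the Value section
delta <- B / (B + W)
k     <- (B + W) / V

# Assumed design: m sample PSUs, nbar sample elements per PSU
m    <- 20
nbar <- 50

# Two algebraically equivalent forms of the approximate relvariance
relvar.BW    <- B / m + W / (m * nbar)
relvar.delta <- V / (m * nbar) * k * (1 + delta * (nbar - 1))
```

The two forms agree because k times V equals B+W, so the second expression is a rearrangement of the first.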

Author(s)

Richard Valliant, Jill A. Dever, Frauke Kreuter

References

Cochran, W.G. (1977, pp.308-310). Sampling Techniques. New York: John Wiley & Sons.

Valliant, R., Dever, J., Kreuter, F. (2018, sect. 9.4.1). Practical Tools for Designing and Weighting Survey Samples, 2nd edition. New York: Springer.

Examples


require(PracTools) # provides MDarea.popA and BW2stagePPSe
require(sampling)  # cluster, strata, getdata
require(plyr)      # has function that allows renaming variables
data(MDarea.popA)
Ni <- table(MDarea.popA$TRACT)
m <- 20
probi <- m*Ni / sum(Ni)
# select sample of clusters
sam <- cluster(data=MDarea.popA, clustername="TRACT", size=m, method="systematic",
                pik=probi, description=TRUE)
# extract data for the sample clusters
samclus <- getdata(MDarea.popA, sam)
samclus <- rename(samclus, c("Prob" = "pi1"))

# treat sample clusters as strata and select srswor from each
s <- strata(data = as.data.frame(samclus), stratanames = "TRACT",
            size = rep(50,m), method="srswor")
# extract the observed data
samdat <- getdata(samclus,s)
samdat <- rename(samdat, c("Prob" = "pi2"))

# extract pop counts for PSUs in sample
pick <- names(Ni) %in% sort(unique(samdat$TRACT))
Ni.sam <- Ni[pick]
# 1-draw probabilities and full sample weights
pp <- Ni.sam / sum(Ni)
wt <- 1/samdat$pi1/samdat$pi2

BW2stagePPSe(Ni = Ni.sam, ni = rep(50,m), X = samdat$y1,
            psuID = samdat$TRACT, w = wt,
            m = m, pp = pp, lonely.SSU="mean")

PracTools documentation built on June 8, 2025, 10:12 a.m.