UWE | R Documentation |
Computes the Unequal Weighting Effect for the current and initial weights of a design object.
UWE(design, by = NULL)
design |
Object of class |
by |
Formula specifying variables that define "estimation domains". If |
Function UWE computes the Unequal Weighting Effect for the current (w) and initial (w0) weights of a design object, plus the corresponding variance inflation (or deflation) factor (UWE(w) / UWE(w0)) induced by changing the weights from w0 to w (w0 -> w).
Following Kish's definition [Kish 92], the UWE is calculated as 1 plus the relative sample variance of the weights: UWE(w) = 1 + RelVar(w).
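The formula above can be sketched in base R as follows. This is only an illustration, not the ReGenesees implementation: the helper name kish_uwe is hypothetical, and the sketch assumes that the relative variance is computed with denominator n (the form usually attributed to Kish), which this page does not state explicitly.

```r
# Kish's UWE for a numeric vector of weights: 1 + relative variance.
# ASSUMPTION: the variance uses denominator n, so that
#   1 + mean((w - mean(w))^2) / mean(w)^2  ==  n * sum(w^2) / sum(w)^2
kish_uwe <- function(w) {
  stopifnot(is.numeric(w), length(w) > 0, all(w > 0))
  length(w) * sum(w^2) / sum(w)^2
}

w_equal   <- rep(10, 5)            # equal weights: no weighting effect
w_unequal <- c(5, 5, 10, 10, 20)   # unequal weights

kish_uwe(w_equal)       # 1
kish_uwe(w_unequal)     # 1.3
kish_uwe(3 * w_unequal) # 1.3 again: rescaling the weights changes nothing
```

Under this assumption the UWE equals 1 for equal weights, grows as the weights become more dispersed, and does not depend on the overall scale of the weights.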
The current weights, w, of design are the weights that would be returned by weights(design) and would be used for estimation purposes by functions svystatTM, svystatR, etc.
The initial weights, w0, of design depend on the nature of object design:
If design is the outcome of a ‘weight-changing pipeline’, w0 -> w1 -> ... -> w, i.e. it was obtained by the application of an arbitrary chain of ReGenesees functions that modify the weights (e.g. smooth.strat.jump, e.calibrate, ext.calibrated, trimcal, ...), then the initial weights, w0, are the weights of the starting design object in the pipeline.
If design is an initial design object generated by function e.svydesign, then the initial weights, w0, are taken as equal to the current weights, w0 = w.
Note that, when design is the outcome of a ‘weight-changing pipeline’, function UWE provides a measure of the overall, cumulative impact of all the adjustments the weights underwent throughout the pipeline.
To assess the effect, in terms of UWE and variance inflation, of just a single processing step of the pipeline, you can call function UWE on the input and output designs of that step and compare the results (basically, by taking suitable ratios).
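The "suitable ratios" can be illustrated on plain weight vectors. The sketch below is an assumption-laden illustration: the weights are made up, kish_uwe is a hypothetical helper using the n-denominator form of the relative variance, and no ReGenesees function is called. With real design objects, the corresponding inputs would be the UWE.curr values returned by UWE for the input and output designs of a step.

```r
# Hypothetical helper: Kish's UWE, assuming the n-denominator variance
kish_uwe <- function(w) length(w) * sum(w^2) / sum(w)^2

# Made-up weights at three points of a pipeline w0 -> w1 -> w2
w0 <- c(10, 10, 10, 10)
w1 <- c(8, 12, 8, 12)
w2 <- c(6, 14, 8, 12)

# Variance inflation factor attributable to each single step
step1 <- kish_uwe(w1) / kish_uwe(w0)
step2 <- kish_uwe(w2) / kish_uwe(w1)

# Cumulative factor over the whole pipeline
overall <- kish_uwe(w2) / kish_uwe(w0)

c(step1 = step1, step2 = step2, overall = overall)
all.equal(step1 * step2, overall)  # TRUE: step factors multiply up
```

Because each step factor is a ratio of consecutive UWE values, their product telescopes to the cumulative factor of the whole pipeline.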
A data.frame, with one single row (if by = NULL) or one row for each domain (if by is passed), and the following columns:
Column      Meaning
UWE.curr    Current Unequal Weighting Effect
UWE.ini     Initial Unequal Weighting Effect
VAR.infl    Variance Inflation Factor (UWE.curr / UWE.ini)
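To make the shape of the return value concrete, here is a base R mock-up that builds a table with the same three columns from two plain weight vectors. It is a sketch under stated assumptions, not the package's code: uwe_table and kish_uwe are hypothetical helpers, the domain column is added only for readability, by is a plain grouping vector rather than a formula, and the UWE is assumed to be computed separately within each domain using the n-denominator form.

```r
# Hypothetical helper: Kish's UWE, assuming the n-denominator variance
kish_uwe <- function(w) length(w) * sum(w^2) / sum(w)^2

# Hypothetical mock-up of the output layout (NOT the ReGenesees function)
uwe_table <- function(w, w0, by = NULL) {
  stopifnot(length(w) == length(w0))
  if (is.null(by)) by <- rep("all", length(w))
  idx  <- split(seq_along(w), by)   # unit indices for each domain
  curr <- vapply(idx, function(i) kish_uwe(w[i]),  numeric(1))
  ini  <- vapply(idx, function(i) kish_uwe(w0[i]), numeric(1))
  data.frame(domain = names(idx), UWE.curr = curr, UWE.ini = ini,
             VAR.infl = curr / ini, row.names = NULL)
}

# Made-up initial and current weights, with a two-level domain variable
w0     <- c(10, 10, 10, 10, 20, 20)
w      <- c(8, 12, 8, 12, 15, 25)
region <- c("N", "N", "N", "N", "S", "S")

uwe_table(w, w0)               # one row: whole sample
uwe_table(w, w0, by = region)  # one row per domain
```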
Kish's UWE is a model-based tool that can be useful for diagnostic purposes. However, its values must be interpreted with some caution, just as is the case for model-based estimates of Kish's Deff.
In particular, UWE is, by construction, only sensitive to variations of the sample variance of the weights. Therefore, it cannot single out weight adjustments that, despite adding variability to the weights at sample level, might reduce the sampling variance of some estimators. This is often the case with calibration, which may well make survey weights more unequal, but nonetheless cause their reciprocals to become more correlated with some variables of interest. Similar considerations hold for stratified sampling, to the extent that, with respect to the variables of interest, units tend to be more similar within strata than between strata.
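The point that unequal weights do not necessarily mean a less efficient estimator can be checked on a toy case. The sketch below deliberately swaps in a simpler setting than calibration: with-replacement sampling with probabilities proportional to a size measure x, compared with equal-probability sampling. The population, the sample size, and the helper names kish_uwe and var_total are all hypothetical, and the UWE uses the n-denominator form as an assumption.

```r
# Hypothetical helper: Kish's UWE, assuming the n-denominator variance
kish_uwe <- function(w) length(w) * sum(w^2) / sum(w)^2

# Made-up population: y is almost proportional to the size measure x
N <- 200
x <- 1:N
y <- 3 * x + rep(c(-1, 1), N / 2)
n <- 30

# Exact variance of the with-replacement estimator mean(y_i / p_i) of the
# total of y, given the single-draw selection probabilities p
var_total <- function(p) sum(p * (y / p - sum(y))^2) / n

p_srs <- rep(1 / N, N)   # equal selection probabilities
p_pps <- x / sum(x)      # probabilities proportional to size

# Weights of one sample under each design: w_i = 1 / (n * p_i)
set.seed(1)
s_pps <- sample(N, n, replace = TRUE, prob = p_pps)
w_srs <- rep(1 / (n * p_srs[1]), n)
w_pps <- 1 / (n * p_pps[s_pps])

c(UWE.srs = kish_uwe(w_srs), UWE.pps = kish_uwe(w_pps))
c(var.srs = var_total(p_srs), var.pps = var_total(p_pps))
```

In this toy population the unequal weights yield a UWE above 1, yet the estimator of the total has a much smaller variance than under equal weights, because the reciprocals of the weights are strongly correlated with y.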
In any case, the UWE can turn out handy when comparing the potential outcomes of performing the same kind of weight adjustment under slightly different settings (e.g. calibration with different bounds or distance functions, trimming with different thresholds, etc.).
Diego Zardetto
Kish, L. (1992). Weighting for unequal Pi. Journal of Official Statistics, 8, 183-200.
ReGenesees functions which define survey weights (e.svydesign), or modify survey weights (e.g. smooth.strat.jump, e.calibrate, ext.calibrated, trimcal, ...).
###############################################
# Compute the UWE along the following example #
# of weight-changing pipeline: #
# 1) Smooth for stratum jumpers #
# 2) Adjust for nonresponse #
# 3) Calibrate to known population totals #
# 4) Consistently trim calibration weights #
# #
# NOTE: To perform 1) and 2) I will first #
# A) simulate some stratum jumpers. #
# B) simulate some nonresponse. #
###############################################
## Load sbs data:
data(sbs)
## -- A) Simulate stratum jumpers
# Create the strata variable observed at survey time by cloning the
# strata variable at sampling time
sbs$curr.strata <- sbs$strata
# Now inject some (say ~250) random stratum jumpers
set.seed(12345) # (fix the RNG seed for reproducibility)
sbs$curr.strata[sample(1:nrow(sbs), 250)] <- sbs$curr.strata[sample(1:nrow(sbs), 250)]
# Resulting number of stratum jumpers:
tt <- table(sbs$strata, sbs$curr.strata)
sum(tt[row(tt) != col(tt)])
## -- B) Simulate nonresponse
# Assume a response propensity that increases with enterprise size (as
# measured by number of employees)
levels(sbs$emp.cl)
p.resp <- c(.4, .6, .8, .95, .99)
# Tie response probabilities to sample observations:
pr <- p.resp[unclass(sbs$emp.cl)]
# Now, randomly select a subsample of responding units from sbs:
set.seed(12345) # (fix the RNG seed for reproducibility)
rand <- runif(nrow(sbs))
sbs.r <- sbs[rand < pr, ]
# This implies an overall response rate of about 73%:
nrow(sbs.r) / nrow(sbs)
## -- 0) Create the respondent design object
# NOTE: I'll keep using the original fpc column for the sake of the examples,
# but it should be recomputed in real applications...
sbsdes <- e.svydesign(data = sbs.r, ids = ~id, strata = ~strata,
                      weights = ~weight, fpc = ~fpc)
## -- 1) Smooth for stratum jumpers
# Use method 'MinChange'
sbssmooth <- smooth.strat.jump(sbsdes, ~curr.strata)
# Have a look
sbssmooth
## -- 2) Adjust for nonresponse
# Use a simple Response Homogeneity Model approach, with size classes
# as RHGs. Perform the RHG weight adjustment via calibration
# Compute enterprise counts by size classes from the frame
N.RHG <- pop.template(sbssmooth, calmodel= ~emp.cl - 1)
N.RHG <- fill.template(sbs.frame, N.RHG)
# Calibrate to achieve the RHG adjustment
sbsRHG <- e.calibrate(sbssmooth, N.RHG)
# Have a look
sbsRHG
## -- 3) Calibrate to known population totals
# Now calibrate again in order to reduce estimator variance, by using further
# available auxiliary information, e.g. the total number of employees (emp.num)
# and enterprises (ent) inside the domains obtained by crossing nace.macro
# and region:
pop <- pop.template(sbsRHG, calmodel = ~emp.num + ent - 1,
                    partition = ~nace.macro:region)
pop <- fill.template(sbs.frame, pop)
# Calibrate to improve estimation efficiency
sbscal <- e.calibrate(sbsRHG, pop)
# Have a look
sbscal
## -- 4) Consistently trim calibration weights
# Say one wants to avoid weights that are less than 1 or above 50:
sbstrim <- trimcal(sbscal, c(1, 50))
# Have a look
sbstrim
## -- UWE calculation along the weight-changing pipeline
# Object sbstrim is the output of the weight-changing pipeline, as
# one easily recognizes when printing it:
sbstrim
# UWE of initial object
UWE(sbsdes)
# UWE at step 1), i.e. smoothing for stratum jumpers
UWE(sbssmooth)
# UWE at step 2), i.e. nonresponse RHG adjustment
UWE(sbsRHG)
# UWE at step 3), i.e. calibration for efficiency improvement
UWE(sbscal)
# UWE at step 4), i.e. consistent trimming of calibration weights
UWE(sbstrim)
# End