Adjust sampling weights to given totals based on household-level and/or individual-level constraints.
dat 
a 
hid 
name of the column containing the household IDs
within dat
conP 
list or (partly) named list defining the constraints on person
level. The list elements are contingency tables in array representation
with dimnames corresponding to the names of the relevant calibration
variables in dat.
conH 
list or (partly) named list defining the constraints on
household level. The list elements are contingency tables in array
representation with dimnames corresponding to the names of the relevant
calibration variables in dat.
epsP 
numeric value or list (of numeric values and/or arrays)
specifying the convergence limit(s) for conP.
epsH 
numeric value or list (of numeric values and/or arrays)
specifying the convergence limit(s) for conH.
verbose 
if TRUE, some progress information will be printed. 
w 
name of the column containing the base
weights within dat
bound 
numeric value specifying the multiplier for determining the
weight trimming boundary if the change of the base weights should be
restricted, i.e. if the weights should stay between 1/ 
maxIter 
numeric value specifying the maximum number of iterations that should be performed. 
meanHH 
if TRUE, every person in a household is assigned the mean of
the person weights corresponding to the household. If 
allPthenH 
if TRUE, all the person level calibration steps are performed before the household level calibration steps (and 
returnNA 
if TRUE, the calibrated weight will be set to NA in case of no convergence. 
looseH 
if FALSE, the actual constraints 
numericalWeighting 
If NULL, computeLinear from the package surveysd will be used. 
check_hh_vars 
If 
conversion_messages 
show a message, if inputs need to be reformatted. This can be useful for speed optimizations if ipu2 is called several times with similar inputs (for example bootstrapping) 
This function implements the weighting procedure described here.
conP and conH are contingency tables, which can be created with xtabs. The dimnames of those tables should match the names and levels of the corresponding columns in dat.
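As a minimal sketch of how such a table can be built, the following uses a small made-up data frame (the columns household, gender, region and weight and their values are hypothetical, not taken from the package) and checks that the dimnames of the resulting table line up with the column names and factor levels of the data.

```r
# Toy data: all column names and values are hypothetical
toy <- data.frame(
  household = factor(c(1, 1, 2, 3, 3, 3)),
  gender    = factor(c("male", "female", "female", "male", "female", "male")),
  region    = factor(c("A", "A", "B", "B", "B", "B")),
  weight    = c(10, 10, 20, 15, 15, 15)
)

# Person-level constraint: weighted totals by gender and region
con_p <- xtabs(weight ~ gender + region, data = toy)

# The names of the dimnames are the column names used in the formula ...
names(dimnames(con_p))

# ... and each dimnames entry holds the factor levels of that column
identical(dimnames(con_p)$gender, levels(toy$gender))
identical(dimnames(con_p)$region, levels(toy$region))
```

Building the constraint with xtabs from factor columns of the same table is one way to get matching names and levels automatically; tables from an external source would need their dimnames checked by hand.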
maxIter, epsP and epsH are the stopping criteria. epsP and epsH describe relative tolerances in the sense that
$$1 - epsP < \frac{w_{i+1}}{w_i} < 1 + epsP$$
will be used as the convergence criterion. Here i is the iteration step and w_i is the weight of a specific person at step i.
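The inequality can be illustrated with a few lines of R. This is only a sketch of the criterion as stated in the text, not the package's internal code; the helper within_tolerance and the example weights are hypothetical.

```r
# Hypothetical helper: TRUE if every weight ratio w_{i+1} / w_i lies
# strictly between 1 - eps and 1 + eps
within_tolerance <- function(w_old, w_new, eps) {
  ratio <- w_new / w_old
  all(ratio > 1 - eps & ratio < 1 + eps)
}

# Made-up weights of three persons at two consecutive iteration steps
w_i  <- c(100, 250, 80)
w_i1 <- c(100.05, 249.9, 80.01)

loose  <- within_tolerance(w_i, w_i1, eps = 1e-03)  # ratios differ from 1 by at most 5e-04
strict <- within_tolerance(w_i, w_i1, eps = 1e-04)  # first ratio is 1.0005, outside the band
```

A smaller eps therefore demands that the weights barely change between steps, which typically costs more iterations before maxIter is reached.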
The algorithm performs best if all variables occurring in the constraints (conP and conH) as well as the household variable are coded as factor columns in dat. Otherwise, conversions will be necessary, which can be monitored with the conversion_messages argument.
Setting check_hh_vars to FALSE can also increase the performance of the scheme.
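One way to prepare the data up front is to convert the relevant columns to factors once, before calling the function repeatedly. The sketch below uses base R on a made-up data frame; the column names and the vector constraint_vars are hypothetical.

```r
# Toy data with non-factor columns (hypothetical names and values)
dat <- data.frame(
  household = c(101, 101, 102),
  gender    = c("male", "female", "female"),
  state     = c("Vienna", "Vienna", "Tyrol"),
  weight    = c(12.5, 12.5, 30),
  stringsAsFactors = FALSE
)

# Columns that appear in the constraints, plus the household identifier
constraint_vars <- c("household", "gender", "state")

# Convert them to factors in one step; other columns are left untouched
dat[constraint_vars] <- lapply(dat[constraint_vars], as.factor)

# Check the result
sapply(dat, is.factor)
```

Doing the conversion once on the input table avoids repeating it on every call, for example inside a bootstrap loop.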
The function will return the input data dat with the calibrated weights calibWeight as an additional column as well as attributes. If no convergence has been reached in maxIter steps, and returnNA is TRUE (the default), the column calibWeight will only consist of NAs. The attributes of the table are attributes derived from the data.table class as well as the following.
converged  Did the algorithm converge in maxIter steps? 
iterations  The number of iterations performed. 
conP , conH , epsP , epsH  See Arguments. 
conP_adj , conH_adj  Adjusted versions of conP and conH 
formP , formH  Formulas that were used to calculate conP_adj and conH_adj based on the output table.
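Because a non-converged run can return a column of NAs, it may be worth checking the attributes before using the weights. The sketch below does not call the function itself: res is a hand-built stand-in that only mimics the converged and iterations attributes and the calibWeight column named above.

```r
# Stand-in for a returned table (hypothetical values, built by hand)
res <- structure(
  data.frame(calibWeight = c(NA_real_, NA_real_)),
  converged  = FALSE,
  iterations = 200L
)

# Report a failed calibration instead of silently using NA weights
if (!isTRUE(attr(res, "converged"))) {
  message("No convergence after ", attr(res, "iterations"), " iterations")
}

# Only treat the weights as usable if the run converged and no NA is present
usable <- isTRUE(attr(res, "converged")) && !anyNA(res$calibWeight)
```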

Alexander Kowarik, Gregor de Cillia
library(data.table)  # provides setDT(), setnames() and the := operator
data(eusilcS)
setDT(eusilcS)
eusilcS <- eusilcS[, list(db030,hsize,db040,age,rb090,netIncome,db090,rb050)]
## rename columns
setnames(eusilcS, "rb090", "gender")
setnames(eusilcS, "db040", "state")
setnames(eusilcS, "db030", "household")
setnames(eusilcS, "rb050", "weight")
## some recoding
# generate age groups
eusilcS[, agegroup := cut(age, c(-Inf, 10*1:9, Inf), right = FALSE)]
# some recoding of netIncome for reasons of simplicity
eusilcS[is.na(netIncome), netIncome := 0]
eusilcS[netIncome < 0, netIncome := 0]
# set hsize to 1,...,5+
eusilcS[, hsize := cut(hsize, c(0:4, Inf), labels = c(1:4, "5+"))]
# treat households as a factor variable
eusilcS[, household := as.factor(household)]
## example for base weights assuming a simple random sample of households stratified per region
eusilcS[, regSamp := .N, by = state]
eusilcS[, regPop := sum(weight), by = state]
eusilcS[, baseWeight := regPop/regSamp]
## constraints on person level
# age
conP1 <- xtabs(weight ~ agegroup, data = eusilcS)
# gender by region
conP2 <- xtabs(weight ~ gender + state, data = eusilcS)
# personal net income by gender
conP3 <- xtabs(weight*netIncome ~ gender, data = eusilcS)
## constraints on household level
conH1 <- xtabs(weight ~ hsize + state, data = eusilcS, subset = !duplicated(household))
# array of convergence limits for conH1
epsH1 <- conH1
epsH1[1:4,] <- 0.005
epsH1["5+",] <- 0.2
# without array epsH1
calibweights1 <- ipu2(eusilcS, hid = "household",
                      conP = list(conP1, conP2, netIncome = conP3),
                      conH = list(conH1),
                      epsP = list(1e-06, 1e-06, 1e-03),
                      epsH = 0.01,
                      bound = NULL, verbose = TRUE, maxIter = 200)
# with array epsH1, base weights and bound
calibweights2 <- ipu2(eusilcS, hid = "household",
                      conP = list(conP1, conP2),
                      conH = list(conH1),
                      epsP = 1e-06,
                      epsH = list(epsH1),
                      w = "baseWeight",
                      bound = 4, verbose = TRUE, maxIter = 200)
# show an adjusted version of conP and the original
attr(calibweights2, "conP_adj")
attr(calibweights2, "conP")
