View source: R/get.residuals.R
get.residuals | R Documentation |
Computes (scaled) residuals of a set of interest variables w.r.t. the calibration model adopted to build a calibrated object.
get.residuals(cal.design, y, scale = c("no", "w", "d", "g"))
cal.design |
Object of class |
y |
Formula defining the variables of interest. |
scale |
|
This function has been designed mainly for programmers or advanced users willing to build upon ReGenesees: typical users are not expected to feel much need of it.
The residuals of an interest variable w.r.t. the linear model defined by the auxiliary variables used for calibration play a central role in estimating the variance of Calibration Estimators (see, e.g., [Deville, 1999]). Notice that if object cal.design
has been generated by running a partitioned calibration task (see e.calibrate
), the residuals will be correctly computed using the different estimated regression coefficients pertaining to the different domains belonging to the partition.
The mandatory argument y
behaves exactly the same way as it does in function svystatTM
.
The scale
argument allows to scale the computed residuals by multiplying them by different factors. By default scale="no"
, that is unscaled residuals are returned. Value "w"
returns the residuals times the calibrated weights; value "d"
returns the residuals times the initial weights; finally, value "g"
returns the residuals times the g-weights (i.e. the ratios between calibrated and initial weights). Notice that the semantics of argument scale
are slightly modified when the input object cal.design
has been obtained by a multi-step calibration procedure (see Section 'Note' below).
A matrix of residuals.
If cal.design
has undergone k
subsequent calibration steps (with k >= 1
), the function will return the residuals computed w.r.t. the linear assisting model underlying the last (i.e. k
-th) calibration step.
If k >= 2
, the scale parameter will be interpreted as follows:
SCALE MEANING "no"........no scale; "w"........last calibration weight (i.e. at step k); "d"........second to last calibration weight (i.e. at step k - 1); "g"........ratio between last and second to last calibration weights.
Diego Zardetto
Deville, J. C. (1999) “Variance Estimation for Complex Statistics and Estimators: Linearization and Residual Techniques.”. Survey Methodology 25: 193-203.
weights
to extract the weights from a design object, e.calibrate
for calibrating weights and g.range
to get the range of the g-weights.
##################################################################
# Just some checks on the consistency of the numerical results #
# obtained by ReGenesees with well known theoretical properties. #
##################################################################
# Load sbs data:
data(sbs)
# Create a design object to be calibrated:
sbsdes<-e.svydesign(data=sbs,ids=~id,strata=~strata,weights=~weight,fpc=~fpc)
################
## Property 1 ##
################
## If the calibration model has (implicitly or explicitly) an intercept
## the weighted sum of residuals must be zero.
# Suppose you calibrate on enterprise counts inside areas, i.e. a calibration
# model WITH intercept (though implicitly):
# calmodel= ~ area - 1
# First, build and fill the known totals template:
pop<-pop.template(sbsdes, calmodel= ~ area - 1)
pop<-fill.template(pop, universe=sbs.frame)
# Then, calibrate:
sbscal<-e.calibrate(sbsdes, pop)
# Now, get the residuals of any variable (e.g. y and emp.num) scaled with the
# direct weights:
de <- get.residuals(sbscal, ~ y + emp.num, scale="d")
# Lastly, compute the column sums...
colSums(de)
#...which is (numerically) zero, as it must be.
################
## Property 2 ##
################
## If the calibration model does not have (implicitly or explicitly) an
## intercept term the weighted sum of residuals is generally different from zero.
# Suppose you calibrate on employees counts inside areas, i.e. a calibration
# model WITHOUT intercept:
# calmodel= ~ emp.num:area - 1
# First, build and fill the known totals template:
pop<-pop.template(sbsdes, calmodel= ~ emp.num:area - 1)
pop<-fill.template(pop, universe=sbs.frame)
# Then, calibrate:
sbscal<-e.calibrate(sbsdes, pop)
# Now, get the residuals of any variable (e.g. y and region) scaled with the
# direct weights:
de <- get.residuals(sbscal, ~ y + region, scale="d")
# Lastly, compute the column sums...
colSums(de)
#...which is far from zero, as expected.
################
## Property 3 ##
################
## In the Taylor linearization approach, estimating the variance of a
## Calibration Estimator amounts to estimating the variance of the HT total
## of its linearized variable (i.e. the g-scaled residual), under the sampling
## design at hand.
# Suppose you calibrate on the total number of employees and enterprises
# inside the domains obtained by:
# i) crossing nace.macro and region;
# ii) crossing emp.cl and region;
# First, build and fill the population totals template:
pop<-pop.template(sbsdes,
calmodel=~(emp.num+ent):(nace.macro+emp.cl)-1,
partition=~region)
pop<-fill.template(universe=sbs.frame,template=pop)
# Then, calibrate:
sbscal<-e.calibrate(sbsdes,pop)
# Now, compute the linearized variable of the Calibration Estimator of the
# total of any variable (e.g. va.imp2):
z_va.imp2 <- get.residuals(sbscal, ~ va.imp2, scale="g")
# Now, treat z_va.imp2 as an ordinary variable and compute the standard error
# of its HT total:
sbsdes<-des.addvars(sbsdes, z_va.imp2 = z_va.imp2)
SE(svystatTM(sbsdes, ~z_va.imp2))
# Lastly, compute directly the standard error of the Calibration Estimator...
SE(svystatTM(sbscal, ~va.imp2))
#...and they are identical, as it must be.
# Obviously the same result would hold for domain estimates (e.g. the total
# of va.imp2 for the "Agriculture" nace.macro).
# Compute the linearized variable of the Calibration Estimator of the domain
# total:
z_va.imp2.Agr <- get.residuals(sbscal, ~ I(va.imp2*(nace.macro=="Agriculture")),
scale="g")
# Now, treat z_va.imp2.Agr as an ordinary variable and compute the standard
# error of its HT total:
sbsdes<-des.addvars(sbsdes, z_va.imp2.Agr = z_va.imp2.Agr)
SE(svystatTM(sbsdes, ~z_va.imp2.Agr))
# Lastly, compute directly the standard error of the Calibration Estimator
# by domain...
SE(svystatTM(sbscal, ~va.imp2, ~nace.macro))
#...and the "Agriculture" SEs are identical, as it must be.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.