svystat | R Documentation |
Computes many estimates and errors (e.g. for disparate estimation domains) in a single call, primarily for use in fitting GVF models. Handles estimators of all kinds.
svystat(design, kind = c("TM", "R", "S", "SR", "B", "Q", "L", "Sigma", "Sigma2"),
by = NULL, group = NULL, forGVF = TRUE,
combo = -1, ...)
## S3 method for class 'gvf.input.gr'
plot(x, ...)
## S3 method for class 'svystat.gr'
coef(object, ...)
## S3 method for class 'svystat.gr'
SE(object, ...)
## S3 method for class 'svystat.gr'
VAR(object, ...)
## S3 method for class 'svystat.gr'
cv(object, ...)
## S3 method for class 'svystat.gr'
deff(object, ...)
## S3 method for class 'svystat.gr'
confint(object, ...)
design |
Object of class |
kind |
|
by |
Formula specifying the variables that define the "estimation domains". If |
group |
Formula specifying a partition of the population into "groups": the output will be returned separately for each group.
If |
forGVF |
Select |
combo |
An |
... |
For function |
x |
The object of class |
object |
An object of class |
This function can compute all the summary statistics provided by ReGenesees, and is principally meant to return many of them in a single call.
If forGVF = TRUE, the output will be ready to feed the ReGenesees GVF fitting infrastructure; otherwise it will consist simply of a set of summary statistic objects.
Use argument kind to specify the summary statistic you need. The default value 'TM' selects function svystatTM, which yields Totals and Means. All the arguments needed by the summary statistic function implied by kind (e.g. argument y for svystatTM when kind = 'TM') will be passed on through argument ‘...’.
As usual in summary statistics, argument by can be used to request domain estimates.
The group formula (if any) specifies a way of partitioning the population into groups: the output will be reported separately for each group. In the GVF context, a “grouped” output makes it possible to fit separate GVF models inside different groups (and hence to compute separate variance predictions for different groups).
Note that group and by share identical syntax and semantics as model formulae, although they serve different purposes in function svystat (as explained above).
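As a minimal illustration of that shared formula syntax, the base R sketch below inspects two one-sided formulae of the kind accepted by by and group. It only shows how R itself parses such formulae; how svystat processes them internally is not shown here, and the object names by_formula and group_formula are hypothetical.

```r
# One-sided formulae with the same syntax, as they could be passed to
# 'by' and 'group' (object names are hypothetical, for illustration only)
by_formula    <- ~ age5c:sex
group_formula <- ~ marstat:regcod

# Variables referenced by each formula
by_vars    <- all.vars(by_formula)
group_vars <- all.vars(group_formula)

# Term labels as parsed by R: a single two-way interaction in each case
by_terms    <- attr(terms(by_formula), "term.labels")
group_terms <- attr(terms(group_formula), "term.labels")

print(by_vars)
print(by_terms)
print(group_vars)
print(group_terms)
```

The two formulae are structurally identical; only the argument they are passed to determines whether they define estimation domains or population groups.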
Parameter combo is only meaningful if by is passed. Its purpose is to allow computing estimates and errors simultaneously for many estimation domains.
If the by formula involves n variables, specifying combo = m requests outputs for all the domains determined by all the interactions of by variables up to order m (with -1 <= m <= n), as follows:
COMBO     MEANING
m = -1    'no combo', i.e. treat 'by' formula as usual (the default);
m =  0    'order zero' combination, i.e. just a single domain: the whole population;
m =  1    'order zero' plus 'order one' combinations, the latter being all the marginal domains defined by 'by' variables;
m =  n    combinations of any order, the maximum being the one with all 'by' variables interacting simultaneously.
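The meaning of combo listed above can be sketched in base R by enumerating which interactions of the by variables each value of m would request. The helper combo_terms below is hypothetical (it is not part of ReGenesees) and only mirrors the description given here; it does not reproduce how svystat is actually implemented.

```r
# Hypothetical helper, NOT a ReGenesees function: list the interactions of
# 'by' variables implied by a given 'combo' value m, as described above.
combo_terms <- function(by_vars, m = -1) {
  n <- length(by_vars)
  stopifnot(m >= -1, m <= n)
  # m = -1: 'no combo', only the full interaction of all 'by' variables
  if (m == -1) return(paste(by_vars, collapse = ":"))
  # Order zero: a single domain, the whole population
  out <- "(whole population)"
  # Orders 1..m: every subset of k 'by' variables, interacting
  for (k in seq_len(m)) {
    out <- c(out, combn(by_vars, k, FUN = paste, collapse = ":"))
  }
  out
}

terms_m2 <- combo_terms(c("age5c", "sex"), m = 2)
print(terms_m2)

# Number of requested interactions when m = n, for n = 1, ..., 5 variables
n_terms <- sapply(1:5, function(n) length(combo_terms(letters[1:n], n)))
print(n_terms)
```

With m = n the number of requested interactions is 2^n, and each interaction contributes as many estimation domains as it has observed category combinations, which is why the output can become large quickly.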
The plot method can be used only when forGVF = TRUE and produces a matrix (or many matrices, if group is passed) of scatterplots with polynomial smoothers.
Methods coef, SE, VAR, cv, deff, and confint can be used only when forGVF = FALSE, to extract estimates and variability statistics.
An object storing estimates and errors, whose detailed structure depends on input parameters' values.
If forGVF = FALSE, a set of summary statistics possibly stored into a list (with class svystat.gr in the most general case).
If forGVF = TRUE and argument group is not passed, an object of class gvf.input.
If forGVF = TRUE and argument group is passed, an object of class gvf.input.gr. This is a list of objects of class gvf.input, each one pertaining to a different population group.
Diego Zardetto
estimator.kind to assess what kind of estimates are stored inside a survey statistic object, gvf.input as an alternative way to prepare the input for GVF model fitting, GVF.db to manage the ReGenesees archive of registered GVF models, fit.gvf to fit GVF models, plot.gvf.fit to get diagnostic plots for fitted GVF models, drop.gvf.points to drop alleged outliers from a fitted GVF model and simultaneously refit it, and predictCV to predict CV values via fitted GVF models.
# Load sbs data:
data(sbs)
# Create a design object:
sbsdes<-e.svydesign(data=sbs,ids=~id,strata=~strata,weights=~weight,fpc=~fpc)
##########################################################################
# svystat as an alternative way to compute 'ordinary' summary statistics #
##########################################################################
## Total number of employees
svystat(sbsdes,y=~emp.num,forGVF=FALSE)
# equivalent to:
svystatTM(sbsdes,y=~emp.num)
## Average number of employees per enterprise
svystat(sbsdes,y=~emp.num,estimator="Mean",forGVF=FALSE)
# equivalent to:
svystatTM(sbsdes,y=~emp.num,estimator="Mean")
## Average value added per employee by economic activity macro-sector
## (nace.macro):
svystat(sbsdes,kind="R",num=~va.imp2,den=~emp.num,by=~nace.macro,forGVF=FALSE)
# equivalent to:
svystatR(sbsdes,num=~va.imp2,den=~emp.num,by=~nace.macro)
## Counts of employees by classes of number of employees (emp.cl) crossed
## with economic activity macro-sector (nace.macro):
svystat(sbsdes,y=~emp.num,by=~emp.cl:nace.macro,forGVF=FALSE)
# equivalent to:
svystatTM(sbsdes,y=~emp.num,by=~emp.cl:nace.macro)
## Provided forGVF = FALSE, you can use estimator.kind on svystat output:
stat<-svystat(sbsdes,kind="R",num=~va.imp2,den=~emp.num,by=~emp.cl:nace.macro,
group=~region,forGVF=FALSE)
stat
estimator.kind(stat,sbsdes)
##########################################################
# Understanding syntax and semantics of argument 'combo' #
##########################################################
# Load household data:
data(data.examples)
# Create a design object:
houdes<-e.svydesign(data=example,ids=~towcod+famcod,strata=~SUPERSTRATUM,
weights=~weight)
# Add convenience variable 'ones' to estimate counts:
houdes<-des.addvars(houdes,ones=1)
## To facilitate understanding, let's for the moment keep forGVF = FALSE.
## Let's compute estimates and errors of counts of individuals by sex and
## five age classes (age5c):
svystat(houdes,y=~ones,by=~age5c:sex,forGVF=FALSE)
## Now let's play with argument 'combo':
# combo = -1
# -> 'no combo', i.e. treat 'by' formula as usual
svystat(houdes,y=~ones,by=~age5c:sex,forGVF=FALSE,combo=-1)
# combo = 0
# -> 'order zero' combination, i.e. just a single domain: the whole population
svystat(houdes,y=~ones,by=~age5c:sex,forGVF=FALSE,combo=0)
# combo = 1
# -> 'order zero' plus 'order one' combinations, the latter being all the
# marginal domains defined by 'by' variables
svystat(houdes,y=~ones,by=~age5c:sex,forGVF=FALSE,combo=1)
# combo = 2
# -> since 'by' has 2 variables, this means combinations of any order up to
# the maximum
svystat(houdes,y=~ones,by=~age5c:sex,forGVF=FALSE,combo=2)
# combo = 3
# -> yields an error, as 'combo' cannot exceed the number of 'by' variables
# (2 in this example)
## Not run:
svystat(houdes,y=~ones,by=~age5c:sex,forGVF=FALSE,combo=3)
## End(Not run)
######################################################################
# svystat as an alternative way to prepare input data for GVF models #
######################################################################
## The same estimates and errors of the last example above, now with
## forGVF = TRUE: note the different output data format
svystat(houdes,y=~ones,by=~age5c:sex,combo=2)
## Note that the agile command above is indeed equivalent to the following
## lengthier, cumbersome statement:
gvf.input(houdes,
svystatTM(houdes,y=~ones),
svystatTM(houdes,y=~ones,by=~age5c),
svystatTM(houdes,y=~ones,by=~sex),
svystatTM(houdes,y=~ones,by=~age5c:sex)
)
################################################
# Using argument 'group' to prepare input data #
# for separate GVF models #
################################################
## The same estimates and errors of the last example above, now prepared
## separately for different regions (regcod):
svystat(houdes,y=~ones,by=~age5c:sex,combo=2,group=~regcod)
## Again the same estimates and errors, prepared separately for groups
## defined crossing marital status (marstat) and region:
svystat(houdes,y=~ones,by=~age5c:sex,combo=2,group=~marstat:regcod)
## NOTE: Output has class "gvf.input.gr". This will tell ReGenesees' GVF
## fitting facilities to handle estimates and errors pertaining to
## different groups independently of each other.
## NOTE: Parameter combo allows svystat to gather a huge number of estimates and
## errors in just a single shot, as the number of estimation domains grows
## exponentially with the number of by variables.
## See, for instance, the following example:
out <- svystat(houdes,y=~ones,by=~age5c:marstat:sex:regcod,combo=4)
dim(out)
head(out)
plot(out)
##################################################
# Minor details: accessor functions and plotting #
##################################################
## Accessor functions work only when forGVF = FALSE
# Average value added per employee by nace.macro:
out <- svystat(sbsdes,kind="R",num=~va.imp2,den=~emp.num,by=~nace.macro,forGVF=FALSE)
out
# Access CV values and confidence intervals:
cv(out)
confint(out)
# The same as above, separately for regions:
out <- svystat(sbsdes,kind="R",num=~va.imp2,den=~emp.num,by=~nace.macro,group=~region,forGVF=FALSE)
out
# Access CV values and confidence intervals:
cv(out)
confint(out)
## Plot function works only when forGVF = TRUE
# Counts of individuals by sex, marstat and age5c, and all their interactions:
out <- svystat(houdes,y=~ones,by=~age5c:marstat:sex,combo=3)
# Plot GVF input:
plot(out)
# The same as above, grouped by region:
out <- svystat(houdes,y=~ones,by=~age5c:marstat:sex,combo=3,group=~regcod)
# Plot GVF inputs, separately by groups (regions):
plot(out)