Calculates estimates, standard errors and confidence intervals for totals and means in subpopulations.
deskott | Object of class
y | Formula defining the variables of interest.
by | Formula specifying the variables that define the "estimation domains". If
estimator |
vartype |
conf.int | Boolean (
conf.lev | Probability specifying the desired confidence level: the default value is
This function calculates weighted estimates for totals and means using suitable weights depending on the class of deskott: calibrated weights for class kott.cal.design and direct weights otherwise. Standard errors are calculated using the extended DAGJK method [Kott 99-01].
The mandatory argument y identifies the variables of interest, that is, the variables for which estimates are to be calculated. The corresponding formula must be of the type y=~var1+...+varn. The deskott variables referenced by y must be numeric or factor and must not contain any missing value (NA). It is admissible to specify for y "mixed" formulas that simultaneously contain quantitative (numeric) variables and qualitative (factor) variables.
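The constraints on y can be illustrated with a small base R sketch. The helper check_interest_vars and the toy data frame below are hypothetical and are not part of the package; they only mimic the stated rules (numeric or factor variables, no NA) and do not reproduce the checks kottby actually performs.

```r
# Hypothetical helper (an assumption, not a package function): verifies that
# every variable referenced by a one-sided formula is numeric or factor and
# contains no missing values, then reports the type of each variable.
check_interest_vars <- function(formula, data) {
  vars <- all.vars(formula)
  for (v in vars) {
    x <- data[[v]]
    if (is.null(x)) stop("variable not found: ", v)
    if (!(is.numeric(x) || is.factor(x))) stop(v, " must be numeric or factor")
    if (anyNA(x)) stop(v, " contains missing values")
  }
  vapply(vars,
         function(v) if (is.numeric(data[[v]])) "numeric" else "factor",
         character(1))
}

# Made-up data: one quantitative and one qualitative variable
toy <- data.frame(y3 = c(10, 20, 30),
                  marstat = factor(c("married", "single", "single")))

# A "mixed" formula referencing both kinds of variable
kinds <- check_interest_vars(~ y3 + marstat, toy)
print(kinds)
```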
The optional argument by specifies the variables that define the "estimation domains", that is, the subpopulations for which the estimates are to be calculated. If by=NULL (the default option), the estimates produced by kottby refer to the whole population. Estimation domains must be defined by a formula: for example, the statement by=~B1:B2 selects as estimation domains the subpopulations determined by crossing the modalities of variables B1 and B2. The deskott variables referenced by by (if any) must be factor and must not contain any missing value (NA).
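What "crossing the modalities" of B1 and B2 means can be sketched in base R. The data frame and weights below are made up for illustration, and the code only imitates the domain definition with interaction(); it is not how kottby is implemented.

```r
# Made-up data: two factors, one numeric variable and sampling weights
toy <- data.frame(
  B1 = factor(c("a", "a", "b", "b", "a")),
  B2 = factor(c("x", "y", "x", "y", "x")),
  y  = c(1, 2, 3, 4, 5),
  w  = c(10, 10, 20, 20, 10)
)

# Crossing the levels of B1 and B2 yields one domain per combination
domain <- interaction(toy$B1, toy$B2, sep = ":")

# Weighted total of y within each domain
dom_totals <- tapply(toy$w * toy$y, domain, sum)
print(dom_totals)
```

Each element of dom_totals corresponds to one combination of a B1 level with a B2 level, which is the kind of subpopulation that by=~B1:B2 refers to.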
The optional argument estimator makes it possible to select the desired estimator. If estimator="total" (the default option), kottby calculates, for a given variable of interest vark, the estimate of the total (when vark is numeric) or the estimate of the absolute frequency distribution (when vark is factor). Similarly, if estimator="mean", the function calculates the estimate of the mean (when vark is numeric) or the estimate of the relative frequency distribution (when vark is factor).
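The four cases above can be sketched with plain weighted sums in base R. This is a minimal illustration on made-up data. It assumes that the mean is estimated as the weighted total divided by the sum of the weights, which the text above does not state explicitly.

```r
# Made-up data: a numeric variable, a factor and sampling weights
toy <- data.frame(
  y1  = c(2, 4, 6, 8),
  sex = factor(c("f", "m", "f", "m")),
  w   = c(5, 5, 10, 20)
)

# Numeric variable: estimated total and estimated mean
tot_y1  <- sum(toy$w * toy$y1)
mean_y1 <- tot_y1 / sum(toy$w)   # assumption: ratio of weighted total to sum of weights

# Factor variable: estimated absolute and relative frequency distributions
abs_freq <- tapply(toy$w, toy$sex, sum)
rel_freq <- abs_freq / sum(toy$w)

print(c(total = tot_y1, mean = mean_y1))
print(abs_freq)
print(rel_freq)
```

For a factor, the absolute frequencies are simply the weighted totals of the indicator variables of its levels, so the relative frequencies sum to one.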
The conf.int argument requests confidence intervals for the estimates. By default conf.int=FALSE, that is, confidence intervals are not provided.
Whenever confidence intervals are requested (i.e. conf.int=TRUE), the desired confidence level can be specified by means of the conf.lev argument. The conf.lev value must represent a probability (0<=conf.lev<=1) and its default is chosen to be 0.95.
The return value depends on the value of the input parameters. In the most general case, the function returns an object of class list (typically a list made up of data frames).
The advantage of the DAGJK method over the traditional jackknife is that, unlike the latter, it remains computationally manageable even when dealing with "complex and big" surveys (tens of thousands of PSUs arranged in a large number of strata with widely varying sizes). In fact, the DAGJK method is known to provide, for a broad range of sampling designs and estimators, (near) unbiased standard error estimates even with a "small" number (e.g. a few tens) of replicate weights. On the other hand, if the number of replicates is not large, it seems defensible to use a t distribution (rather than a normal distribution) for calculating the confidence intervals. In line with what was proposed in [Kott 99-01], given an input kott.design object with nrg random groups, kottby builds the confidence intervals making use of a t distribution with nrg-1 degrees of freedom.
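The following base R sketch shows how a delete-a-group jackknife standard error and a t-based confidence interval with nrg-1 degrees of freedom could be computed. It is a simplified illustration under stated assumptions: simulated data, an unstratified single-stage design with equal weights, and the plain (not the extended) DAGJK replicate weights. It does not reproduce the replicate weights built by kottdesign.

```r
set.seed(1)
n   <- 60
nrg <- 15                            # number of random groups
y   <- rnorm(n, mean = 50, sd = 10)  # simulated variable of interest
w   <- rep(100, n)                   # made-up equal sampling weights

# Randomly assign each unit (treated here as a PSU) to one of nrg groups
group <- sample(rep(seq_len(nrg), length.out = n))

# Full-sample estimate of the total
theta_hat <- sum(w * y)

# Replicate estimates: drop group r and inflate the remaining weights
# (plain DAGJK for a single stratum, an assumption of this sketch)
theta_rep <- vapply(seq_len(nrg), function(r) {
  w_r <- ifelse(group == r, 0, w * nrg / (nrg - 1))
  sum(w_r * y)
}, numeric(1))

# Delete-a-group jackknife variance and standard error
var_hat <- (nrg - 1) / nrg * sum((theta_rep - theta_hat)^2)
se_hat  <- sqrt(var_hat)

# Confidence interval based on a t distribution with nrg - 1 degrees of freedom
conf.lev <- 0.95
t_crit <- qt(1 - (1 - conf.lev) / 2, df = nrg - 1)
ci <- c(lower = theta_hat - t_crit * se_hat,
        upper = theta_hat + t_crit * se_hat)

print(c(estimate = theta_hat, se = se_hat, ci))
```

With 15 random groups the t quantile is noticeably larger than the corresponding normal quantile, so the resulting interval is wider than one built with qnorm.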
Diego Zardetto
Kott, Phillip S. (1999) "The Extended Delete-A-Group Jackknife". Bulletin of the International Statistical Institute. 52nd Session. Contributed Papers. Book 2, pp. 167-168.
Kott, Phillip S. (2001) "The Delete-A-Group Jackknife". Journal of Official Statistics, Vol.17, No.4, pp. 521-526.
kott.ratio for estimating ratios between totals, kott.quantile for estimating quantiles, kott.regcoef for estimating regression coefficients and kottby.user for calculating estimates based on user-defined estimators.
data(data.examples)
# Creation of a kott.design object:
kdes<-kottdesign(data=example,ids=~towcod+famcod,strata=~SUPERSTRATUM,
weights=~weight,nrg=15)
# Estimate of the total of 3 quantitative variables for the whole
# population:
kottby(kdes,~y1+y2+y3)
# Estimate of the total of the same 3 variables by sex:
kottby(kdes,~y1+y2+y3,~sex)
# Estimate of the mean of the same 3 variables by marstat and sex:
kottby(kdes,~y1+y2+y3,~marstat:sex,estimator="mean")
# Estimate of the absolute frequency distribution of the qualitative
# variable age5c for the whole population:
kottby(kdes,~age5c)
# Estimate of the relative frequency distribution of the qualitative
# variable marstat by sex:
kottby(kdes,~marstat,~sex,estimator="mean")
# The same with confidence intervals at a confidence level of 0.9:
kottby(kdes,~marstat,~sex,estimator="mean",conf.int=TRUE,conf.lev=0.9)
# Quantitative and qualitative variables together: estimate of the
# total for y3 and of the absolute frequency distribution of marstat,
# by sex:
kottby(kdes,~y3+marstat,~sex)
# Lonely PSUs do not give rise to NaNs in the standard errors:
kdes.lpsu<-kottdesign(data=example,ids=~towcod+famcod,strata=~stratum,
weights=~weight,nrg=15)
kottby(kdes.lpsu,~x1+x2+x3)