# surveysummary: Summary statistics for sample surveys

In survey: Analysis of Complex Survey Samples

 surveysummary R Documentation

## Summary statistics for sample surveys

### Description

Compute means, variances, ratios and totals for data from complex surveys.

### Usage

``````
## S3 method for class 'survey.design'
svymean(x, design, na.rm=FALSE,deff=FALSE,influence=FALSE,...)
## S3 method for class 'survey.design2'
svymean(x, design, na.rm=FALSE,deff=FALSE,influence=FALSE,...)
## S3 method for class 'twophase'
svymean(x, design, na.rm=FALSE,deff=FALSE,...)
## S3 method for class 'svyrep.design'
svymean(x, design, na.rm=FALSE, rho=NULL,
return.replicates=FALSE, deff=FALSE,...)
## S3 method for class 'survey.design'
svyvar(x, design, na.rm=FALSE,...)
## S3 method for class 'svyrep.design'
svyvar(x, design, na.rm=FALSE, rho=NULL,
return.replicates=FALSE,...,estimate.only=FALSE)
## S3 method for class 'survey.design'
svytotal(x, design, na.rm=FALSE,deff=FALSE,influence=FALSE,...)
## S3 method for class 'survey.design2'
svytotal(x, design, na.rm=FALSE,deff=FALSE,influence=FALSE,...)
## S3 method for class 'twophase'
svytotal(x, design, na.rm=FALSE,deff=FALSE,...)
## S3 method for class 'svyrep.design'
svytotal(x, design, na.rm=FALSE, rho=NULL,
return.replicates=FALSE, deff=FALSE,...)
## S3 method for class 'svystat'
coef(object,...)
## S3 method for class 'svrepstat'
coef(object,...)
## S3 method for class 'svystat'
vcov(object,...)
## S3 method for class 'svrepstat'
vcov(object,...)
## S3 method for class 'svystat'
confint(object,  parm, level = 0.95,df =Inf,...)
## S3 method for class 'svrepstat'
confint(object,  parm, level = 0.95,df =Inf,...)
cv(object,...)
deff(object, quietly=FALSE,...)
make.formula(names)
``````

### Arguments

| Argument | Description |
| --- | --- |
| `x` | A formula, vector or matrix |
| `design` | `survey.design` or `svyrep.design` object |
| `na.rm` | Should cases with missing values be dropped? |
| `influence` | Should a matrix of influence functions be returned (primarily to support `svyby`) |
| `rho` | parameter for Fay's variance estimator in a BRR design |
| `return.replicates` | Return the replicate means/totals? |
| `deff` | Return the design effect (see below) |
| `object` | The result of one of the other survey summary functions |
| `quietly` | Don't warn when there is no design effect computed |
| `estimate.only` | Don't compute standard errors (useful when `svyvar` is used to estimate the design effect) |
| `parm` | a specification of which parameters are to be given confidence intervals, either a vector of numbers or a vector of names. If missing, all parameters are considered. |
| `level` | the confidence level required. |
| `df` | degrees of freedom for t-distribution in confidence interval, use `degf(design)` for number of PSUs minus number of strata |
| `...` | additional arguments to methods, not currently used |
| `names` | vector of character strings |

### Details

These functions perform weighted estimation, with each observation being weighted by the inverse of its sampling probability. Except for the table functions, these also give precision estimates that incorporate the effects of stratification and clustering.
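As a minimal sketch of the weighting described above, the point estimate from `svymean` can be reproduced with base R's `weighted.mean` and the design weights. The example assumes the `api` data shipped with the survey package; the standard error, by contrast, depends on the design information and is not reproduced by the base R call.

```r
library(survey)
data(api)

# One-stage cluster sample; pw holds the sampling weights
dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

est <- svymean(~api00, dclus1)

# Point estimate by hand: sum(w * x) / sum(w)
by_hand <- weighted.mean(apiclus1$api00, weights(dclus1))

all.equal(as.numeric(coef(est)), by_hand)

# Design-based standard error, which accounts for the clustering
SE(est)
```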

Factor variables are converted to sets of indicator variables for each category in computing means and totals. Combining this with the `interaction` function allows crosstabulations. See `ftable.svystat` for formatting the output.
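The crosstabulation idea can be sketched as follows, assuming the `api` example data. `interaction(stype, comp.imp)` produces one indicator per cell, so `svymean` returns cell proportions, and `ftable` rearranges them into a two-way layout. The `rownames` labels are supplied by hand here and are assumed to match the factor level order in the data.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Estimated proportion of schools in each stype x comp.imp cell
tab <- svymean(~interaction(stype, comp.imp), dclus1)

# Lay the cell proportions (and their standard errors) out as a table
ftable(tab, rownames = list(stype = c("E", "H", "M"),
                            comp.imp = c("No", "Yes")))
```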

With `na.rm=TRUE`, all cases with missing data are removed. With `na.rm=FALSE`, cases with missing data are not removed and so will produce missing results. When using replicate weights and `na.rm=FALSE`, it may be useful to set `options(na.action="na.pass")`; otherwise all replicates with any missing results will be discarded.
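A small illustration of the two `na.rm` settings, assuming the `api` example data. The data frame `apimiss` is a hypothetical copy of `apiclus1` with a few values set to `NA` purely for demonstration.

```r
library(survey)
data(api)

# Hypothetical copy of the data with some missing values introduced
apimiss <- apiclus1
apimiss$api00[1:5] <- NA

dmiss <- svydesign(id = ~dnum, weights = ~pw, data = apimiss, fpc = ~fpc)

# Default na.rm = FALSE: the missing values propagate to the estimate
m_keep <- svymean(~api00, dmiss)

# na.rm = TRUE: cases with missing api00 are dropped before estimation
m_drop <- svymean(~api00, dmiss, na.rm = TRUE)

m_keep
m_drop
```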

The `svytotal` and `svreptotal` functions estimate a population total. Use `predict` on `svyratio` and `svyglm` objects to get ratio or regression estimates of totals.
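A ratio estimate of a total might be obtained as sketched below, assuming the `api` example data. The population total of `enroll` is taken from `apipop` here, which stands in for an auxiliary total that would be known from an external source in practice.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Direct (Horvitz-Thompson type) estimate of the total number tested
direct <- svytotal(~api.stu, dclus1)

# Ratio of students tested to enrollment
r <- svyratio(~api.stu, ~enroll, dclus1)

# Assumed known population total of the denominator variable
pop_enroll <- sum(apipop$enroll, na.rm = TRUE)

# Ratio estimate of the total: estimated ratio times the known total
pred <- predict(r, total = pop_enroll)

direct
pred$total
pred$se
```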

`svyvar` estimates the population variance. The object returned includes the full matrix of estimated population variances and covariances, but by default only the diagonal elements are printed. To display the whole matrix use `as.matrix(v)` or `print(v, covariance=TRUE)`.

The design effect compares the variance of a mean or total to the variance from a study of the same size using simple random sampling without replacement. Note that the design effect will be incorrect if the weights have been rescaled so that they are not reciprocals of sampling probabilities. To obtain an estimate of the design effect comparing to simple random sampling with replacement, which does not have this requirement, use `deff="replace"`. This with-replacement design effect is the square of Kish's "deft".
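The two flavours of design effect can be compared as in this sketch, which assumes the `api` example data. The name `deft` is only a local variable for the square root of the with-replacement design effect, following the relationship to Kish's "deft" stated above.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Design effect relative to simple random sampling without replacement
m_wor <- svymean(~api00, dclus1, deff = TRUE)

# Design effect relative to simple random sampling with replacement
m_wr <- svymean(~api00, dclus1, deff = "replace")

deff_wor <- as.numeric(deff(m_wor))
deff_wr  <- as.numeric(deff(m_wr))

# Kish's deft is on the standard-error scale
deft <- sqrt(deff_wr)

c(deff_wor = deff_wor, deff_wr = deff_wr, deft = deft)
```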

The design effect for a subset of a design conditions on the size of the subset. That is, it compares the variance of the estimate to the variance of an estimate based on a simple random sample of the same size as the subset, taken from the subpopulation. So, for example, under stratified random sampling the design effect in a subset consisting of a single stratum will be 1.0.
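The single-stratum case can be checked with a short sketch, assuming the `api` example data. Restricting the stratified design to elementary schools (`stype == "E"`) leaves a simple random sample within one stratum, so the design effect is expected to be very close to 1.

```r
library(survey)
data(api)

# Stratified random sample, stratified by school type
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Subset consisting of a single stratum
d_elem <- subset(dstrat, stype == "E")

m_elem <- svymean(~api00, d_elem, deff = TRUE)
deff_elem <- as.numeric(deff(m_elem))
deff_elem

# For comparison, the design effect in the full stratified design
deff(svymean(~api00, dstrat, deff = TRUE))
```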

The `cv` function computes the coefficient of variation of a statistic such as ratio, mean or total. The default method is for any object with methods for `SE` and `coef`.
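As a quick sketch of what the default `cv` method computes, assuming the `api` example data, the coefficient of variation is the standard error divided by the point estimate:

```r
library(survey)
data(api)

dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

tot <- svytotal(~enroll, dstrat)

# Coefficient of variation from the helper
cv_tot <- cv(tot)

# The same quantity by hand: SE divided by the estimate
cv_hand <- SE(tot) / coef(tot)

cv_tot
all.equal(as.numeric(cv_tot), as.numeric(cv_hand))
```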

`make.formula` makes a formula from a vector of names. This is useful because formulas are the best way to specify variables to the survey functions.
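A brief sketch of `make.formula` in use, assuming the `api` example data. The vector `vars` is a hypothetical list of variable names, for instance one built programmatically.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

# Variable names held as character strings
vars <- c("api00", "api99")

# Build a one-sided formula from the names
f <- make.formula(vars)
f

# Pass the formula to a survey function as usual
m <- svymean(f, dclus1)
m
```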

### Value

Objects of class `"svystat"` or `"svrepstat"`, which are vectors with a `"var"` attribute giving the variance and a `"statistic"` attribute giving the name of the statistic.

These objects have methods for `vcov`, `SE`, `coef`, `confint`, `svycontrast`.
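The structure of the returned object and its methods can be explored as in this sketch, which assumes the `api` example data. The contrast at the end estimates the difference between the two means and is only one illustrative use of `svycontrast`.

```r
library(survey)
data(api)

dclus1 <- svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

m <- svymean(~api00 + api99, dclus1)

# Attributes carried by the result
attr(m, "statistic")  # name of the statistic
attr(m, "var")        # variance-covariance matrix

# Extractor methods
coef(m)
SE(m)
vcov(m)
confint(m, df = degf(dclus1))

# Linear contrast: difference between the api00 and api99 means
d <- svycontrast(m, c(api00 = 1, api99 = -1))
d
```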

### Author(s)

Thomas Lumley

### See Also

`svydesign`, `as.svrepdesign`, `svrepdesign` for constructing design objects.

`degf` to extract degrees of freedom from a design.

`svyquantile` for quantiles.

`ftable.svystat` for more attractive tables.

`svyciprop` for more accurate confidence intervals for proportions near 0 or 1.

`svyttest` for comparing two means.

`svycontrast` for linear and nonlinear functions of estimates.

### Examples

``````
data(api)

## one-stage cluster sample
dclus1<-svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc)

svymean(~api00, dclus1, deff=TRUE)
svymean(~factor(stype),dclus1)
svymean(~interaction(stype, comp.imp), dclus1)
svyquantile(~api00, dclus1, c(.25,.5,.75))
svytotal(~enroll, dclus1, deff=TRUE)
svyratio(~api.stu, ~enroll, dclus1)

v<-svyvar(~api00+api99, dclus1)
v
print(v, cov=TRUE)
as.matrix(v)

# replicate weights - jackknife (this is slower)
dstrat<-svydesign(id=~1,strata=~stype, weights=~pw,
data=apistrat, fpc=~fpc)
jkstrat<-as.svrepdesign(dstrat)

svymean(~api00, jkstrat)
svymean(~factor(stype),jkstrat)
svyvar(~api00+api99,jkstrat)

svyquantile(~api00, jkstrat, c(.25,.5,.75))
svytotal(~enroll, jkstrat)
svyratio(~api.stu, ~enroll, jkstrat)

# coefficients of variation
cv(svytotal(~enroll,dstrat))
cv(svyratio(~api.stu, ~enroll, jkstrat))

# extracting information from the results
coef(svytotal(~enroll,dstrat))
vcov(svymean(~api00+api99,jkstrat))
SE(svymean(~enroll, dstrat))
confint(svymean(~api00+api99, dclus1))
confint(svymean(~api00+api99, dclus1), df=degf(dclus1))

# Design effect
svymean(~api00, dstrat, deff=TRUE)
svymean(~api00, dstrat, deff="replace")
svymean(~api00, jkstrat, deff=TRUE)
svymean(~api00, jkstrat, deff="replace")
(a<-svytotal(~enroll, dclus1, deff=TRUE))
deff(a)

## weights that are *already* calibrated to population size
sum(weights(dclus1))
nrow(apipop)
cdclus1<- svydesign(id=~dnum, weights=~pw, data=apiclus1,
fpc=~fpc,calibrate.formula=~1)
SE(svymean(~enroll, dclus1))
## not equal to SE(mean)
SE(svytotal(~enroll, dclus1))/nrow(apipop)
## equal to SE(mean)
SE(svytotal(~enroll, cdclus1))/nrow(apipop)

``````

survey documentation built on May 3, 2023, 9:12 a.m.