svrepdesign  R Documentation 
Some recent large-scale surveys specify replication weights rather than the sampling design (partly for privacy reasons). This function specifies the data structure for such a survey.
svrepdesign(variables , repweights , weights, data, degf=NULL,...)
## Default S3 method:
svrepdesign(variables = NULL, repweights = NULL, weights = NULL,
data = NULL, degf=NULL, type = c("BRR", "Fay", "JK1","JKn","bootstrap",
"ACS","successive-difference","JK2","other"),
combined.weights=TRUE, rho = NULL, bootstrap.average=NULL,
scale=NULL, rscales=NULL,fpc=NULL, fpctype=c("fraction","correction"),
mse=getOption("survey.replicates.mse"),...)
## S3 method for class 'imputationList'
svrepdesign(variables=NULL,
repweights,weights,data, degf=NULL,
mse=getOption("survey.replicates.mse"),...)
## S3 method for class 'character'
svrepdesign(variables=NULL,repweights=NULL,
weights=NULL,data=NULL, degf=NULL,
type=c("BRR","Fay","JK1", "JKn","bootstrap","ACS","successive-difference","JK2","other"),
combined.weights=TRUE, rho=NULL, bootstrap.average=NULL, scale=NULL,rscales=NULL,
fpc=NULL,fpctype=c("fraction","correction"),mse=getOption("survey.replicates.mse"),
dbtype="SQLite", dbname,...)
## S3 method for class 'svyrep.design'
image(x, ...,
col=grey(seq(.5,1,length=30)), type.=c("rep","total"))
variables 
formula or data frame specifying variables to include in the design (default is all) 
repweights 
formula or data frame specifying replication weights, or character string specifying a regular expression that matches the names of the replication weight variables 
weights 
sampling weights 
data 
data frame to look up variables in formulas, or character string giving name of database table 
degf 
Design degrees of freedom; use NULL to have the function work this out for you 
type 
Type of replication weights 
combined.weights 
TRUE if the repweights already include the sampling weights 
rho 
Shrinkage factor for weights in Fay's method 
bootstrap.average 
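For type="bootstrap", if the bootstrap weights have been averaged, gives the number of iterations averaged over 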
scale , rscales 
Scaling constant for variance, see Details below 
fpc , fpctype 
Finite population correction information 
mse 
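If TRUE, compute variances based on sum of squares around the point estimate, rather than the mean of the replicates 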
dbname 
name of database, passed to DBI::dbConnect() 
dbtype 
Database driver: see Details 
x 
survey design with replicate weights 
... 
Other arguments to image 
col 
Colors 
type. 
"rep" for only the replicate weights, "total" for the replicate and sampling weights combined 
In the BRR method, the dataset is split into halves, and the
difference between halves is used to estimate the variance. In Fay's
method, rather than removing observations from half the sample, they
are given weight rho in one half-sample and 2-rho in the other. The
ideal BRR analysis is restricted to a design where each stratum has
two PSUs; however, it has been used in a much wider class of surveys.
The scale and rscales arguments will be ignored (with a warning) if
they are specified.
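As an illustration of the half-sample weighting described above, the following base R sketch builds BRR and Fay multipliers for a toy design with three strata of two PSUs each. The half-sample indicator matrix is the one used in the Examples; the value of rho and the names H, brr_mult, fay_mult and stratum are chosen here for illustration only and are not part of the package.

```r
# Half-sample indicators: 3 strata x 2 PSUs (rows), 4 replicates (columns).
# 1 = PSU kept in the half-sample, 0 = PSU left out.
H <- cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1), c(0,1,0,1,1,0))

# BRR: the kept PSU is doubled, the other is removed.
brr_mult <- 2 * H

# Fay: instead of removing a PSU it gets weight rho; the kept PSU gets 2 - rho.
rho <- 0.5                      # illustrative shrinkage factor
fay_mult <- ifelse(H == 1, 2 - rho, rho)

# Within every stratum the two multipliers add up to 2 in each replicate.
stratum <- rep(1:3, each = 2)
rowsum(brr_mult, stratum)
rowsum(fay_mult, stratum)
```

Multipliers of this kind do not include the sampling weights, so they would correspond to combined.weights=FALSE.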
The JK1 and JKn types are both jackknife estimators deleting one cluster at a time. JKn is designed for stratified and JK1 for unstratified designs.
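A minimal base R sketch of the delete-one-cluster idea for an unstratified design follows. It assumes the usual JK1 conventions, which are not spelled out in this page: the remaining clusters are reweighted by n/(n-1) and the variance is scaled by (n-1)/n. The data and variable names are made up for illustration.

```r
# Hypothetical cluster-level values from n equally weighted clusters.
x <- c(12, 15, 9, 20, 14)
n <- length(x)

# JK1 multipliers: column r drops cluster r and inflates the others by n/(n-1).
jk1_mult <- matrix(n / (n - 1), nrow = n, ncol = n)
diag(jk1_mult) <- 0

# Replicate estimates of the mean, one per deleted cluster.
theta_r <- colSums(jk1_mult * x) / colSums(jk1_mult)

# Jackknife variance with the assumed JK1 scale (n-1)/n.
v_jk1 <- (n - 1) / n * sum((theta_r - mean(theta_r))^2)
v_jk1
```

For the sample mean this reproduces the familiar var(x)/n, which is a quick sanity check on the multipliers and the scale.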
The successive-difference weights in the American Community Survey
automatically use scale = 4/ncol(repweights) and
rscales = rep(1, ncol(repweights)). This can be specified as
type="ACS" or type="successive-difference". The scale and rscales
arguments will be ignored (with a warning) if they are specified.
JK2 weights (type="JK2"), as in the California Health Interview
Survey, automatically use scale=1, rscales=rep(1, ncol(repweights)).
The scale and rscales arguments will be ignored (with a warning) if
they are specified.
Averaged bootstrap weights ("mean bootstrap") are used for some surveys from Statistics Canada. Yee et al (1999) describe their construction and use for one such survey.
The variance is computed as the sum of squared deviations of the
replicates from their mean. This may be rescaled: scale is an overall
multiplier and rscales is a vector of replicate-specific multipliers
for the squared deviations. That is, rscales should have one entry
for each column of repweights. If the replication weights incorporate
the sampling weights (combined.weights=TRUE) or for type="other",
these must be specified; otherwise they can be guessed from the
weights.
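The rescaled variance described above can be written out in a few lines of base R. This is only a sketch of the formula, not the package's code; rep_variance and the replicate estimates are hypothetical names and numbers used for illustration.

```r
# Sketch: scale * sum(rscales * (replicate estimate - mean of replicates)^2)
rep_variance <- function(theta_reps, scale = 1,
                         rscales = rep(1, length(theta_reps))) {
  stopifnot(length(rscales) == length(theta_reps))
  scale * sum(rscales * (theta_reps - mean(theta_reps))^2)
}

# Four made-up replicate estimates of some statistic.
theta_reps <- c(10, 12, 11, 9)
R <- length(theta_reps)

# ACS-style settings quoted in the text: scale = 4/R, rscales all 1.
v_acs <- rep_variance(theta_reps, scale = 4 / R)

# JK2-style settings quoted in the text: scale = 1, rscales all 1.
v_jk2 <- rep_variance(theta_reps, scale = 1)
```

With real ACS data there would be one replicate estimate per column of repweights, so R would equal ncol(repweights).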
A finite population correction may be specified for type="other",
type="JK1" and type="JKn". fpc must be a vector with one entry for
each replicate. To specify sampling fractions use fpctype="fraction"
and to specify the correction directly use fpctype="correction".
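The relationship between the two fpctype settings can be sketched as follows. This rests on two assumptions that the text above does not state: that the correction equals one minus the sampling fraction, and that it is applied by multiplying the per-replicate scale factors. The helper apply_fpc is hypothetical and is not a function in the package.

```r
# Hypothetical helper: fold a finite population correction into rscales.
apply_fpc <- function(rscales, fpc, fpctype = c("fraction", "correction")) {
  fpctype <- match.arg(fpctype)
  # One entry per replicate, each between 0 and 1.
  stopifnot(length(fpc) == length(rscales), all(fpc >= 0), all(fpc <= 1))
  # Assumed convention: correction = 1 - sampling fraction.
  correction <- if (fpctype == "fraction") 1 - fpc else fpc
  rscales * correction
}

r <- rep(1, 4)
from_fraction   <- apply_fpc(r, fpc = rep(0.1, 4), fpctype = "fraction")
from_correction <- apply_fpc(r, fpc = rep(0.9, 4), fpctype = "correction")
```

Under these assumptions a sampling fraction of 0.1 and a correction of 0.9 describe the same adjustment.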
The design degrees of freedom are returned by degf. By default they
are computed from the numerical rank of the repweights. This is slow
for very large data sets and you can specify a value instead.
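To give a sense of what "numerical rank of the repweights" means, the sketch below computes the rank of the small BRR weight matrix from the Examples with base R's qr(). How the package converts this rank into the reported degrees of freedom is not shown here and should not be inferred from this snippet; repw and rank_repw are illustrative names.

```r
# BRR replicate weight multipliers from the Examples: 6 PSUs x 4 replicates.
repw <- 2 * cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1),
                  c(0,1,0,1,1,0))

# Numerical rank via a QR decomposition; the cost grows with the
# number of rows and columns, which is why large designs are slow.
rank_repw <- qr(repw)$rank
rank_repw
```

For a survey with many observations and replicates, supplying degf directly avoids this decomposition.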
repweights may be a character string giving a regular expression for
the replicate weight variables. For example, in the California Health
Interview Survey public-use data, the sampling weights are "rakedw0"
and the replicate weights are "rakedw1" to "rakedw80". The regular
expression "rakedw[1-9]" matches the replicate weight variables (and
not the sampling weight variable).
data may be a character string giving the name of a table or view in
a relational database that can be accessed through the DBI interface.
For DBI interfaces dbtype should be the name of the database driver
and dbname should be the name by which the driver identifies the
specific database (e.g. file name for SQLite).
The appropriate database interface package must already be loaded
(e.g. RSQLite for SQLite). The survey design object will contain the
replicate weights, but actual variables will be loaded from the
database only as needed. Use close to close the database connection
and open to reopen the connection, e.g. after loading a saved object.
The database interface does not attempt to modify the underlying database and so can be used with read-only permissions on the database.
To generate your own replicate weights either use as.svrepdesign on a
survey.design object, or see brrweights, bootweights, jk1weights and
jknweights.
The model.frame method extracts the observed data.
Object of class svyrep.design, with methods for print, summary,
weights, image.
To use replication-weight analyses on a survey specified by sampling
design, use as.svrepdesign to convert it.
Levy and Lemeshow. "Sampling of Populations". Wiley.
Shao and Tu. "The Jackknife and Bootstrap." Springer.
Yee et al (1999). Bootstrap Variance Estimation for the National Population Health Survey. Proceedings of the ASA Survey Research Methodology Section. https://web.archive.org/web/20151110170959/http://www.amstat.org/sections/SRMS/Proceedings/papers/1999_136.pdf
as.svrepdesign, svydesign, brrweights, bootweights
data(scd)
# use BRR replicate weights from Levy and Lemeshow
repweights<-2*cbind(c(1,0,1,0,1,0), c(1,0,0,1,0,1), c(0,1,1,0,0,1),
c(0,1,0,1,1,0))
scdrep<-svrepdesign(data=scd, type="BRR", repweights=repweights, combined.weights=FALSE)
svyratio(~alive, ~arrests, scdrep)
## Not run:
## Needs RSQLite
library(RSQLite)
db_rclus1<-svrepdesign(weights=~pw, repweights="wt[1-9]+", type="JK1", scale=(1-15/757)*14/15,
data="apiclus1rep",dbtype="SQLite", dbname=system.file("api.db",package="survey"), combined=FALSE)
svymean(~api00+api99,db_rclus1)
summary(db_rclus1)
## closing and reopening a connection
close(db_rclus1)
db_rclus1
try(svymean(~api00+api99,db_rclus1))
db_rclus1<-open(db_rclus1)
svymean(~api00+api99,db_rclus1)
## End(Not run)