View source: R/rescaled.bootstrap.R
rescaled.bootstrap | R Documentation |
Draw bootstrap replicates from survey data using the rescaled bootstrap for stratified multistage sampling, presented by Preston, J. (2009).
rescaled.bootstrap(
dat,
REP = 1000,
strata = "DB050>1",
cluster = "DB060>DB030",
fpc = "N.cluster>N.households",
single.PSU = c("merge", "mean"),
return.value = c("data", "replicates"),
run.input.checks = TRUE,
already.selected = NULL,
seed = NULL
)
dat |
either data frame or data table containing the survey sample |
REP |
integer indicating the number of bootstraps to be drawn |
strata |
string specifying the column name in |
cluster |
string specifying the column name in |
fpc |
string specifying the column name in |
single.PSU |
either "merge" or "mean" defining how single PSUs need to
be dealt with. For |
return.value |
either "data", "replicates" and/or "selection"
specifying the return value of the function. For "data" the survey data is
returned as class |
run.input.checks |
logical, if TRUE the input will be checked before applying the bootstrap procedure |
already.selected |
list of data.tables or |
seed |
integer specifying the seed for the random number generator. |
To specify multistage sampling designs, the column names in strata, cluster and fpc need to be separated by ">".
For multistage sampling the strings are read from left to right, meaning that
the first vector entry or the column name before the first ">" is taken as the column for
stratification/clustering/number of PSUs at the first stage, and the last vector entry
or the column name after the last ">" is taken as the column for
stratification/clustering/number of PSUs at the last stage.
If for some stages the sample was not stratified or clustered, one must
specify this by "1" or "I", e.g. strata=c("strata1","I","strata3") or
strata=c("strata1>I>strata3") if there was no stratification at the second stage, or
cluster=c("cluster1","cluster2","I") respectively cluster=c("cluster1>cluster2>I")
if there were no clusters at the last stage.
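The left-to-right reading of these strings can be illustrated with a few lines of base R. This is only a sketch of the convention described above: split_stages is a hypothetical helper, not a function of surveysd, and the package's internal parsing may differ.

```r
# Hypothetical helper (not part of surveysd): one entry per sampling stage.
split_stages <- function(x) {
  x <- gsub(" ", "", x, fixed = TRUE)      # drop spaces before splitting
  unlist(strsplit(x, ">", fixed = TRUE))   # split on ">" and flatten to a vector
}

# Both notations describe the same three-stage design.
stages_string <- split_stages("strata1>I>strata3")
stages_vector <- split_stages(c("strata1", "I", "strata3"))

# "I" (or "1") marks a stage without stratification; here it is stage 2.
unstratified_stage <- which(stages_string %in% c("I", "1"))
```

Under these assumptions the first element corresponds to the first stage and the last element to the last stage.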
The number of PSUs at each stage is not calculated internally and must be
specified for any sampling design.
For single stage sampling using stratification this can usually be done by
summing the sample weights of the PSUs within each stratum.
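As a rough illustration of this calculation, the following base R sketch uses made-up toy data. The column names stratum, psu, weight and N.psu are assumptions for the example, and it assumes that all records of a PSU carry the same weight, so that each PSU is counted once.

```r
# Toy data (invented for illustration): records nested in PSUs nested in strata.
dat <- data.frame(
  stratum = c("A", "A", "A", "B", "B"),
  psu     = c(1, 1, 2, 3, 4),
  weight  = c(10, 10, 15, 20, 30),
  stringsAsFactors = FALSE
)

# Keep one row per PSU so that each PSU weight enters the sum only once.
psu_level <- unique(dat[, c("stratum", "psu", "weight")])

# Estimated number of PSUs in the population, per stratum.
n_by_stratum <- tapply(psu_level$weight, psu_level$stratum, sum)

# Attach the totals to every record; this column could then be passed as fpc.
dat$N.psu <- as.numeric(n_by_stratum[as.character(dat$stratum)])
```

The resulting column would be supplied by name, for example fpc = "N.psu".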
Spaces in each of the strings will be removed, so if column names contain
spaces they should be renamed before calling this procedure!
If already.selected is supplied, the sampling of bootstrap replicates
takes into account whether specific PSUs have already been selected by a previous survey wave.
For a specific strata and cluster this could lead to more than floor(n/2)
records being selected. In that case records will be de-selected such that floor(n/2)
records, with n as the total number of records, are selected for each
strata and cluster. This parameter is mostly used by draw.bootstrap in
order to consider the rotation of the sampling units over time.
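The de-selection step can be pictured with a small base R sketch. The random choice of which records to drop is an assumption made for illustration; the documentation does not state how surveysd picks the records to de-select.

```r
set.seed(1)

# Illustrative 0/1 selection indicator for one stratum/cluster (n = 7 records),
# where earlier waves already led to 5 selected records.
selected <- c(1, 1, 1, 1, 0, 0, 1)
n        <- length(selected)
target   <- floor(n / 2)             # at most floor(n/2) records may stay selected

excess <- sum(selected) - target
if (excess > 0) {
  idx  <- which(selected == 1)
  # Assumption: drop the surplus records at random.
  drop <- idx[sample.int(length(idx), excess)]
  selected[drop] <- 0
}
```

After this step exactly floor(n/2) = 3 records remain selected in the toy example.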
Returns the complete data set including the bootstrap replicates or
just the bootstrap replicates, depending on return.value="data" or
return.value="replicates" respectively.
Johannes Gussenbauer, Statistics Austria
Preston, J. (2009). Rescaled bootstrap for stratified multistage sampling. Survey Methodology. 35. 227-234.
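library(surveysd)
library(data.table)
setDTthreads(1)
set.seed(1234)
eusilc <- demo.eusilc(n = 1, prettyNames = TRUE)

# Households (hid) are the PSUs, stratified by region;
# N.households holds the number of households per region
eusilc[, N.households := uniqueN(hid), by = region]
eusilc.bootstrap <- rescaled.bootstrap(eusilc, REP = 10, strata = "region",
                                       cluster = "hid", fpc = "N.households")

# Finer stratification by region and household size;
# recompute N.households for the new strata before drawing replicates
eusilc[, new_strata := paste(region, hsize, sep = "_")]
eusilc[, N.households := uniqueN(hid), by = new_strata]
eusilc.bootstrap <- rescaled.bootstrap(eusilc, REP = 10, strata = "new_strata",
                                       cluster = "hid", fpc = "N.households")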