The srvyr package provides a new way of calculating summary statistics on survey data, based on the dplyr package. There are three stages to using srvyr functions: creating a survey object, manipulating the data, and calculating survey statistics.
as_survey_design, as_survey_rep, and as_survey_twophase are used to create
surveys based on a data.frame and design variables, replicate weights, or a
two-phase design, respectively. Each is based on a function in the survey
package (svydesign, svrepdesign, twophase), and it is easy to modify code that uses
the survey package so that it works with the srvyr package. See
vignette("srvyr_vs_survey") for more details.
as_survey will choose between the other three
functions based on the arguments given to save some typing.
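As a minimal sketch of this first stage, the example below builds a stratified design from the apistrat example data that ships with the survey package. The choice of data set and of the design columns (stype, pw, fpc) is an assumption made for illustration; substitute the design variables of your own survey.

```r
library(srvyr)

# Example data from the survey package (assumed to be installed)
data(api, package = "survey")

# Explicit constructor: strata, weights and finite population correction
strat_design <- apistrat %>%
  as_survey_design(strata = stype, weights = pw, fpc = fpc)

# as_survey() picks the constructor from the arguments it is given
same_design <- apistrat %>%
  as_survey(strata = stype, weights = pw, fpc = fpc)

class(strat_design)
```

Both calls should produce a survey object of class tbl_svy that the functions described below can work with.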
Once you've created a survey object, you can manipulate the data as you would
using dplyr with a data.frame.
mutate modifies or creates a variable,
select and rename select or rename variables, and
filter keeps certain observations.
Note that arrange and two-table verbs such as bind_rows,
bind_cols, or any of the joins are not usable on survey objects
because they might require modifications to the definition of your survey. If
you need to use these functions, you should do so before you convert the
data.frame to a survey object.
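The manipulation stage might look like the sketch below. It again assumes the apistrat example data from the survey package; the new variable names (api_diff, enrollment) are hypothetical. Sorting is done on the plain data.frame before conversion, as recommended above.

```r
library(srvyr)
data(api, package = "survey")

# Reorder rows on the data.frame, before it becomes a survey object
sorted <- apistrat[order(apistrat$api00), ]

manipulated <- sorted %>%
  as_survey_design(strata = stype, weights = pw, fpc = fpc) %>%
  mutate(api_diff = api00 - api99) %>%   # create a new variable
  rename(enrollment = enroll) %>%        # rename an existing variable
  filter(stype != "H") %>%               # keep only some observations
  select(stype, api00, api99, api_diff, enrollment)
```

The design variables are tracked by the survey object itself, so they do not need to be listed in select.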
Now that you have your data set up correctly, you can calculate summary
statistics. To get the statistic over the whole population, use
summarise, or to calculate it over a set of groups, use
group_by first.
You can calculate the mean (with survey_mean), the total (with
survey_total), the quantile (with survey_quantile),
or a ratio (with survey_ratio). By default, srvyr will return the
statistic and the standard error around it in a data.frame, but with the
vartype parameter, you can also get a confidence interval ("ci"),
variance ("var"), or coefficient of variation ("cv").
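A short sketch of the summary stage follows, again assuming the apistrat example data from the survey package. The output column names on the left-hand side of each assignment are chosen here for illustration only.

```r
library(srvyr)
data(api, package = "survey")

strat_design <- apistrat %>%
  as_survey_design(strata = stype, weights = pw, fpc = fpc)

by_type <- strat_design %>%
  group_by(stype) %>%
  summarise(
    api00_mean   = survey_mean(api00, vartype = "ci"),       # mean with confidence interval
    enroll_total = survey_total(enroll, vartype = "cv"),     # total with coefficient of variation
    api_ratio    = survey_ratio(api00, api99, vartype = "se") # ratio with standard error
  )

by_type
```

Each statistic gets its own column, and the requested uncertainty measures are added as extra columns with a suffix such as _low, _upp, _cv or _se. Dropping the group_by step gives the same statistics for the whole population.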
Within summarise, you can also use
unweighted, which evaluates
a function on the data without taking the survey weights into account.
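For instance, unweighted can report the number of sampled rows behind each estimate. The sketch below assumes the apistrat example data from the survey package and loads dplyr so that n() is available; the column name n_schools is hypothetical.

```r
library(dplyr)
library(srvyr)
data(api, package = "survey")

counts <- apistrat %>%
  as_survey_design(strata = stype, weights = pw, fpc = fpc) %>%
  group_by(stype) %>%
  summarise(
    api00_mean = survey_mean(api00),   # weighted estimate
    n_schools  = unweighted(n())       # raw row count, weights ignored
  )

counts
```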