knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(surveysampler)
library(tibble)
Studies that evaluate survey sampling and analysis approaches require varied techniques and methods not found in a single package in R. This package provides utilities that aid and simplify these approaches to enable streamlined assessment and comparison of different survey sampling and analysis techniques.
This package has been developed in support of a Medecins Sans Frontieres UK study on the impact of probability proportional to population size (PPS) sampling on health and nutrition surveys, particularly in the context of humanitarian emergencies.
surveysampler is not yet available from CRAN, but the development version is available from GitHub and can be installed with:
if (!require(remotes)) install.packages("remotes")
remotes::install_github("ernestguevarra/surveysampler")
Given a dataset from a typical health and nutrition survey whose sample was drawn using probability proportional to population size (PPS), and a dataset listing all the potential sampling units with their population sizes from which the survey sample was taken, we develop two approaches to recreate an unweighted survey sample. These approaches allow readily available PPS-drawn datasets to be used in studies that test the impact of PPS sampling on the measurement of health and nutrition indicators.
Using the probability density of the populations of all the potential sampling units from which a specific survey sample was drawn, we accept or reject each sampling unit in the survey sample according to how well it matches that probability density. The idea is to pick the sampling units that we might have obtained from a simple random or systematic sample of the potential sampling units.
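The accept-reject idea can be illustrated in base R without the package. The sketch below is not the package's implementation: the toy populations, the use of `density()` on log population, and the acceptance rule (target density divided by sample density, rescaled so the largest ratio is 1) are all assumptions made for illustration.

```r
# Hypothetical sampling frame: population sizes of 500 potential sampling units
set.seed(1)
frame_pop <- round(rlnorm(500, meanlog = 6, sdlog = 0.8))

# Hypothetical PPS survey sample: 60 units, larger units more likely to be drawn
pps_idx <- sample(seq_along(frame_pop), size = 60, prob = frame_pop)
sample_pop <- frame_pop[pps_idx]

# Kernel density estimates of log population for the frame (target)
# and for the PPS sample (proposal)
d_target <- density(log(frame_pop))
d_sample <- density(log(sample_pop))

# Evaluate both densities at the sampled units by linear interpolation
f_target <- approx(d_target$x, d_target$y, xout = log(sample_pop), rule = 2)$y
f_sample <- approx(d_sample$x, d_sample$y, xout = log(sample_pop), rule = 2)$y

# Acceptance probability: density ratio rescaled so that its maximum is 1
ratio <- f_target / f_sample
accept_prob <- ratio / max(ratio)

# Accept each sampled unit with its acceptance probability
accepted <- runif(length(sample_pop)) < accept_prob
accepted_pop <- sample_pop[accepted]
```

Because PPS over-represents large units, the density ratio is smallest for large populations, so those units are rejected more often and the accepted units resemble a draw from the frame's population distribution.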
We developed the function accept_reject_psu() for this purpose. The function requires two datasets:
village_list
sample_data
The function can be used as follows:
accept_reject_psu( x = village_list, svy = sample_data, psu = c("id", "psu"), match = "cluster", pop = "population", verbose = FALSE, show_plot = TRUE )
and returns a plot of the accepted and rejected sampling units against the probability density of the populations, together with the simulated unweighted survey sample, as shown below:
accept_reject_psu( x = village_list, svy = sample_data, psu = c("id", "psu"), match = "cluster", pop = "population", verbose = FALSE, show_plot = TRUE )
Using a dataset of all potential sampling units and their population sizes from which a specific survey sample was drawn, we draw a simple random sample or a systematic sample and then match it with the survey sample based on propensity scores of their population sizes. The simulated survey sample is then made up of the sampling units in the survey sample that were either selected directly or matched to a drawn potential sampling unit that is not in the survey sample.
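The matching step can be sketched in base R as follows. This is a minimal illustration, not the package's code: the toy frame, the logistic regression on log population used as the propensity model, and the greedy nearest-neighbour matching without replacement are assumptions.

```r
# Hypothetical sampling frame of 300 potential sampling units
set.seed(2)
frame <- data.frame(id = 1:300, population = round(rlnorm(300, 6, 0.8)))

# Hypothetical PPS survey sample and a simple random sample from the same frame
svy_ids <- sample(frame$id, 40, prob = frame$population)
srs_ids <- sample(frame$id, 40)

# Propensity score: probability of being in the survey sample given population size
frame$in_svy <- as.integer(frame$id %in% svy_ids)
ps_model <- glm(in_svy ~ log(population), data = frame, family = binomial())
frame$ps <- predict(ps_model, type = "response")

# Units in both samples are kept directly
direct <- intersect(srs_ids, svy_ids)

# Remaining random-sample units need a match among the remaining survey units
unmatched <- setdiff(srs_ids, svy_ids)
pool <- setdiff(svy_ids, direct)

matched <- integer(0)
for (u in unmatched) {
  if (length(pool) == 0) break
  # Distance in propensity score between unit u and each survey unit still in the pool
  d <- abs(frame$ps[match(pool, frame$id)] - frame$ps[match(u, frame$id)])
  best <- pool[which.min(d)]
  matched <- c(matched, best)
  pool <- setdiff(pool, best)  # match without replacement
}

# Simulated unweighted sample: survey units selected directly or by matching
simulated_ids <- c(direct, matched)
```

Matching without replacement keeps every survey unit from being used more than once, at the cost of later matches being less close than earlier ones.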
We developed the function create_sample_psm() for this purpose, which can be used as follows:
create_sample_psm( x = village_list, svy = sample_data, psu = c("id", "psu"), match = "cluster", pop = "population" )
and returns a simulated unweighted survey sample, as shown below:
create_sample_psm( x = village_list, svy = sample_data, psu = c("id", "psu"), match = "cluster", pop = "population" )
If you find the surveysampler package useful, please cite it using the suggested citation provided by a call to the citation function as follows:
citation("surveysampler")
The surveysampler package is distributed under the GPL-3 license.
Feedback, bug reports and feature requests are welcome; file issues or seek support here. If you would like to contribute to the package, please see our contributing guidelines.
Please note that the surveysampler project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.