This repository contains all of the code and analyses for the paper titled “On the Reliability of Multiple Systems Estimation for the Quantification of Modern Slavery” (Binette and Steorts, 2021). It is structured as follows:
The R package MSETools provides a unified interface to multiple systems estimation (MSE) software. It implements best usage practices and computational speedups. Data from multiple systems estimation studies of human trafficking have been reproduced for illustrations and analyses.
The analyses folder contains the analyses and figures for Binette and Steorts (2021). Each analysis is provided as an Rmd document which can be knitted on any platform using the included cache. Figures are saved in png and pdf formats in subfolders. The cache can be regenerated by knitting the Rmd documents on a computing cluster using SLURM. To run the entire analysis from scratch, make sure that the MSETools package is installed (see instructions below) and use the command:
cd analyses && make clear_cache && srun make
Long-running programs on a cluster can be run within a detachable terminal (e.g. tmux) to avoid connection issues.
You can install the development version of MSETools from GitHub:
# install.packages("devtools")
devtools::install_github("OlivierBinette/MSETools")
The dga() function provides an interface to the dga package of Lum, Johndrow and Ball (2015). The package implements decomposable graphical models with hyper-Dirichlet priors and Bayesian model averaging. Here it has been re-implemented in Rcpp and extended to allow more flexible prior distributions.
The lcmcr() function provides an interface to the LCMCR package of Manrique-Vallier (2020), which implements the latent class model of Manrique-Vallier (2016). By default, MSETools initializes 200 parallel MCMC chains to provide cross-replication stability. This is necessary since the LCMCR Gibbs sampler fails to converge in some cases. Convergence diagnostics are available through MSETools::diagnostics().
The sparsemse() function provides an interface to the SparseMSE package of Chan, Silverman and Vincent (2019), which implements a Poisson log-linear approach with stepwise model selection and bootstrap confidence intervals.
estimates() computes point estimates and confidence intervals for a list of models.
batch.estimates() computes estimates as a SLURM job array for use on a cluster.
diagnostics() computes convergence diagnostics for lcmcr objects.
See Binette and Steorts (2021) for a description of the datasets reproduced herein.
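As a rough illustration of the Poisson log-linear idea mentioned for sparsemse() above, the following base R sketch fits a main-effects (independence) model to a toy three-list capture table and extrapolates to the unobserved cell. The counts, the list names A, B and C, and all variable names are made up for this example. It does not call MSETools or SparseMSE, and it omits the stepwise model selection and bootstrap intervals that SparseMSE provides.

```r
# Toy capture histories for three hypothetical lists A, B and C.
# Each row is an observed inclusion pattern (1 = on the list) with its count.
# The pattern (0, 0, 0) cannot be observed and is therefore absent.
# Counts are made up: they follow independence exactly with 1000 individuals.
captures <- data.frame(
  A     = c(1, 0, 0, 1, 1, 0, 1),
  B     = c(0, 1, 0, 1, 0, 1, 1),
  C     = c(0, 0, 1, 0, 1, 1, 1),
  count = c(360, 90, 40, 90, 40, 10, 10)
)

# Poisson log-linear model with main effects only (lists assumed independent).
fit <- glm(count ~ A + B + C, family = poisson(), data = captures)

# The intercept is the log expected count of the unobserved (0, 0, 0) cell.
n_unobserved <- unname(exp(coef(fit)["(Intercept)"]))

# Estimated population size = observed cases + estimated unobserved cases.
n_hat <- sum(captures$count) + n_unobserved
print(round(n_hat))
```

Because the toy counts were constructed to follow independence exactly, n_hat comes out at about 1000 here. Real lists are rarely independent, which is one reason to select among richer models rather than rely on this simplest one.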
library(MSETools)
Define a list of models fitted to the UK dataset:
models = list(lcmcr(UK), sparsemse(UK), dga(UK), independence(UK))
Compute estimates:
estimates(models)
Parallelize the computation of estimates on a computing cluster:
batch.estimates(models, njobs=4)