run_lcmcr: Calculate multiple systems estimation estimates using the...
In verdata: Analyze Data from the Truth Commission in Colombia

run_lcmcr

R Documentation

Calculate multiple systems estimation estimates using the Bayesian Non-Parametric Latent-Class Capture-Recapture model developed by Daniel Manrique-Vallier (2016).

Description

Calculates multiple systems estimation (MSE) estimates for a single stratum using the Bayesian non-parametric latent-class capture-recapture (LCMCR) model developed by Daniel Manrique-Vallier (2016).

Usage

run_lcmcr(
  stratum_data_prepped,
  stratum_name,
  min_n = 1,
  K,
  buffer_size,
  sampler_thinning,
  seed,
  burnin,
  n_samples,
  posterior_thinning
)

Arguments

`stratum_data_prepped`	A data frame with all records in the stratum of interest documented by sources considered valid for estimation (i.e., there should be no rows with all 0's). Columns indicating sources should be prefixed with `in_` and should be numeric with 1 indicating that an individual was documented in the source and 0 indicating that an individual was not documented in the source.
`stratum_name`	An identifier for the stratum.
`min_n`	The minimum number of records that must appear in a source for it to be considered valid for estimation. `min_n` must be greater than 0; the default value is 1.
`K`	The maximum number of latent classes to fit.
`buffer_size`	Size of the tracing buffer.
`sampler_thinning`	Thinning interval for the tracing buffer.
`seed`	Integer seed for the internal random number generator.
`burnin`	Number of burn-in iterations.
`n_samples`	Number of samples to be generated. One sample is taken every `posterior_thinning` iterations of the sampler. The final number of samples from the posterior is `n_samples` divided by 1,000.
`posterior_thinning`	Thinning interval for the sampler.
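
The input rules for `stratum_data_prepped` and `min_n` can be checked before calling `run_lcmcr`. The following is a minimal sketch in base R; `check_stratum` and the `toy` data are hypothetical and not part of verdata, and how `run_lcmcr` applies `min_n` internally is not shown here.

```r
# Hypothetical helper (not part of verdata): checks the documented input rules
# and returns the names of the sources with at least `min_n` records.
check_stratum <- function(stratum, min_n = 1) {
  stopifnot(is.data.frame(stratum), min_n > 0)

  # Source columns are identified by the `in_` prefix.
  source_cols <- grep("^in_", names(stratum), value = TRUE)
  if (length(source_cols) == 0) stop("no columns prefixed with `in_`")
  sources <- stratum[, source_cols, drop = FALSE]

  # Every source column must be numeric and contain only 0 or 1.
  is_binary <- vapply(
    sources,
    function(x) is.numeric(x) && all(x %in% c(0, 1)),
    logical(1)
  )
  if (!all(is_binary)) stop("source columns must be numeric 0/1")

  # No record may be absent from every source.
  if (any(rowSums(sources) == 0)) stop("rows with all 0's found")

  # Keep the sources that document at least `min_n` records.
  counts <- colSums(sources)
  names(counts)[counts >= min_n]
}

# Toy stratum: in_C documents no one, so it falls below min_n = 1.
toy <- data.frame(in_A = c(1, 0, 1), in_B = c(1, 1, 0), in_C = c(0, 0, 0))
check_stratum(toy)
```

Returning the qualifying source names, rather than only stopping on errors, makes it easy to see in advance which sources would count toward the estimate.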

Value

A data frame with four columns and `n_samples` divided by 1,000 rows. `N` holds the draws from the posterior distribution, `valid_sources` is a string indicating which sources were used in the estimation, `n_obs` is the number of observations in the stratum of interest, and `stratum_name` is the stratum identifier.
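
Because `N` holds posterior draws rather than a single number, a point estimate and credible interval have to be computed from it. The sketch below shows one way to do that in base R. The `draws` data frame is a made-up stand-in for the function's output (its `N` values and the format of `valid_sources` are assumptions), and `summarise_draws` is a hypothetical helper, not a verdata function.

```r
# Hypothetical stand-in for the data frame returned by run_lcmcr();
# the values are invented for illustration and are not real estimates.
draws <- data.frame(
  N = c(120, 131, 118, 140, 125, 129, 122, 135, 127, 133),
  valid_sources = "in_A,in_B,in_C",  # assumed format
  n_obs = 90,
  stratum_name = "my_stratum"
)

# Hypothetical helper: posterior median and equal-tailed credible interval.
summarise_draws <- function(draws, level = 0.95) {
  alpha <- (1 - level) / 2
  bounds <- quantile(draws$N, probs = c(alpha, 1 - alpha), names = FALSE)
  data.frame(
    stratum_name = draws$stratum_name[1],
    n_obs = draws$n_obs[1],
    N_median = median(draws$N),
    N_lower = bounds[1],
    N_upper = bounds[2]
  )
}

summarise_draws(draws)
```

An equal-tailed quantile interval is used here for simplicity; other interval types may suit a skewed posterior better.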

References

Manrique-Vallier, D. (2016). Bayesian population size estimation using Dirichlet process mixtures. Biometrics.

Examples


set.seed(19481210)
library(dplyr)
library(verdata)

# Simulate four sources for 100 individuals (1 = documented, 0 = not documented).
# sample() rescales `prob`, so the weights for in_A need not sum to 1.
in_A <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.45, 0.65))
in_B <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.5, 0.5))
in_C <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(0.75, 0.25))
# in_D documents no one, because the weight for 1 is 0.
in_D <- sample(c(0, 1), size = 100, replace = TRUE, prob = c(1, 0))

# Drop individuals not documented by any source (rows with all 0's).
my_stratum <- tibble::tibble(in_A, in_B, in_C, in_D) %>%
    dplyr::mutate(rs = rowSums(.)) %>%
    dplyr::filter(rs >= 1) %>%
    dplyr::select(-rs)

run_lcmcr(stratum_data_prepped = my_stratum, stratum_name = "my_stratum",
          K = 4, buffer_size = 10000, sampler_thinning = 1000, seed = 19481210,
          burnin = 10000, n_samples = 10000, posterior_thinning = 500)
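
The sampler settings in the call above imply a large number of iterations, which is worth estimating before running it. The arithmetic below is a rough sketch based on the argument descriptions. It assumes the burn-in iterations run first and are counted separately from the thinned sampling iterations.

```r
# Settings copied from the example call.
burnin <- 10000
n_samples <- 10000
posterior_thinning <- 500

# One draw is kept every `posterior_thinning` iterations, so collecting
# `n_samples` draws takes this many post-burn-in iterations.
sampling_iterations <- n_samples * posterior_thinning

# Assumption: burn-in iterations are run in addition to the sampling iterations.
total_iterations <- burnin + sampling_iterations

format(total_iterations, big.mark = ",")
```

Under that assumption the example runs about 5.01 million iterations, so lowering `n_samples` or `posterior_thinning` is the quickest way to shorten a trial run.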

verdata documentation built on June 8, 2025, 11:46 a.m.