eforensics: Election Forensics Finite Mixture Model
In UMeforensics/eforensics_public: Election Forensics: Positive Empirical Models of Election Fraud

Description Usage Arguments Value References Examples

This function estimates a finite mixture model of election fraud.

eforensics(formula1, formula2, formula3 = NULL, formula4 = NULL,
  formula5 = NULL, formula6 = NULL, data, eligible.voters = NULL,
  weights = NULL, mcmc, model = "qbl", parameters = "all",
  na.action = "exclude", get.dic = 1000, parComp = TRUE,
  autoConv = TRUE, max.auto = 10, mcmc.conv.diagnostic = "MCMCSE",
  mcmc.conv.parameters = "pi", mcmcse.conv.precision = 0.05,
  mcmcse.combine = FALSE)

`formula1`	an object of the class `formula` as used in `lm`. The dependent variable of this formula must be the number (counts) or proportion of votes for the party or candidate that won the election. If counts are used, the model must be from the binomial family (see the `model` parameter below). If proportions are provided, the model must be from the normal family (see the `model` parameter below).
`formula2`	an object of the class `formula` as used in `lm`. The dependent variable of this formula must be the number (counts) or proportion of abstentions. The type (count or proportion) must be the same as that of the dependent variable in `formula1`.
`formula3`	See description below
`formula4`	See description below
`formula5`	See description below
`formula6`	See description below

Formulas 3 to 6

There are four other possible formulas to use: formula3, formula4, formula5, and formula6. Each is an object of the class `formula` as used in `lm`, and each is identified by its left-hand side (LHS), which must name a latent variable of the model (see example). By specifying the LHS with that variable, the function automatically identifies which formula it is. The default for each is `NULL`, which means that the corresponding probability is not affected by election unit (ballot box, polling place, etc.) covariates.

formula3: The LHS must be mu.iota.m, the probability of incremental fraud by manufacturing votes.
formula4: The LHS must be mu.iota.s, the probability of incremental fraud by stealing votes from the opposition.
formula5: The LHS must be mu.chi.m, the probability of extreme fraud by manufacturing votes.
formula6: The LHS must be mu.chi.s, the probability of extreme fraud by stealing votes from the opposition.

`data`	a data.frame with the dependent variables (votes for the winner and abstention) and the covariates. If the dependent variables are counts, then it is necessary to provide the total number of eligible voters (see parameter `eligible.voters`).
`eligible.voters`	string with the name of the variable in the data that contains the number of eligible voters. Default is `NULL`, but it is required if the dependent variables (votes for the winner and abstention) are counts.
`weights`	Deprecated.
`mcmc`	a list containing `n.iter`, the number of iterations for the MCMC; `burn.in`, the burn-in period of the MCMC chain; `n.adapt`, the number of adaptive steps before the estimation (see `rjags`); and `n.chains`, an integer indicating the number of chains to use (default 1).
`model`	a string with the model ID to use in the estimation. There are three current choices: `qbl`, `bl`, and `rn`. `qbl` is the default and recommended choice. For a description of each model, see `ef_models_desc`.
`parameters`	a string vector with the names of the parameters to monitor. When `NULL`, it will monitor all the parameters, except the Z's. When `parameters='all'` (default), it will monitor all parameters, including Z, which is necessary to classify the observations as fraudulent cases or not.
`na.action`	Deprecated.
`get.dic`	Deprecated.
`parComp`	Logical. If `parComp = TRUE`, then chains are computed in parallel using the runjags parallel method. This opens `n.chains` instances of JAGS. In practice, a max of 4 unique chains can be run due to the way in which JAGS generates initial values. If `parComp = FALSE`, chains are run sequentially using the runjags interruptible method.
`autoConv`	Logical. If `autoConv = TRUE`, chains are run until convergence criteria are met. Currently, chains are run for a single period equal to `burn.in` iterations and monitored for `n.iter` iterations. If `mcmc.conv.diagnostic = "MCMCSE"`, MCMCSE values are calculated for each parameter in `mcmc.conv.parameters`. If all values are less than `mcmcse.conv.precision`, the convergence check stops and the chain is run for `n.iter` more iterations, monitoring all values specified by `parameters`. If the MCMCSE for any parameter is higher than `mcmcse.conv.precision`, the chain is run for `burn.in` + `n.iter` more iterations and the MCMCSE is checked again. This is repeated at most `max.auto` times. If the MCMCSE condition is not met within `max.auto` attempts, a warning message is printed and the chains are run for `n.iter` more iterations with all parameters monitored. If `mcmc.conv.diagnostic = "PSRF"`, the same procedure is followed, checking that all PSRF values are less than 1.05.
`max.auto`	Integer. Number of subsequent tries to achieve the convergence conditions outlined by `autoConv`. After `max.auto` failures, a warning is thrown and the chain is run `n.iter` more times monitoring all specified parameters.
`mcmc.conv.diagnostic`	a string with the method to use to evaluate convergence. Currently, `PSRF` and `MCMCSE` (default) are implemented.
`mcmc.conv.parameters`	string vector with the names of the parameters to check for convergence using the MCMC standard error. Default is `pi`.
`mcmcse.conv.precision`	numeric, the value of the precision criterion to evaluate convergence using the MCMC standard error. The MCMC std. error of all parameters included in `mcmc.conv.parameters` must be below the threshold defined by the value of `mcmcse.conv.precision` (default is 0.05) to pass the convergence diagnostic.
`mcmcse.combine`	boolean, if `TRUE`, the MCMCSE is computed after the chains are combined. Otherwise, the MCMC std. error is computed for each chain, and the maximum std. error of each parameter is used for the diagnostic.
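
The `autoConv` procedure described above can be sketched in base R as follows. This only illustrates the control flow and is not the package's implementation: `fake_sampler`, `batch_means_se`, and `run_until_converged` are hypothetical names, the sampler draws independent normal values instead of running JAGS, and the batch-means estimator is an assumed stand-in for whatever MCMCSE computation the package actually uses.

```r
set.seed(42)

# Hypothetical stand-in for an MCMC run: returns n draws of one parameter.
fake_sampler <- function(n) rnorm(n, mean = 0.9, sd = 0.05)

# Batch-means estimate of the Monte Carlo standard error of the chain mean.
batch_means_se <- function(x) {
  b <- floor(sqrt(length(x)))   # batch size
  a <- floor(length(x) / b)     # number of complete batches
  batch_means <- vapply(
    seq_len(a),
    function(k) mean(x[((k - 1) * b + 1):(k * b)]),
    numeric(1)
  )
  # variance of the mean = (b * var of batch means) / (a * b draws used)
  sqrt(b * var(batch_means) / (a * b))
}

# Repeat burn-in + monitoring until the standard error is below `precision`,
# giving up after `max.auto` attempts (assumed reading of the text above).
run_until_converged <- function(sampler, burn.in, n.iter,
                                precision = 0.05, max.auto = 10) {
  converged <- FALSE
  attempts <- 0
  while (!converged && attempts < max.auto) {
    attempts <- attempts + 1
    sampler(burn.in)            # burn-in draws are discarded
    draws <- sampler(n.iter)    # monitored draws used for the check
    converged <- batch_means_se(draws) < precision
  }
  if (!converged) {
    warning("convergence criterion not met after max.auto attempts")
  }
  # Final run of n.iter iterations, kept regardless of the outcome.
  list(converged = converged, attempts = attempts, final = sampler(n.iter))
}

result <- run_until_converged(fake_sampler, burn.in = 1000, n.iter = 1000)
result$converged
result$attempts
```

With a real sampler the check would be applied to every parameter named in `mcmc.conv.parameters`, and the loop would stop only when all of them pass.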

The function returns a nested list of class eforensics with length equal to the number of chains. Each sublist contains up to three named objects:

parameters: An mcmc object that contains the posterior draws for all monitored parameters except for the individual fraud classifications.
k.hat: A vector that contains the posterior modal classification for each observation. 1 corresponds to no fraud, 2 corresponds to incremental fraud, and 3 corresponds to extreme fraud.
piZi: A matrix with three columns that contains the posterior probability of belonging to each class for each observation.

If model = "qbl" or model = "bl", the proportion of frauds estimated at each observation is returned. These values can be accessed for object foo using attr(foo,"frauds"). This attribute is a two element list that contains the estimated proportion of votes that are Stolen and Manufactured. Posterior means, HPD intervals, and posterior quantiles are returned for each observation in the data set. These quantities are automatically aggregated over all chains.
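
To make the shape of the returned object concrete, the sketch below builds a small mock list that mimics only the `piZi` element described above and derives a classification from it. The mock values are random and purely illustrative, `mock_piZi` and `mock_fit` are hypothetical names, and averaging the class probabilities over chains assumes that all chains were monitored for the same number of iterations.

```r
set.seed(1)
n_obs <- 5

# Hypothetical helper: n rows of three class probabilities that sum to 1.
mock_piZi <- function(n) {
  p <- matrix(runif(n * 3), ncol = 3)
  p / rowSums(p)
}

# Mock of the documented structure: one sublist per chain, each with piZi.
mock_fit <- list(
  list(piZi = mock_piZi(n_obs)),
  list(piZi = mock_piZi(n_obs))
)

# Average the class probabilities over chains (equal chain lengths assumed).
avg_piZi <- Reduce(`+`, lapply(mock_fit, function(chain) chain$piZi)) /
  length(mock_fit)

# Modal class per observation, using the coding given in the text:
# 1 = no fraud, 2 = incremental fraud, 3 = extreme fraud.
k_hat <- apply(avg_piZi, 1, which.max)
class_labels <- c("no fraud", "incremental fraud", "extreme fraud")
data.frame(obs = seq_len(n_obs), class = class_labels[k_hat])
```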

Flegal, J. M., Haran, M., & Jones, G. L. (2008). Markov chain Monte Carlo: Can we trust the third significant figure? Statistical Science, 23(2), 250–260.
Brooks, S. P., & Gelman, A. (1998). General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics, 7(4), 434–455.

library(eforensics)

## simulate data
## -------------
set.seed(12345)
sim_data = ef_simulateData(n=250, nCov=1, nCov.fraud=1,
              model="bbl", overdispersion = 100, pi = c(.95,.04,.01))
data     = sim_data$data

## mcmc parameters
## ---------------
mcmc    = list(burn.in=1000, n.adapt=1000, n.iter=1000, n.chains=2)

## samples
## -------
## help(eforensics)

samples    = eforensics(
  w ~ x1.w ,
  a ~ x1.a,
  mu.iota.m ~ x1.iota.m,
  mu.iota.s ~ x1.iota.s,
  mu.chi.m  ~ x1.chi.m,
  mu.chi.s  ~ x1.chi.s,
  data=data,
  eligible.voters="N",
  model="qbl",
  mcmc=mcmc,
  parameters = "all",
  parComp = TRUE,
  autoConv = TRUE,
  max.auto = 10,
  mcmc.conv.diagnostic = "MCMCSE",
  mcmc.conv.parameters = c("pi"),
  mcmcse.conv.precision = .05,
  mcmcse.combine = FALSE
)

#Summaries for each of the monitored parameters
#Look at each chain separately
summary(samples)
#Combine the chains
summary(samples, join.chains=TRUE)

#Look at the estimated fraud proportions for each observation
attr(samples,"frauds")
#Look at Manufactured and Stolen separately
attr(samples,"frauds")$Manufactured
attr(samples,"frauds")$Stolen

#How accurate is the classification?

#Get the true categories
true_z <- sim_data$latent$z

#What is the modal estimate for the class?
num_z <- (samples[[1]]$piZi*1000) + (samples[[2]]$piZi*1000)
max_z <- apply(num_z,1,which.max)

#How accurate is the modal classification?
table(true_z,max_z)

#How accurately do we uncover the proportion of frauds for each observation?

#Manufactured
true_man <- ((true_z == 1)*0) + ((true_z == 2)*sim_data$latent$iota.m) + 
            ((true_z == 3)*sim_data$latent$chi.m)

#What is the posterior mean proportion of manufactured votes
pred_man <- attr(samples,"frauds")$Manufactured[,1]

#Are they close?
plot(true_man, pred_man, xlab = "True Proportion Manufactured Votes", 
     ylab = "Estimated Proportion Manufactured Votes")

#Stolen
true_stolen <- ((true_z == 1)*0) + ((true_z == 2)*sim_data$latent$iota.s) + 
               ((true_z == 3)*sim_data$latent$chi.s)

#What is the posterior mean proportion of stolen votes
pred_stolen <- attr(samples,"frauds")$Stolen[,1]

#Are they close?
plot(true_stolen, pred_stolen, xlab = "True Proportion Stolen Votes", 
     ylab = "Estimated Proportion Stolen Votes")
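
The confusion table printed by `table(true_z, max_z)` in the example can be reduced to a few summary numbers. The sketch below uses small made-up vectors in place of `true_z` and `max_z` so that it runs without fitting a model; the values are illustrative only and are not output from `eforensics`.

```r
# Made-up true and estimated classes (1 = no fraud, 2 = incremental, 3 = extreme).
true_z <- c(1, 1, 1, 2, 2, 3, 1, 1, 2, 3)
max_z  <- c(1, 1, 2, 2, 2, 3, 1, 1, 1, 3)

# Fix the levels so the table is always 3 x 3, even if a class never appears.
class_levels <- 1:3
conf_table <- table(true = factor(true_z, class_levels),
                    est  = factor(max_z, class_levels))

# Share of observations whose modal class matches the true class.
accuracy <- sum(diag(conf_table)) / sum(conf_table)

# Per-class recall: share of each true class that was recovered.
recall <- as.numeric(diag(conf_table) / rowSums(conf_table))

conf_table
accuracy
recall
```

Because fraud classes are usually rare, the per-class recall is often more informative than the overall accuracy, which is dominated by the no-fraud class.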

UMeforensics/eforensics_public documentation built on Oct. 31, 2019, 12:49 a.m.
