ef_simulateData: Simulate Election data


View source: R/ef_simulate_data.R

Description

This function simulates data sets from the Election Forensics Finite Mixture models.

Usage

ef_simulateData(n = 2000, nCov = 0, nCov.fraud = 0, model,
  pi = NULL, overdispersion = 10)
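A minimal sketch of a call, using only the arguments shown in the signature above. The argument values are hypothetical and chosen for illustration, and the package name `eforensics` is an assumption inferred from the repository name; the call is skipped if that package is not installed.

```r
# Hypothetical argument values, chosen only for illustration.
sim_args <- list(
  n = 500,                    # number of simulated election units
  model = "bl",               # one of the model strings listed under Arguments
  pi = c(0.90, 0.08, 0.02)    # three mixing proportions
)

# Check the documented constraint on pi before calling:
# three numbers between 0 and 1 that sum to 1.
stopifnot(
  length(sim_args$pi) == 3,
  all(sim_args$pi >= 0 & sim_args$pi <= 1),
  isTRUE(all.equal(sum(sim_args$pi), 1))
)

# Assumption: the package is installed under the name "eforensics".
if (requireNamespace("eforensics", quietly = TRUE)) {
  sim <- do.call(eforensics::ef_simulateData, sim_args)
  str(sim, max.level = 1)     # top-level components of the returned list
}
```

Arguments left out of `sim_args` (`nCov`, `nCov.fraud`, `overdispersion`) fall back to the defaults shown in the usage line.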

Arguments

n

integer, the number of sample points representing observations of election units (e.g., precincts, ballot boxes)

nCov

number of covariates affecting both turnout and votes for the winner in each election unit

nCov.fraud

number of covariates affecting the occurrence of fraud (either incremental or extreme) in both turnout and votes for the winner

model

string specifying the model from which to simulate data. Select one of the following models:

bl

Simulate data from the binomial-logistic model of election frauds. Data returned are the

bl_none

Simulate data from the binomial-logistic model of election frauds with no covariates on the frauds.

rn_no_alpha

Simulate data from the restricted normal model with no alpha.

rn

Simulate data from the restricted normal model.

rn_wcounts

Simulate data from the restricted normal model of election frauds and return counts instead of proportions.

bbl

Simulate data from the beta-binomial-logistic model of election frauds, which produces randomly overdispersed counts. The overdispersion argument must be greater than zero and controls the amount of overdispersion: smaller values lead to more random dispersion in the simulated data.

pi

vector of three numbers between 0 and 1 that must sum to 1. If NULL (default), this vector is randomly generated.

overdispersion

numeric, degree of overdispersion of the distributions. Higher values are less overdispersed.
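The inverse relationship between the overdispersion value and the spread of the counts can be illustrated with a small base R sketch. It assumes a common beta-binomial parameterization in which the binomial probability is drawn from a Beta distribution with shape parameters `mu * overdispersion` and `(1 - mu) * overdispersion`. The helper `r_betabinom` is hypothetical, and the package's internal parameterization may differ.

```r
set.seed(1)

# Hypothetical helper, not part of the package: draw beta-binomial counts
# with mean proportion `mu` and a precision-like `overdispersion` parameter.
r_betabinom <- function(n, size, mu, overdispersion) {
  p <- rbeta(n,
             shape1 = mu * overdispersion,
             shape2 = (1 - mu) * overdispersion)
  rbinom(n, size = size, prob = p)
}

low   <- r_betabinom(5000, size = 100, mu = 0.5, overdispersion = 2)
high  <- r_betabinom(5000, size = 100, mu = 0.5, overdispersion = 200)
plain <- rbinom(5000, size = 100, prob = 0.5)  # no overdispersion, for reference

# Sample variances: the smaller overdispersion value spreads the counts more,
# and the larger value moves the counts toward the plain binomial.
c(overdispersion_2 = var(low),
  overdispersion_200 = var(high),
  binomial = var(plain))
```

Under this parameterization the Beta draws concentrate around `mu` as the value grows, which is why higher values are less overdispersed.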

Value

The function returns a list containing a data frame with the simulated data and a sublist with the parameters used to generate the data:

parameters

A named vector with the true values used to simulate the data. Values include the number of observations, values for the mixing parameters, and true values for the intercepts and coefficients associated with the covariates for each of the six no-fraud and fraud components of the model.

latent

A named list with the true values of latent quantities used to generate the data. These are simulated at the observation level. These values include:

data

A data.frame that includes the simulated data under the defined arguments. This data.frame may include:


UMeforensics/eforensics_public documentation built on Oct. 31, 2019, 12:49 a.m.