ef_simulateData: Simulate Election data
In UMeforensics/eforensics_public: Election Forensics: Positive Empirical Models of Election Fraud

This function simulates data sets from the Election Forensics Finite Mixture models.

ef_simulateData(n = 2000, nCov = 0, nCov.fraud = 0, model, pi = NULL, overdispersion = 10)

`n`	integer, the number of sample points representing observations of election units (e.g., precincts, ballot boxes)
`nCov`	number of covariates affecting both turnout and votes for the winner in each election unit
`nCov.fraud`	number of covariates affecting the occurrence of frauds (either incremental or extreme) in both turnout and votes for the winner
`model`	string specifying the model from which to simulate data. Select one of the following models:
bl Simulate data from the binomial-logistic model of election frauds. Data returned are the
bl_none Simulate data from the binomial-logistic model of election frauds with no covariates on the frauds.
rn_no_alpha Simulate data from the restricted normal model with no alpha.
rn Simulate data from the restricted normal model.
rn_wcounts Simulate data from the restricted normal model of election frauds and return counts instead of proportions.
bbl Simulate data from the beta-binomial-logistic model of election frauds. Simulates data from randomly overdispersed counts. `overdispersion` is a value greater than zero that sets the amount of overdispersion; smaller values lead to larger amounts of random dispersion within the simulated data.
`pi`	vector of three numbers between 0 and 1 that must sum to 1. If `NULL` (default), this vector is randomly generated
`overdispersion`	numeric, degree of overdispersion of the distributions. Higher values are less overdispersed.
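
To make the direction of `overdispersion` concrete, here is a minimal base-R sketch; it is not the package's code. It assumes a common beta-binomial parameterization in which each unit's success probability is drawn from Beta(mu * phi, (1 - mu) * phi), with `phi` playing the role of `overdispersion`. The helper name `sim_betabinom` and the values of `mu`, `N`, and `phi` are hypothetical, and the package may parameterize the distribution differently.

```r
# Illustrative only: assumed beta-binomial parameterization, not taken from eforensics.
sim_betabinom <- function(n, N, mu, phi) {
  # Unit-level probabilities scatter around mu; a larger phi concentrates them.
  p <- rbeta(n, shape1 = mu * phi, shape2 = (1 - mu) * phi)
  # Counts are binomial given each unit's own probability.
  rbinom(n, size = N, prob = p)
}

set.seed(1)
low_phi  <- sim_betabinom(n = 5000, N = 1000, mu = 0.6, phi = 2)
high_phi <- sim_betabinom(n = 5000, N = 1000, mu = 0.6, phi = 200)

# Variance of the simulated proportions: the smaller phi gives the larger spread.
var_low  <- var(low_phi / 1000)
var_high <- var(high_phi / 1000)
print(c(phi_2 = var_low, phi_200 = var_high))
```

Under this assumed parameterization, the smaller value of `phi` produces visibly more spread in the simulated proportions, which matches the documented behaviour that higher values are less overdispersed.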

The function returns a list containing the simulated data and the quantities used to generate them:

parameters: A named vector with the true values used to simulate the data. Values include the number of observations, values for the mixing parameters, and true values for the intercepts and coefficients associated with the covariates for each of the six no fraud and fraud components of the model.
latent: A named list with the true values of latent quantities used to generate the data. These are simulated at the observation level. These values include:

z The true classification for each observation: 1 is a non-fraudulent observation, 2 is an incrementally fraudulent observation, and 3 is an extremely fraudulent observation.
tau The proportion of turnout for each observation. This is associated with non-fraudulent outcomes.
nu The proportion of votes for the winner for each observation. This is associated with non-fraudulent outcomes.
iota.m Conditional on being classified as incrementally fraudulent, the proportion of fraudulent turnout.
iota.s Conditional on being classified as incrementally fraudulent, the proportion of fraudulent votes for the winner.
chi.m Conditional on being classified as extremely fraudulent, the proportion of fraudulent turnout.
chi.s Conditional on being classified as extremely fraudulent, the proportion of fraudulent votes for the winner.
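
As a sketch of how `pi` and `z` could relate, the following base-R snippet draws class labels for a mixture with three components. It assumes `pi` is ordered as (no fraud, incremental fraud, extreme fraud) to match the coding of `z`; that ordering, the helper `draw_z`, and the values in `pi_true` are assumptions for illustration, not the package's implementation.

```r
# Illustrative only: hypothetical helper, not part of eforensics.
draw_z <- function(n, pi) {
  # pi must hold three probabilities that sum to 1.
  stopifnot(length(pi) == 3, all(pi >= 0), all(pi <= 1),
            isTRUE(all.equal(sum(pi), 1)))
  sample(1:3, size = n, replace = TRUE, prob = pi)
}

set.seed(42)
pi_true <- c(0.90, 0.08, 0.02)  # assumed order: no fraud, incremental, extreme
z <- draw_z(2000, pi_true)

# Empirical class shares should be close to pi_true when n is large.
shares <- as.numeric(table(factor(z, levels = 1:3))) / length(z)
print(round(shares, 3))
```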

data: A data.frame that includes the simulated data under the defined arguments. This data.frame may include:

w The number or proportion of votes for the winner at each observation.
a The number or proportion of abstentions at each observation.
x Covariates. Each covariate is indexed by the parameter it is associated with in the model and the number. For example, x1.w is the first covariate associated with the proportion of non-fraudulent votes for the winner.
N If data is generated by a count model, this is the number of eligible voters at each observation.
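
A call and a first look at the result might go as follows. This is a hedged sketch: it assumes the package built from UMeforensics/eforensics_public is installed under the name `eforensics`, and that the returned list uses the element names `parameters`, `latent`, and `data` as described above. The argument values are arbitrary.

```r
# Assumption: the package is installed under the name "eforensics".
has_pkg <- requireNamespace("eforensics", quietly = TRUE)

if (has_pkg) {
  set.seed(123)
  # Simulate 500 election units from the binomial-logistic model
  # with one covariate on turnout/votes and one on fraud.
  sim <- eforensics::ef_simulateData(n = 500, nCov = 1, nCov.fraud = 1, model = "bl")

  str(sim$data)              # simulated w, a, covariates, and N for count models
  print(table(sim$latent$z)) # true class of each observation (1, 2, or 3)
  print(sim$parameters)      # true values used to generate the data
} else {
  message("eforensics is not installed; skipping the example.")
}
```

Comparing `sim$latent$z` and `sim$parameters` against estimates from a fitted model is the usual reason to simulate data with known true values.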

UMeforensics/eforensics_public documentation built on Oct. 31, 2019, 12:49 a.m.

