RRgen: Generate randomized response data
In danheck/RRreg: Correlation and Regression Analyses for Randomized Response Data

Description

The method RRgen generates data according to a specified RR model, e.g., "Warner". True states are either provided by a vector trueState or drawn randomly from a Bernoulli distribution. Useful for simulation and testing purposes, e.g., power analysis.

Usage

RRgen(
  n,
  pi.true,
  model,
  p,
  complyRates = c(1, 1),
  sysBias = c(0, 0),
  groupRatio = 0.5,
  Kukrep = 1,
  trueState = NULL
)

Arguments

`n`	sample size of generated data
`pi.true`	true proportion in population (a vector for m-categorical `"FR"` or `"custom"` model)
`model`	specifies the RR model, one of: `"Warner"`, `"UQTknown"`, `"UQTunknown"`, `"Mangat"`, `"Kuk"`, `"FR"`, `"Crosswise"`, `"Triangular"`, `"CDM"`, `"CDMsym"`, `"SLD"`, `"mix.norm"`, `"mix.exp"`, `"custom"`. See `vignette("RRreg")` for details.
`p`	randomization probability (depending on model, see `RRuni` for details)
`complyRates`	vector with two values giving the proportions of carriers and non-carriers who adhere to the instructions, respectively
`sysBias`	probability of responding 'yes' (coded as 1) in case of non-compliance for carriers and non-carriers of the sensitive attribute, respectively. If `sysBias=c(0,0)`, noncompliant carriers and non-carriers systematically give the nonsensitive response 'no' (also known as self-protective (SP) 'no' responses). If `sysBias=c(0,0.5)`, noncompliant carriers always respond 'no' whereas noncompliant non-carriers randomly select a response category. Note that `sysBias = c(0.5,0.5)` might be the best choice for `Kuk` and `Crosswise`. For the m-categorical `"FR"` or `"custom"` model, `sysBias` can be given as a probability vector for categories 0 to (m-1).
`groupRatio`	proportion of participants in group 1. Only required for two-group models, e.g., `SLD` and `CDM`
`Kukrep`	Number of repetitions of Kuk's procedure (how often red and black cards are drawn)
`trueState`	optional vector containing the true states of participants, which will be randomized according to the defined procedure (1 for carriers and 0 for noncarriers of the sensitive attribute; for `FR`: values from 0, 1, ..., M-1, where M is the number of response categories). If specified, `n` and `pi.true` are ignored.

Details

If trueState is specified, the randomized response procedure is simulated for this vector; otherwise, a random vector of length n with true proportion pi.true is drawn. Respondents' response biases can be simulated by adjusting the compliance rates: if complyRates is set to c(1,1), all respondents adhere to the randomization procedure. If one or both rates are smaller than 1, sysBias determines whether noncompliant respondents systematically choose the nonsensitive category or whether they answer randomly.
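The interplay of complyRates and sysBias can be illustrated with a minimal base-R sketch for Warner's model. This is not the package's implementation: the helper `simulate_warner` and its argument names are hypothetical, and it assumes that p is the probability of answering the sensitive statement directly (with the negated statement answered otherwise).

```r
# Hypothetical helper (not part of RRreg): Warner's model with noncompliance.
simulate_warner <- function(n, pi_true, p,
                            comply = c(1, 1), sys_bias = c(0, 0)) {
  # True states: 1 = carrier, 0 = non-carrier
  true <- rbinom(n, 1, pi_true)

  # Assumption: with probability p the sensitive statement is answered,
  # otherwise its negation, so the response is flipped
  direct <- rbinom(n, 1, p)
  response <- ifelse(direct == 1, true, 1 - true)

  # Compliance depends on the true state (carriers first, then non-carriers)
  comply_prob <- ifelse(true == 1, comply[1], comply[2])
  complies <- rbinom(n, 1, comply_prob) == 1

  # Noncompliant respondents answer 'yes' with the state-specific bias probability
  bias_prob <- ifelse(true == 1, sys_bias[1], sys_bias[2])
  biased <- rbinom(n, 1, bias_prob)
  response[!complies] <- biased[!complies]

  data.frame(true = true, response = response)
}

set.seed(1)
# 80% of carriers comply; noncompliant carriers always respond 'no'
dat <- simulate_warner(1000, pi_true = 0.3, p = 0.7,
                       comply = c(0.8, 1), sys_bias = c(0, 0))
colMeans(dat)
```

With `comply = c(1, 1)` the bias step has no effect, which corresponds to the fully compliant case described above.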

SLD - to generate data according to the stochastic lie detector with the proportion t of honest carriers, parameters are set to complyRates=c(t,1) and sysBias=c(0,0)

CDM - to generate data according to the cheating detection model with the proportion gamma of cheaters, parameters are set to complyRates=c(1-gamma,1-gamma) and sysBias=c(0,0)
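As a hedged sketch of the CDM parameterization just described, the arguments for a cheating proportion gamma could be assembled as follows. The values of `n`, `pi.true`, and `gamma` are illustrative only, and `p = c(.2, .8)` (one randomization probability per group) is an assumption that mirrors the two-group SLD usage; see `RRuni` for the definition of `p` in each model. The call itself only runs if RRreg is installed.

```r
# Illustrative proportion of cheaters (hypothetical value)
gamma <- 0.2

# Argument list following the CDM rule: both compliance rates equal 1 - gamma,
# and noncompliant respondents always answer 'no'
cdm_args <- list(
  n = 1000,
  pi.true = 0.3,
  model = "CDM",
  p = c(0.2, 0.8),               # assumption: one value per group
  complyRates = c(1 - gamma, 1 - gamma),
  sysBias = c(0, 0),
  groupRatio = 0.5
)

# Generate data only when the package is available
if (requireNamespace("RRreg", quietly = TRUE)) {
  genData <- do.call(RRreg::RRgen, cdm_args)
  table(genData$group)
}
```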

Value

data.frame including the variables true and response (and for SLD and CDM a third variable group)
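To show how the two columns relate, the following base-R sketch builds a data frame with the same `true` and `response` variables for a fully compliant Warner design and recovers the prevalence from the observed 'yes' rate. The data are generated by hand rather than by RRgen, and the sketch assumes that p is the probability of answering the sensitive statement directly; the moment estimator is the standard one for Warner's model, not necessarily what the package's estimation functions compute internally.

```r
set.seed(123)
n <- 10000
p <- 0.7          # assumed probability of answering the sensitive statement
pi_true <- 0.3

# Hand-made stand-in for RRgen output with columns 'true' and 'response'
true <- rbinom(n, 1, pi_true)
direct <- rbinom(n, 1, p)
dat <- data.frame(true = true,
                  response = ifelse(direct == 1, true, 1 - true))

# Observed proportion of 'yes' responses
lambda_hat <- mean(dat$response)

# Warner moment estimator: invert lambda = p * pi + (1 - p) * (1 - pi)
pi_hat <- (lambda_hat - (1 - p)) / (2 * p - 1)

c(true = mean(dat$true), estimate = pi_hat)
```

The estimator is undefined for p = 0.5, where responses carry no information about the true state.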

Examples

# Generate responses of 1000 people according to Warner's model;
# every participant complies with the RR procedure
genData <- RRgen(n = 1000, pi.true = .3, model = "Warner", p = .7)
colMeans(genData)

# use Kuk's model with two decks of cards,
# p gives the proportions of red cards for carriers/noncarriers
genData <- RRgen(n = 1000, pi.true = .4, model = "Kuk", p = c(.4, .7))
colMeans(genData)

# Stochastic Lie Detector (SLD):
# Only 80% of carriers answer according to the RR procedure
genData <- RRgen(
  n = 1000, pi.true = .2, model = "SLD", p = c(.2, .8),
  complyRates = c(.8, 1), sysBias = c(0, 0)
)
colMeans(genData)

danheck/RRreg documentation built on Dec. 3, 2022, 7:50 p.m.