RRuni: Univariate analysis of randomized response data


RRuni    R Documentation

Univariate analysis of randomized response data

Description

Analyse a data vector response with a specified RR model (e.g., Warner) and a known randomization probability p.

Usage

RRuni(response, data, model, p, group = NULL, MLest = TRUE, Kukrep = 1)

Arguments

response

either vector of responses containing 0='no' and 1='yes' or name of response variable in data. For the Forced Response (FR) model, response values are integers from 0 to (m-1), where 'm' is the number of response categories. In Kuk's card playing method (Kuk), the observed response variable gives the number of red cards.

data

optional data.frame containing the response variable

model

defines RR model. Available models: "Warner", "UQTknown", "UQTunknown", "Mangat", "FR", "Kuk", "Crosswise", "Triangular", "CDM", "CDMsym", "SLD", "mix.norm", "mix.exp", "mix.unknown", or "custom". See argument p or type vignette('RRreg') for detailed specifications.

p

randomization probability (see details or vignette("RRreg"))

group

a group vector of the same length as response containing values 1 or 2, only required for two-group models, which specify different randomization probabilities for two groups, e.g., CDM or SLD. If a data.frame data is provided, the variable group is searched within it.

MLest

whether to use optim to obtain maximum likelihood (ML) estimates instead of moment estimates (only relevant if the moment estimate of pi falls outside of [0,1])

Kukrep

number of repetitions of Kuk's card-drawing method
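To illustrate why MLest can matter, the following base-R sketch uses Warner's design, where the probability of a 'yes' response is p * pi + (1 - p) * (1 - pi). The data are hypothetical, and the code is only an illustration of the idea; it is not the RRreg implementation and does not call optim the way the package may do internally.

```r
# Moment vs. bounded ML estimate under Warner's design (illustrative sketch,
# not the RRreg implementation).
p <- 0.7                               # probability of receiving the sensitive question
response <- c(rep(1, 20), rep(0, 80))  # hypothetical data: 20% observed 'yes'

lambda_hat <- mean(response)           # observed proportion of 'yes' responses

# Moment estimator: solve lambda = p * pi + (1 - p) * (1 - pi) for pi
pi_moment <- (lambda_hat - (1 - p)) / (2 * p - 1)

# Bounded ML: maximize the Bernoulli log-likelihood over pi in [0, 1]
loglik <- function(pi) {
  lambda <- p * pi + (1 - p) * (1 - pi)
  sum(dbinom(response, size = 1, prob = lambda, log = TRUE))
}
pi_ml <- optimize(loglik, interval = c(0, 1), maximum = TRUE)$maximum

pi_moment  # negative, i.e., outside of [0, 1]
pi_ml      # close to the boundary value 0
```

With an observed 'yes' rate below 1 - p, the moment estimate is negative, whereas the likelihood maximized over [0,1] stays at the boundary.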

Details

Each RR design model differs in the definition of the randomization probability p, which is defined as a single probability for

  • "Warner": Probability of receiving the sensitive question

  • "Mangat": Prob. for non-carriers to respond truthfully (i.e., with No=0)

  • "Crosswise": Probability to respond 'yes' to irrelevant second question (coding of responses: 1=['no-no' or 'yes-yes']; 0=['yes-no' or 'no-yes'])

  • "Triangular": Probability to respond 'yes' to irrelevant second question (coding of responses: 0='no' to both questions (='circle'); 1='yes' to at least one question ('triangle'))

and as a two-valued vector of probabilities for

  • "Kuk": Probability of red cards in first and second set, respectively (red=1, black=0);

  • Unrelated Question ("UQTknown"): Prob. to respond to sensitive question and known prevalence of 'yes' responses to unrelated question

  • Unrelated Question ("UQTunknown"): Prob. to respond to sensitive question in group 1 and 2, respectively

  • Cheating Detection ("CDM"): Prob. to be prompted to say yes in group 1 and 2, respectively

  • Symmetric CDM ("CDMsym"): 4-valued vector: Prob. to be prompted to say 'yes'/'no' in group 1 and 'yes'/'no' in group 2

  • Stochastic Lie Detector ("SLD"): Prob. for noncarriers to reply with 0='no' in group 1 and 2, respectively

  • Forced Response model ("FR"): m-valued vector (m=number of response categories) with the probabilities of being prompted to select response categories 0, 1, ..., m-1, respectively (requires sum(p)<1)

  • RR as misclassification ("custom"): a square misclassification matrix is specified, where the entry p[i,j] defines the probability of responding i (i-th row) given a true state of j (j-th column) (see getPW)
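As a sketch of the misclassification view, Warner's design can be written as a 2 x 2 matrix in base R. The ordering of rows and columns (state 0 = 'no' first, then 1 = 'yes') and the observed proportions are assumptions made for illustration; the matrices actually used by the package are returned by getPW.

```r
# Warner's design written as a 2 x 2 misclassification matrix (sketch).
# Assumption: rows and columns are ordered as 0 ('no'), then 1 ('yes').
p <- 0.7
P <- matrix(c(p,     1 - p,   # column 1: true state 0 -> P(resp 0), P(resp 1)
              1 - p, p),      # column 2: true state 1 -> P(resp 0), P(resp 1)
            nrow = 2)

# Each column is a probability distribution over the observed responses
stopifnot(all(abs(colSums(P) - 1) < 1e-12))

# Hypothetical observed response proportions for categories 0 and 1
observed <- c(0.58, 0.42)

# Moment estimate of the true-state proportions: solve P %*% truth = observed
truth_hat <- solve(P, observed)
truth_hat[2]  # estimated prevalence pi of the sensitive attribute
```

A matrix of this shape is what the description above suggests passing as p with model = "custom".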

For the continuous RR models:

  • "mix.norm": 3-valued vector - Prob. to respond to sensitive question and mean and SD of the masking normal distribution of the unrelated question

  • "mix.exp": 2-valued vector - Prob. to respond to sensitive question and mean of the masking exponential distribution of the unrelated question

  • "mix.unknown": 2-valued vector - Prob. of responding to sensitive question in group 1 and 2, respectively
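The mixture idea behind "mix.norm" can be sketched with simulated data in base R. The sketch assumes that each respondent answers the sensitive question with probability p[1] and otherwise reports a draw from a normal distribution with mean p[2] and SD p[3]; the sample size, the true mean of 2, and all variable names are hypothetical, and the simple moment estimator shown is not necessarily what RRuni computes.

```r
# Continuous RR as a two-component mixture (illustrative sketch).
set.seed(1)
n <- 5000
p <- c(0.7, 0, 1)  # P(sensitive question), mean and SD of the masking normal

true_values <- rnorm(n, mean = 2, sd = 1)         # hypothetical sensitive attribute
masking     <- rnorm(n, mean = p[2], sd = p[3])   # draws from the masking distribution
answers_sensitive <- rbinom(n, size = 1, prob = p[1]) == 1
response <- ifelse(answers_sensitive, true_values, masking)

# E[response] = p[1] * mu_true + (1 - p[1]) * p[2], solved for mu_true
mu_hat <- (mean(response) - (1 - p[1]) * p[2]) / p[1]
mu_hat  # should be near the simulated true mean of 2
```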

Value

an RRuni object, which can be analyzed using summary

See Also

vignette('RRreg') or https://www.dwheck.de/vignettes/RRreg.html for a detailed description of the RR models and the appropriate definition of p

Examples

# Generate responses of 1000 people according to Warner's model
# with an underlying true proportion of .3
df <- RRgen(n = 1000, pi = .3, model = "Warner", p = .7)
head(df)

# Analyse univariate data to estimate prevalence 'pi'
estimate <- RRuni(response = df$response, model = "Warner", p = .7)
summary(estimate)

# Generate data in line with the Stochastic Lie Detector
# assuming that 80% of the carriers and all noncarriers comply
# with the instructions (complyRates = c(.8, 1))
df2 <- RRgen(
  n = 1000, pi = .3, model = "SLD", p = c(.2, .8),
  complyRates = c(.8, 1), groupRatio = 0.4
)
estimate2 <- RRuni(
  response = df2$response, model = "SLD",
  p = c(.2, .8), group = df2$group
)
summary(estimate2)


danheck/RRreg documentation built on Dec. 3, 2022, 7:50 p.m.