In danheck/RRreg: Correlation and Regression Analyses for Randomized Response Data

RRcor

R Documentation

Bivariate correlations including randomized response variables

Description

`RRcor` calculates bivariate Pearson correlations between variables, each of which may be measured directly or with a randomized response (RR) design.

Usage

RRcor(
  x,
  y = NULL,
  models,
  p.list,
  group = NULL,
  bs.n = 0,
  bs.type = c("se.n", "se.p", "pval"),
  nCPU = 1
)

Arguments

`x`	a numeric vector, matrix or data frame.
`y`	`NULL` (default) or a vector, matrix or data frame with dimensions compatible with `x`.
`models`	a vector defining which RR design is used for each variable. Must be in the same order as variables appear in `x` and `y` (by columns). Available discrete models: `Warner`, `Kuk`, `FR`, `Mangat`, `UQTknown`, `UQTunknown`, `Crosswise`, `Triangular`, `SLD` and `direct` (i.e., no randomized response design). Available continuous models: `mix.norm`, `mix.exp`.
`p.list`	a `list` containing the randomization probabilities of the RR models defined in `models`. Variables with the `direct` model (i.e., no randomized response) can either be omitted from `p.list` or, if they are included, their randomization probabilities `p` are ignored. See `RRuni` for a detailed specification of `p`.
`group`	a matrix defining the group membership of each participant (values 1 and 2) for all multiple-group models (`SLD`, `UQTunknown`). If only one of these models is included in `models`, a vector can be used. For more than one such model, each column should contain one grouping variable.
`bs.n`	number of bootstrap samples used to obtain bootstrapped standard errors.
`bs.type`	to obtain bootstrapped standard errors, use `"se.p"` for the parametric and/or `"se.n"` for the nonparametric bootstrap. Use `"pval"` to obtain p-values from the parametric bootstrap (assuming a true correlation of zero). Note that `bs.n` has to be larger than 0. The parametric bootstrap assumes that the continuous variable is normally distributed within the groups defined by the true state of the RR variable. For polytomous forced response (`FR`) designs, the RR variable is assumed to be interval scaled (i.e., to have equally spaced categories).
`nCPU`	only relevant for the bootstrap: either the number of CPU cores or a cluster initialized via `makeCluster`.
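
The nonparametric bootstrap requested with `bs.type = "se.n"` can be pictured as resampling participants with replacement and recomputing the correlation in each resample. The base-R sketch below illustrates that idea for an ordinary Pearson correlation only. It is not the code that `RRcor` runs, and the helper name `boot_se_cor` is hypothetical.

```r
# Hypothetical helper (not part of RRreg): nonparametric bootstrap
# standard error of a plain Pearson correlation.
boot_se_cor <- function(x, y, bs.n = 200) {
  n <- length(x)
  samples <- replicate(bs.n, {
    idx <- sample.int(n, n, replace = TRUE) # resample participants with replacement
    cor(x[idx], y[idx])                     # recompute the estimate
  })
  # the SE is the standard deviation of the resampled correlations
  list(se = sd(samples), samples = samples)
}

set.seed(1)
x <- rnorm(500)
y <- 0.5 * x + rnorm(500)
res <- boot_se_cor(x, y, bs.n = 200)
res$se
```

The returned `samples` vector plays a role similar to the `samples.n` component described under Value, and `se` to `rSE.n`.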

Details

Correlations of RR variables are calculated using the method of Fox & Tracy (1984), which interprets the variance induced by the RR procedure as uncorrelated measurement error. Because this error is independent, the correlation can be corrected to obtain an unbiased estimator.
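
To make the measurement-error idea concrete, the following base-R simulation applies such a correction by hand for a Warner design. It is a minimal sketch under stated assumptions (a single dichotomous RR variable, one directly measured covariate, arbitrarily chosen simulation values), not a reproduction of the internals of `RRcor`. All variable names are illustrative.

```r
set.seed(123)
n <- 100000
p <- 0.8                        # probability of answering the sensitive question
t_true <- rbinom(n, 1, 0.3)     # true (unobserved) sensitive attribute
y <- t_true + rnorm(n)          # directly measured covariate

# Warner design: report the true state with probability p, else its negation
truthful <- rbinom(n, 1, p) == 1
resp <- ifelse(truthful, t_true, 1 - t_true)

# Naive correlation of the observed responses is attenuated by the RR noise
r_naive <- cor(resp, y)

# Rescale the response so that E[t_hat | t_true] = t_true; the remaining
# randomization noise is uncorrelated with y, so cov(t_hat, y) estimates
# cov(t_true, y)
t_hat <- (resp - (1 - p)) / (2 * p - 1)
pi_hat <- mean(t_hat)           # estimated prevalence of the attribute

# Divide by the SD of the true state, sqrt(pi * (1 - pi)), rather than
# by the inflated SD of the noisy responses
r_corrected <- cov(t_hat, y) / (sqrt(pi_hat * (1 - pi_hat)) * sd(y))

c(true = cor(t_true, y), naive = r_naive, corrected = r_corrected)
```

In this simulation the naive correlation is clearly smaller than the correlation with the true state, while the corrected value should be close to it.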

Note that the continuous RR model `mix.norm` with the randomization parameter `p = c(p.truth, mean, SD)` assumes that participants either answer the sensitive question with probability `p.truth` or otherwise give a response drawn from a masking distribution with known mean and SD. The estimated correlation depends only on this mean and SD and does not require normality. However, the assumption of normality is used in the parametric bootstrap to obtain standard errors.
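
One way to see why only the mean and SD of the masking distribution matter is a moment-based correction, sketched below in base R. This is an illustration under assumptions (the design values, the uniform masking distribution, and all variable names are chosen for the example) and is not claimed to be the computation performed inside `RRcor`.

```r
set.seed(42)
n <- 200000
p_truth <- 0.7; m <- 0; s <- 3   # assumed design: P(truth), masking mean, masking SD
t_true <- rnorm(n, 2, 1)         # true (unobserved) continuous sensitive variable
y <- 0.5 * t_true + rnorm(n)     # directly measured covariate

# Masking values from a NON-normal distribution: uniform with mean m and SD s
mask <- runif(n, m - s * sqrt(3), m + s * sqrt(3))
answer_truth <- rbinom(n, 1, p_truth) == 1
resp <- ifelse(answer_truth, t_true, mask)

# First and second moments of the true variable, recovered from the mixture:
#   E[resp]   = p * E[T]   + (1 - p) * m
#   E[resp^2] = p * E[T^2] + (1 - p) * (s^2 + m^2)
mean_t  <- (mean(resp)   - (1 - p_truth) * m) / p_truth
mean_t2 <- (mean(resp^2) - (1 - p_truth) * (s^2 + m^2)) / p_truth
var_t   <- mean_t2 - mean_t^2

# The masking values are independent of y, so cov(resp, y) = p * cov(T, y)
r_corrected <- (cov(resp, y) / p_truth) / sqrt(var_t * var(y))

c(true = cor(t_true, y), naive = cor(resp, y), corrected = r_corrected)
```

Only `m` and `s` enter the correction, so the shape of the masking distribution is irrelevant for the point estimate in this sketch.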

Value

`RRcor` returns a list with the following components:

`r`	estimated correlation matrix

`rSE.p`, `rSE.n`	standard errors from the parametric/nonparametric bootstrap

`prob`	two-sided p-values from the parametric bootstrap

`samples.p`, `samples.n`	sampled correlations from the parametric/nonparametric bootstrap (used for the standard errors)

References

Fox, J. A., & Tracy, P. E. (1984). Measuring associations with randomized response. Social Science Research, 13, 188-197.

Examples

# generate first RR variable
n <- 1000
p1 <- c(.3, .7)
gData <- RRgen(n, pi.true = .3, model = "Kuk", p = p1)

# generate second RR variable
p2 <- c(.8, .5)
t2 <- rbinom(n = n, size = 1, prob = (gData$true + 1) / 2)
temp <- RRgen(model = "UQTknown", p = p2, trueState = t2)
gData$UQTresp <- temp$response
gData$UQTtrue <- temp$true

# generate continuous covariate
gData$cov <- rnorm(n, 0, 4) + gData$UQTtrue + gData$true

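# benchmark: correlations of the true states (not observable in practice)
cor(gData[, c("true", "cov", "UQTtrue")])

# RR-corrected correlations based on the observed responses;
# p.list omits the directly measured variable "cov"
RRcor(
  x = gData[, c("response", "cov", "UQTresp")],
  models = c("Kuk", "direct", "UQTknown"), p.list = list(p1, p2)
)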

danheck/RRreg documentation built on Dec. 3, 2022, 7:50 p.m.