draw_discrete: Draw discrete variables including binary, binomial count,...

In DeclareDesign/fabricatr: Imagine Your Data Before You Collect It

Description

Drawing discrete data based on probabilities or latent traits is a common task that can be cumbersome. Each function in our discrete drawing set creates a different type of discrete data: `draw_binary` creates binary 0/1 data, `draw_binomial` creates binomial data (repeated-trial binary data), `draw_categorical` creates categorical data, `draw_ordered` transforms latent data into observed ordered categories, and `draw_count` creates count data (Poisson-distributed). `draw_likert` is an alias for `draw_ordered` that pre-specifies break labels and offers default breaks appropriate for a Likert survey question.

Usage

```
draw_binomial(prob = link(latent), trials = 1, N = length(prob),
  latent = NULL, link = "identity", quantile_y = NULL)

draw_categorical(prob = link(latent), N = NULL, latent = NULL,
  link = "identity", category_labels = NULL)

draw_ordered(x = link(latent), breaks = c(-1, 0, 1), break_labels = NULL,
  N = length(x), latent = NULL, strict = FALSE, link = "identity")

draw_count(mean = link(latent), N = length(mean), latent = NULL,
  link = "identity", quantile_y = NULL)

draw_binary(prob = link(latent), N = length(prob), link = "identity",
  latent = NULL, quantile_y = NULL)

draw_likert(x, type = 7, breaks = NULL, N = length(x), latent = NULL,
  link = "identity", strict = !is.null(breaks))

draw_quantile(type, N)
```

Arguments

| Argument | Description |
| --- | --- |
| `prob` | A number or vector of numbers representing the probability for binary or binomial outcomes; or a number, vector, or matrix of numbers representing probabilities for categorical outcomes. If you supply a link function, these underlying probabilities will be transformed. |
| `trials` | For `draw_binomial`, the number of trials for each observation. |
| `N` | Number of units to draw. Defaults to the length of the vector of probabilities or latent data you provided. |
| `latent` | If you provide a `link` argument other than "identity", supply the variable as `latent` rather than as `prob` or `mean`. |
| `link` | Link function between the latent variable and the probability of a positive outcome, e.g. "logit", "probit", or "identity". For the "identity" link, the latent variable must be a probability. |
| `quantile_y` | A vector of quantiles; if provided, rather than drawing stochastically from the distribution of interest, data will be drawn at exactly those quantiles. |
| `category_labels` | Vector of labels for the categories produced by `draw_categorical`. If provided, its length must equal the number of categories provided in the `prob` argument. |
| `x` | For `draw_ordered` or `draw_likert`, the latent data for each observation. |
| `breaks` | Vector of breaks used to cut a latent outcome into ordered categories with `draw_ordered` or `draw_likert`. |
| `break_labels` | Optional vector of labels for the ordered categories produced by cutting a latent outcome with `draw_ordered`. |
| `strict` | Logical indicating whether values outside the provided breaks should be coded as NA. Defaults to `FALSE`, in which case additional breaks are effectively added between -Inf and the lowest break and between the highest break and Inf. |
| `mean` | For `draw_count`, the mean number of count units for each observation. |
| `type` | Type of Likert scale data for `draw_likert`. Valid options are 4, 5, and 7. Type corresponds to the number of categories in the Likert scale. |
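The `link` and `quantile_y` arguments are easier to follow with a concrete case. The following is a minimal base-R sketch of the two ideas, not fabricatr's implementation: it assumes that a "probit" link corresponds to the standard normal CDF, that a "logit" link corresponds to the logistic CDF, and that drawing "at exactly those quantiles" can be read as evaluating the distribution's quantile function. All variable names are illustrative.

```r
# Base-R sketch of the `link` and `quantile_y` ideas; this is not
# fabricatr's implementation, and all names below are illustrative.
set.seed(42)

latent <- c(-2, 0, 2)  # latent scores on an unbounded scale

# A probit link maps each latent score to a probability with the
# standard normal CDF; a logit link uses the logistic CDF instead.
prob_probit <- pnorm(latent)
prob_logit  <- plogis(latent)

# Stochastic draw: one Bernoulli trial per observation.
binary_draw <- rbinom(n = length(latent), size = 1, prob = prob_probit)

# Quantile-based draw (assumed reading of `quantile_y`): evaluate the
# binomial quantile function at a fixed quantile, so the result is
# deterministic given the probabilities.
quantile_y <- 0.5
binomial_at_median <- qbinom(quantile_y, size = 10, prob = prob_probit)
```

With an "identity" link no such transformation takes place, which is why the values supplied must already be valid probabilities in that case.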

Details

For variables with intra-cluster correlations, see `draw_binary_icc` and `draw_normal_icc`.

Value

A vector of data matching the specification. The result is generally numeric, but some functions, including `draw_ordered`, may return a factor if break labels are provided.
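To illustrate how breaks, the `strict` option, and break labels relate to the returned vector, here is a rough base-R sketch. It is not fabricatr's implementation: the treatment of values that fall exactly on a break is an assumption, and the labels and variable names are hypothetical.

```r
# Base-R sketch, not fabricatr's implementation; names are illustrative.
x      <- c(-3, -0.5, 0.5, 3)   # latent values
breaks <- c(-1, 0, 1)

# Non-strict behaviour: values below the lowest break or above the highest
# break still receive a category, as if -Inf and Inf were added as breaks.
padded   <- c(-Inf, breaks, Inf)
category <- findInterval(x, padded)

# Strict behaviour: values outside the supplied breaks become NA.
strict_category <- findInterval(x, breaks)
strict_category[x < min(breaks) | x >= max(breaks)] <- NA

# Attaching labels turns the numeric codes into a factor.
labelled <- factor(category, levels = 1:4,
                   labels = c("Low", "Medium-low", "Medium-high", "High"))
```

In this sketch `category` is an integer vector while `labelled` is a factor, mirroring the numeric-versus-factor distinction described above.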

Examples

```
library(fabricatr)

# Drawing binary values (success or failure, treatment assignment)
fabricate(N = 3,
          p = c(0, .5, 1),
          binary = draw_binary(prob = p))

# Drawing binary values with probit link (transforming continuous data
# into a probability range).
fabricate(N = 3,
          x = 10 * rnorm(N),
          binary = draw_binary(latent = x, link = "probit"))

# Repeated trials: `draw_binomial`
fabricate(N = 3,
          p = c(0, .5, 1),
          binomial = draw_binomial(prob = p, trials = 10))

# Ordered data: transforming latent data into observed, ordinal data.
# Useful for survey responses.
fabricate(N = 3,
          x = 5 * rnorm(N),
          ordered = draw_ordered(x = x,
                                 breaks = c(-Inf, -1, 1, Inf)))

# Providing break labels for latent data.
fabricate(N = 3,
          x = 5 * rnorm(N),
          ordered = draw_ordered(x = x,
                                 breaks = c(-Inf, -1, 1, Inf),
                                 break_labels = c("Not at all concerned",
                                                  "Somewhat concerned",
                                                  "Very concerned")))

# Likert data: often used for survey data
fabricate(N = 10,
          support_free_college = draw_likert(x = rnorm(N),
                                             type = 5))

# Count data: useful for rates of occurrences over time.
fabricate(N = 5,
          x = c(0, 5, 25, 50, 100),
          theft_rate = draw_count(mean = x))

# Categorical data: useful for demographic data.
fabricate(N = 6, p1 = runif(N), p2 = runif(N), p3 = runif(N),
          cat = draw_categorical(cbind(p1, p2, p3)))
```
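The categorical example passes a matrix of probabilities, with one row per observation and one column per category. As a rough illustration of that idea, the base-R sketch below draws one category per row. It is not how fabricatr does it internally, and `draw_one` is a hypothetical helper. Note that base R's `sample()` rescales its `prob` weights to sum to one; whether `draw_categorical` treats unnormalized rows the same way is not stated on this page.

```r
# Base-R sketch of per-row categorical draws; not fabricatr's
# implementation. `draw_one` is a hypothetical helper.
set.seed(1)

# One row per observation, one column per category.
prob <- rbind(c(0.8, 0.1, 0.1),
              c(0.1, 0.8, 0.1),
              c(0.0, 0.0, 1.0))

# Draw a single category index using one row of probabilities as weights.
draw_one <- function(p) sample(seq_along(p), size = 1, prob = p)

categories <- apply(prob, 1, draw_one)
```

The third row puts all of its weight on the last category, so that observation is always assigned category 3.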

DeclareDesign/fabricatr documentation built on May 6, 2019, 1:57 p.m.