View source: R/gen_gold_standard.R
add_dependent_error
Adds two columns of dependent error flags (each either 0 or 1)
to a data frame.
add_dependent_error(
  dataset,
  error_names,
  prior_probs = c(0.5, 0.5),
  cond_probs = c(0.95, 0.05, 0.85, 0.15)
)
dataset
A data frame containing the dataset.
error_names
A string giving the two variable names and the error type, in the form 'variable 1_variable 2_error type'. The error of variable 2 depends on the error of variable 1. The error type can be one of: 'missing', 'insert', 'variant', 'typo', 'pho', 'ocr', 'trans_date' or 'trans_char'.
prior_probs
A vector of two numerical probabilities, where the first is the prior probability of variable 1 being 0 (no error) and the second is the prior probability of variable 1 being 1 (having an error).
cond_probs
A vector of four numerical probabilities, where the first two are the probabilities of variable 2 being 0 and 1 given that variable 1 is 0, and the last two are the probabilities of variable 2 being 0 and 1 given that variable 1 is 1.
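To make the roles of prior_probs and cond_probs concrete, here is a minimal sketch of how two dependent 0/1 flags could be drawn from them. This is an illustration only, not the package's implementation: the helper name sample_dependent_flags and the column names flag1 and flag2 are assumptions made for this example.

```r
# Hypothetical helper, NOT part of the package: draws two dependent 0/1 flags.
sample_dependent_flags <- function(n,
                                   prior_probs = c(0.5, 0.5),
                                   cond_probs = c(0.95, 0.05, 0.85, 0.15)) {
  stopifnot(length(prior_probs) == 2, length(cond_probs) == 4)
  # Flag for variable 1: equals 1 with probability prior_probs[2].
  flag1 <- rbinom(n, size = 1, prob = prior_probs[2])
  # P(variable 2 = 1 | variable 1) is cond_probs[2] when flag1 is 0
  # and cond_probs[4] when flag1 is 1.
  p2 <- ifelse(flag1 == 0, cond_probs[2], cond_probs[4])
  flag2 <- rbinom(n, size = 1, prob = p2)
  data.frame(flag1 = flag1, flag2 = flag2)
}

# Marginal probability that variable 2 has an error under the default values:
# P(v2 = 1) = P(v1 = 0) * P(v2 = 1 | v1 = 0) + P(v1 = 1) * P(v2 = 1 | v1 = 1)
prior_probs <- c(0.5, 0.5)
cond_probs <- c(0.95, 0.05, 0.85, 0.15)
p_var2_error <- prior_probs[1] * cond_probs[2] + prior_probs[2] * cond_probs[4]

set.seed(1)
flags <- sample_dependent_flags(1000)
```

With the default values, the marginal error probability for variable 2 works out to 0.5 * 0.05 + 0.5 * 0.15 = 0.1, so roughly one row in ten would be flagged for variable 2 under this sketch.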
Returns the input data frame
with two additional columns of dependent,
binary-encoded error flags.
adult_with_flag <- add_dependent_error(adult[1:100, ], "race_sex_typo")
adult_with_flag <- add_dependent_error(adult[1:100, ], "age_sex_missing",
                                       prior_probs = c(0.99, 0.01),
                                       cond_probs = c(0.95, 0.05, 0.4, 0.6))
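The error_names strings in the examples, such as "age_sex_missing", pack two variable names and an error type into one underscore-separated value. The sketch below shows one way such a string could be split. It is not the package's parser: parse_error_names is a hypothetical helper, "dob" is a made-up column name, and the sketch assumes that neither variable name contains an underscore.

```r
# Hypothetical helper, NOT part of the package.
# Assumes the two variable names themselves contain no underscore.
parse_error_names <- function(error_names) {
  parts <- strsplit(error_names, "_", fixed = TRUE)[[1]]
  stopifnot(length(parts) >= 3)
  list(
    var1 = parts[1],
    var2 = parts[2],
    # Rejoin the remainder so that 'trans_date' and 'trans_char' stay intact.
    error_type = paste(parts[-(1:2)], collapse = "_")
  )
}

parsed <- parse_error_names("age_sex_missing")
# "dob" is a made-up column name used only to show a two-part error type.
parsed_trans <- parse_error_names("dob_age_trans_date")
```

Here parsed holds var1 = "age", var2 = "sex" and error_type = "missing", while parsed_trans keeps "trans_date" as a single error type.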