add_random_error: Add random error flags to a data frame.


View source: R/gen_gold_standard.R

Description

add_random_error adds a column of binary error flags (0 or 1) to a data frame, with each flag drawn at random according to the probabilities in prob.
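The flag-generation idea can be illustrated with a minimal base R sketch. This is not the sdglinkage source: the function name sketch_add_flag, its flag_name argument, and the name of the resulting column are all hypothetical, and the sketch only assumes that each row receives an independent 0/1 draw weighted by prob.

```r
# Hypothetical sketch, NOT the sdglinkage implementation.
# Assumption: every row gets an independent 0/1 flag, where prob[1] is the
# probability of 0 and prob[2] is the probability of 1.
sketch_add_flag <- function(dataset, flag_name, prob = c(0.95, 0.05)) {
  stopifnot(is.data.frame(dataset), length(prob) == 2, all(prob >= 0))
  # Draw one flag per row, with replacement, weighted by prob.
  dataset[[flag_name]] <- sample(c(0, 1), size = nrow(dataset),
                                 replace = TRUE, prob = prob)
  dataset
}

set.seed(1)  # make the random draw reproducible
df <- data.frame(age = c(23, 45, 31, 60, 52))
flagged <- sketch_add_flag(df, "age_missing_flag", prob = c(0.7, 0.3))
```

The result keeps the original columns and gains one flag column, so the sketch can be applied repeatedly to attach several flags to the same data frame.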

Usage

add_random_error(dataset, error_name, prob = c(0.95, 0.05))

Arguments

dataset

A data frame containing the dataset.

error_name

A string combining the name and type of the error, in the form 'error name_error type' (for example, 'age_missing'). The error name should be one of the variable names in the dataset, and the error type must be one of: 'missing', 'insert', 'variant', 'typo', 'pho', 'ocr', 'trans_date' or 'trans_char'.

prob

A vector of two numeric probabilities, where the first is the probability of a flag being 0 and the second is the probability of a flag being 1.
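Because two of the error types ('trans_date' and 'trans_char') contain an underscore themselves, an error_name string cannot be split reliably on its last underscore. The sketch below shows one way to separate the variable part from the type part by matching known type suffixes. The helper split_error_name is hypothetical and is not part of sdglinkage; how the package itself parses error_name is not stated here.

```r
# Hypothetical helper, NOT part of sdglinkage: split an error_name string
# into its variable part and its error-type part.
error_types <- c("missing", "insert", "variant", "typo", "pho", "ocr",
                 "trans_date", "trans_char")

split_error_name <- function(error_name, types = error_types) {
  # Keep only the types that appear as a "_<type>" suffix of error_name.
  hits <- types[endsWith(error_name, paste0("_", types))]
  if (length(hits) == 0) {
    stop("error_name must end in one of: ", paste(types, collapse = ", "))
  }
  # If several suffixes match, prefer the longest one.
  type <- hits[which.max(nchar(hits))]
  # Drop the type and the underscore that precedes it.
  variable <- substr(error_name, 1, nchar(error_name) - nchar(type) - 1)
  list(variable = variable, type = type)
}

simple <- split_error_name("age_missing")
nested <- split_error_name("birth_date_trans_date")
```

Here simple$variable is the variable name and simple$type the error type; the second call shows that a variable name containing underscores is kept whole.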

Value

The input data frame with an additional column of binary-encoded error flags (0 or 1).

Examples

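# Flag roughly 3% of the first 100 rows of adult with a 'missing' error on age
adult_with_flag <- add_random_error(adult[1:100, ], "age_missing", prob = c(0.97, 0.03))
# Add a second flag column: roughly 35% of rows get a 'typo' error on education
adult_with_flag <- add_random_error(adult_with_flag, "education_typo", prob = c(0.65, 0.35))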

sdglinkage documentation built on April 27, 2020, 5:09 p.m.