perturb_dataset: Perturb a boolean matrix

perturb_dataset    R Documentation

Perturb a boolean matrix

Description

Given a boolean matrix, randomly add false positives (FP), false negatives (FN), and missing data at user-defined rates. In the returned matrix, missing data is represented by the value 3.

Usage

perturb_dataset(dataset, FP_rate = 0, FN_rate = 0, MIS_rate = 0)

Arguments

dataset

a boolean matrix or sparse matrix

FP_rate

False Positive rate

FN_rate

False Negative rate

MIS_rate

Missing Data rate
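
One way to read the three rate arguments is sketched below in base R. This is an illustrative assumption, not CIMICE's actual implementation: the helper name perturb_sketch, its lower-case parameter names, and the choice to treat each rate as an independent per-entry probability are all hypothetical.

```r
# Hypothetical helper: NOT the CIMICE implementation, only a sketch of
# what per-entry FP / FN / missing rates could mean on a dense 0/1 matrix.
perturb_sketch <- function(m, fp_rate = 0, fn_rate = 0, mis_rate = 0) {
  stopifnot(is.matrix(m), all(m %in% c(0, 1)))

  # One uniform draw per entry decides whether that entry is flipped.
  u <- matrix(runif(length(m)), nrow = nrow(m))

  out <- m
  out[m == 0 & u < fp_rate] <- 1   # false positives: 0 becomes 1
  out[m == 1 & u < fn_rate] <- 0   # false negatives: 1 becomes 0

  # A second, independent draw decides which entries become missing (3).
  miss <- matrix(runif(length(m)), nrow = nrow(m)) < mis_rate
  out[miss] <- 3
  out
}

# Toy input: 3 samples (rows) by 2 genes (columns).
toy <- matrix(c(1, 0,
                0, 1,
                1, 1), nrow = 3, byrow = TRUE)

set.seed(1)
perturb_sketch(toy, fp_rate = 0.1, fn_rate = 0.1, mis_rate = 0.2)
```

With all rates at 0 the sketch returns its input unchanged, which matches the defaults shown in Usage.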

Details

Note that CIMICE does not natively support datasets with missing data, so a dataset perturbed with MIS_rate != 0 requires pre-processing before further analysis.
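
Which pre-processing to apply is left to the user. As a minimal sketch, assuming a dense matrix in which 3 marks missing entries, two simple options are to drop incomplete samples or to impute missing values as 0. The toy matrix and variable names below are illustrative only, and a sparse matrix may need its own handling.

```r
# Toy perturbed matrix: 0/1 are observed values, 3 marks missing data.
perturbed <- matrix(
  c(1, 0, 3,
    0, 1, 1,
    3, 3, 0,
    1, 1, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("S", 1:4), c("A", "B", "C"))
)

# Option 1: keep only the samples (rows) with no missing entries.
complete_rows <- perturbed[rowSums(perturbed == 3) == 0, , drop = FALSE]

# Option 2: impute missing entries as 0 (treat "unknown" as "absent").
imputed <- perturbed
imputed[imputed == 3] <- 0
```

Dropping rows shrinks the dataset, while imputing with 0 adds a bias toward absent mutations; which trade-off is acceptable depends on the analysis.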

Value

the new, perturbed matrix
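
To check how much noise a returned matrix actually contains, it can be compared entry by entry with the original. The sketch below uses two small hand-written matrices rather than real CIMICE output; the variable names are hypothetical, and the fractions follow one possible convention (FP relative to original zeros, FN relative to original ones).

```r
# Hand-written toy matrices standing in for the input and the result.
original  <- matrix(c(1, 0, 0,
                      1, 1, 0), nrow = 2, byrow = TRUE)
perturbed <- matrix(c(1, 1, 3,
                      0, 1, 0), nrow = 2, byrow = TRUE)

# Share of entries marked as missing (value 3).
missing_fraction <- mean(perturbed == 3)

# Share of original zeros that now read 1, and of original ones that now read 0.
fp_fraction <- sum(original == 0 & perturbed == 1) / sum(original == 0)
fn_fraction <- sum(original == 1 & perturbed == 0) / sum(original == 1)

# Counts of each value in the perturbed matrix.
table(perturbed)
```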

Examples

library(CIMICE)
library(dplyr)  # provides the %>% pipe

example_dataset() %>%
  make_generator_stub() %>% 
  set_generator_edges(
    list(
      "D", "A, D", 1 , 
      "A", "A, D", 1 , 
      "A, D", "A, C, D", 1 , 
      "A, D", "A, B, D", 1 , 
      "Clonal", "D", 1 , 
      "Clonal", "A", 1 , 
      "D", "D", 1 , 
      "A", "A", 1 , 
      "A, D", "A, D", 1 , 
      "A, C, D", "A, C, D", 1 , 
      "A, B, D", "A, B, D", 1 , 
      "Clonal", "Clonal", 1 
  )) %>% 
  finalize_generator() %>%
  simulate_generator(3, 10) %>% 
  perturb_dataset(FP_rate = 0.01, FN_rate = 0.1, MIS_rate = 0.12)


redsnic/CIMICE documentation built on March 30, 2022, 2:46 a.m.