simulatedefect: Defect existing data.

View source: R/simulateData.R

Description

Interprets the defect rule and applies it to the true dataset. Note that simulatedefect is generic and can defect any existing dataset according to the rule.

Usage

simulatedefect(truedata, name, subset, prob, damage)

Arguments

truedata

data.frame containing the un-defected data.

name

character. Specifies the name of the variable to be defected.

subset

formula. States a condition (e.g. ~x1 > 0.6) that selects the observations eligible to be defected. Note that if 'subset' does not exclusively use the variable given by 'name', the independence assumption of MICE is not met (on purpose).

prob

numeric value. Specifies the binomial probability for each observation in 'subset' to be defected.

damage

User-defined. Specifies the type of defect and how it is applied. 'damage' = NA generates missing data. A single value in [0, 1] implies right censoring (e.g. 'damage' = 1/3), a value in [1,...] left censoring; the true value of 'name' is multiplied by this value to defect the data. The generalization to fixed interval factors is 'damage' = list(1/3, 4/3), where the values specify the factors for the lower and the upper bound, respectively.

More realistic examples can be generated with a vector-valued 'damage': if 'damage' = c(0.1, 1) is a vector of length 2, it specifies the min and max of a uniform distribution from which a factor is drawn at random for each observation; the true value is multiplied by that factor. The generalization to random interval factors is 'damage' = list(c(0.2, 1), c(1, 3)), where the first vector specifies the uniform interval for factors affecting the lower bound and the second the interval for factors affecting the upper bound.

NOTE: if a list is provided, both members must be either vectors or single values.
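How 'subset', 'prob' and 'damage' interact can be illustrated with a minimal base-R sketch. This is not the package's implementation: the data frame, its column names and all object names below are assumptions chosen for illustration, and only the scalar, vector and NA forms of 'damage' are covered.

```r
set.seed(1)

# Hypothetical un-defected data; column names are illustrative only.
truedata <- data.frame(x1 = runif(200), y = rexp(200))

# 'subset': a one-sided formula; its right-hand side is evaluated in the data.
subset <- ~ x1 > 0.6
eligible <- eval(subset[[2]], envir = truedata)

# 'prob': each eligible observation is defected with this binomial probability.
prob <- 0.5
defected <- eligible & (rbinom(nrow(truedata), size = 1, prob = prob) == 1)

# 'damage' as a single factor below 1 (right censoring, as described above).
y_fixed <- truedata$y
y_fixed[defected] <- truedata$y[defected] * (1 / 3)

# 'damage' as c(min, max): one uniform factor is drawn per defected observation.
damage <- c(0.1, 1)
factors <- runif(sum(defected), min = damage[1], max = damage[2])
y_random <- truedata$y
y_random[defected] <- truedata$y[defected] * factors

# 'damage' = NA: defected observations become missing.
y_missing <- truedata$y
y_missing[defected] <- NA
```

Observations outside 'subset' are never touched in this sketch, and within 'subset' roughly a share 'prob' of them is defected.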

Value

list. The elements are the defected data frame, an indicator vector specifying which observations were defected, and an atomic description of the censoring type, which the user implicitly defines through 'damage'.
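The shape of such a return value might look like the following sketch. The helper sketch_defect, the element names (defected_data, indicator, censtype) and the type labels are hypothetical and are not taken from the package; the sketch handles only a single-valued 'damage' and assumes that a factor of exactly 1 counts as left censoring.

```r
# Hypothetical helper; names and labels are illustrative, not the package's.
sketch_defect <- function(truedata, name, defected, damage) {
  out <- truedata
  if (is.na(damage)) {
    # NA damage: defected observations become missing.
    out[defected, name] <- NA
    type <- "missing"
  } else {
    # Scalar damage: multiply the true value by the factor.
    out[defected, name] <- truedata[defected, name] * damage
    type <- if (damage < 1) "right" else "left"
  }
  list(defected_data = out, indicator = defected, censtype = type)
}

truedata <- data.frame(x1 = c(0.2, 0.7, 0.9), y = c(3, 6, 9))
res <- sketch_defect(truedata, "y",
                     defected = c(FALSE, TRUE, TRUE), damage = 1 / 3)
str(res)
```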


TiStat/Imputegamlss documentation built on May 20, 2019, 9:25 a.m.