simulatedefect: Defect existing data.

Description Usage Arguments Value

View source: R/simulateData.R


Description

Interprets the defect rule (given by 'subset', 'prob' and 'damage') and applies it to the true dataset. Note that simulatedefect is generic and can defect any existing dataset according to the rule.


Usage

simulatedefect(truedata, name, subset, prob, damage)



Arguments

truedata: data.frame containing the un-defected data.


name: character. Specifies the name of the variable to be defected.


subset: formula. States a condition (e.g. ~x1 > 0.6) that selects the observations eligible to be defected. Note that if 'subset' does not exclusively use the variable given in 'name', the independence assumption of MICE is not met (on purpose).


prob: numeric value. Specifies the binomial probability with which each observation in 'subset' is defected.


damage: User-defined; specifies the type of defect and how the data are defected. 'damage' = NA generates missing data. A single value in [0, 1] implies right censoring (e.g. 'damage' = 1/3), a single value in [1, ...) left censoring; the true value of 'name' is multiplied by this value to defect the data. The generalization to fixed interval factors is 'damage' = list(1/3, 4/3), where the values specify the factors for the lower and the upper bound, respectively. More realistic examples can be generated with a vector-valued 'damage': if 'damage' = c(0.1, 1) is a vector of length 2, it specifies the min and max of a uniform distribution from which a factor is drawn at random for each observation and multiplied with the true value. The generalization to random interval factors is 'damage' = list(c(0.2, 1), c(1, 3)), where the first vector specifies the uniform interval for factors affecting the lower bound and the second the interval for factors affecting the upper bound. NOTE: if a list is provided, both members must be either vectors or single values.
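
The way 'subset', 'prob' and 'damage' work together can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helpers select_defected, draw_factor and defect_values are hypothetical names introduced here, the toy data are made up, and evaluating the right-hand side of the one-sided formula with eval() is only one plausible way to interpret the condition.

```r
# Hypothetical helpers for illustration; none of them is exported by Imputegamlss.

# Pick the observations to defect: those satisfying 'subset',
# each one independently with probability 'prob'.
select_defected <- function(truedata, subset, prob) {
  # In a one-sided formula such as ~x1 > 0.6, element 2 holds the condition.
  eligible <- eval(subset[[2]], envir = truedata, enclos = environment(subset))
  eligible <- !is.na(eligible) & eligible
  eligible & (rbinom(nrow(truedata), size = 1, prob = prob) == 1)
}

# One multiplicative factor per observation:
# fixed (spec of length 1) or uniform on [spec[1], spec[2]] (spec of length 2).
draw_factor <- function(spec, n) {
  if (length(spec) == 1) rep(spec, n) else runif(n, min = spec[1], max = spec[2])
}

# Multiply the selected true values by factors drawn according to 'spec'.
defect_values <- function(y, idx, spec) {
  replace(y, idx, y[idx] * draw_factor(spec, sum(idx)))
}

set.seed(42)
toy <- data.frame(x1 = runif(100), y = rexp(100, rate = 0.1))  # made-up data
idx <- select_defected(toy, ~x1 > 0.6, prob = 0.5)

y_missing <- replace(toy$y, idx, NA)               # damage = NA
y_right   <- defect_values(toy$y, idx, 1/3)        # damage = 1/3
y_random  <- defect_values(toy$y, idx, c(0.1, 1))  # damage = c(0.1, 1)

# damage = list(c(0.2, 1), c(1, 3)): random factors for lower and upper bound.
damage   <- list(c(0.2, 1), c(1, 3))
interval <- data.frame(
  lower = defect_values(toy$y, idx, damage[[1]]),
  upper = defect_values(toy$y, idx, damage[[2]])
)
```

In this sketch a factor below 1 leaves a value that is a lower bound on the truth (right censoring), and the list form brackets each selected true value between 'lower' and 'upper'; untouched observations keep their true value in every variant.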


Value

list. Its elements are the defected data.frame, an indicator vector specifying which observations were defected, and an atomic description of the censoring type, which the user implicitly defined through 'damage'.

TiStat/Imputegamlss documentation built on May 20, 2019, 9:25 a.m.