simulatemissing: Artificial simulation of various kinds of missings/polluted...
In compositions: Compositional Data Analysis

simulatemissings

R Documentation

Artificial simulation of various kinds of missings/polluted data

Description

These simulation mechanisms are meant for checking that techniques for handling missing values behave sensibly. They generate additional missing values of the various types in a given dataset, according to a specific process.

Usage

simulateMissings(x, dl=NULL, knownlimit=FALSE,
     MARprob=0.0, MNARprob=0.0, mnarity=0.5, SZprob=0.0)
observeWithAdditiveError(x, sigma=dl/dlf, dl=sigma*dlf, dlf=3,
     keepObs=FALSE, digits=NA, obsScale=1,
     class="acomp")

Arguments

`x`	the dataset in which the missing values are to be generated
`dl`	the detection limit, as described in `clo`, used to impose an artificial detection limit
`knownlimit`	a boolean indicating whether the actual detection limit is still known in the dataset
`MARprob`	the probability of occurrence of 'Missing At Random' (MAR) values
`MNARprob`	the probability of occurrence of 'Missing Not At Random' (MNAR) values. Small values tend to have a higher probability of going missing.
`mnarity`	a number between 0 and 1 giving the strength of the influence of the actual value on its becoming a MNAR. 0 means MAR-like behavior and 1 means that only the smallest values are lost.
`SZprob`	the probability to obtain a structural zero. This is done at random like a MAR.
`sigma`	the standard deviation of the normally distributed extra additive error
`dlf`	the distance from 0, in multiples of `sigma`, below which a datum is considered below detection limit (BDL); see Details
`keepObs`	should the (closed) data without additive error be returned as an attribute?
`digits`	number of digits kept when rounding the data with additive error at the scale `obsScale` (see Details)
`obsScale`	observation scale at which the data with additive error are rounded (see Details). Should be a power of 10.
`class`	class of the output object

Details

With the default arguments, no missing values are generated. The procedure that generates MNAR values affects all variables.

Function "simulateMissings" is a multipurpose simulator, where each class of missing value is treated separately, and where detection limits are specified as thresholds.
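The difference between a value-independent loss (MAR) and a threshold-based detection limit can be illustrated with a minimal base R sketch. This is not the code used by simulateMissings; the toy data and the names p, mar, bdl and x_mar are assumptions made only for illustration, and NA stands in for the package's own coding of missing values.

```r
set.seed(1)

# Toy 3-part dataset of positive amounts (illustrative, not package data)
x <- matrix(rlnorm(300), ncol = 3)
x <- x / rowSums(x)                 # close every row to sum 1

# MAR: each cell is lost with probability p, independently of its value
p   <- 0.05
mar <- matrix(runif(length(x)) < p, nrow = nrow(x))

# Detection limit as a threshold: cells smaller than dl count as BDL
dl  <- 0.05
bdl <- x < dl

# Mark the MAR cells; NA is only a stand-in for the package's coding
x_mar <- x
x_mar[mar] <- NA

mean(mar)   # fraction of MAR cells, close to p
mean(bdl)   # fraction of cells below the detection limit
```

The MAR mask ignores the data entirely, whereas the BDL mask is a deterministic function of the values. MNAR sits between the two, with `mnarity` controlling how strongly the value influences the loss.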

Function "observeWithAdditiveError" simulates data within a very specific framework: an additive error with sd=sigma is added to the input data x, and a BDL is generated whenever a datum is less than dlf times sigma. Afterwards, the resulting data are rounded as round(data/obsScale,digits)*obsScale, i.e. a certain observation scale obsScale is chosen and, at that scale, only some digits are kept. This framework is typical of chemical analyses, and it generates both BDLs and pollution/rounding of (apparently) "right" data.
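The three steps of this framework (additive error, detection limit, rounding) can be traced on a few numbers with base R. This is a sketch under stated assumptions and not the implementation of observeWithAdditiveError: the toy values are invented, the detection limit is assumed to be checked on the value after the error has been added, and NA is used only as a stand-in for the package's own BDL coding (see compositions.missings).

```r
set.seed(42)

sigma    <- 0.01          # sd of the additive error
dlf      <- 3             # detection limit factor
dl       <- sigma * dlf   # detection limit, as in the Usage defaults
obsScale <- 0.01          # observation scale, a power of 10
digits   <- 0             # digits kept at that scale

x <- c(0.005, 0.02, 0.137, 0.4, 0.438)   # toy "true" values (assumed)

# Step 1: add normally distributed error
y <- x + rnorm(length(x), sd = sigma)

# Step 2: flag values below dlf * sigma (assumption: checked after the error)
bdl <- y < dl

# Step 3: round at the observation scale, using the formula from Details
y_round <- round(y / obsScale, digits) * obsScale
y_round[bdl] <- NA        # stand-in for the package's BDL coding

# Deterministic check of the rounding rule alone
r <- round(0.1374 / obsScale, digits) * obsScale
all.equal(r, 0.14)        # TRUE
```

With obsScale=0.01 and digits=0, every reported value is a multiple of 0.01, which is the "pollution" that the rounding introduces even into values well above the detection limit.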

Value

A dataset like x but with some additional missings.

Author(s)

K. Gerald van den Boogaart

References

van den Boogaart, K., R. Tolosana-Delgado and M. Bren (2011) The Compositional Meaning of a Detection Limit. In Proceedings of the 4th International Workshop on Compositional Data Analysis.

van den Boogaart, K.G., R. Tolosana-Delgado and M. Templ (2014) Regression with compositional response having unobserved components or below detection limit values. Statistical Modelling (in press).

See compositions.missings for more details.

Examples

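library(compositions)
data(SimulatedAmounts)
x <- acomp(sa.lognormals)
# impose a detection limit of 0.05 and add MAR, MNAR and structural zeros,
# each with probability 0.05 (full argument names instead of partial matching)
xnew <- simulateMissings(x, dl=0.05, MARprob=0.05, MNARprob=0.05, SZprob=0.05)
acomp(xnew)
plot(missingSummary(xnew))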

compositions documentation built on June 22, 2024, 12:15 p.m.