# EAimp: Epidemic Algorithm for imputation of multivariate outliers in incomplete survey data

## Description

After running `EAdet`, the detected outliers may be imputed with `EAimp`.

## Usage

```r
EAimp(data, weights, outind, reach = "max",
      transmission.function = "root", power = ncol(data),
      distance.type = "euclidean", duration = 5, maxl = 5,
      kdon = 1, monitor = FALSE, threshold = FALSE,
      deterministic = TRUE, fixedprop = 0)
```

## Arguments

| Argument | Description |
|---|---|
| `data` | a data frame or matrix with the data |
| `weights` | a vector of positive sampling weights |
| `outind` | a logical vector with component `TRUE` for outliers |
| `reach` | reach of the threshold function (usually set to the maximum distance to a nearest neighbour, see internal function `.EA.dist`) |
| `transmission.function` | form of the transmission function of distance `d`: `"step"` is a Heaviside function which jumps to `1` at `d0`, `"linear"` is linear between `0` and `d0`, `"power"` is `(beta*d+1)^(-p)` with `p=ncol(data)` as default, `"root"` is the function `1-(1-d/d0)^(1/maxl)` |
| `power` | sets `p=power`, where `p` is the parameter in the `"power"` transmission function above |
| `distance.type` | distance type in function `dist()` |
| `maxl` | maximum number of steps without infection |
| `monitor` | if `TRUE`, verbose output on the epidemic |
| `threshold` | infect all remaining points with infection probability above the threshold `1-0.5^(1/maxl)` |
| `deterministic` | if `TRUE`, the number of infections is the expected number and the infected observations are the ones with the largest infection probabilities |
| `duration` | the duration of the detection epidemic |
| `kdon` | the number of donors that should be infected before imputation |
| `fixedprop` | if `TRUE`, a fixed proportion of observations is infected at each step |
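The two transmission functions that are written out explicitly above, and the cutoff used by `threshold`, can be evaluated directly. The sketch below only illustrates those formulas and is not the package's internal code. The helper names `root_tf`, `power_tf` and `infection_threshold` are hypothetical, `beta = 1` and `p = 2` are arbitrary illustrative values (the text does not define `beta`), and the `"root"` form is evaluated only for `0 <= d <= d0`.

```r
# Hypothetical helpers that evaluate the formulas from the argument list.
# They are illustrations only, not functions exported by the package.
root_tf <- function(d, d0, maxl = 5) 1 - (1 - d / d0)^(1 / maxl)
power_tf <- function(d, beta = 1, p = 2) (beta * d + 1)^(-p)  # beta is assumed
infection_threshold <- function(maxl = 5) 1 - 0.5^(1 / maxl)

# Evaluate on a small grid of distances between 0 and d0 = 1.
d <- seq(0, 1, by = 0.25)
root_vals <- root_tf(d, d0 = 1, maxl = 5)
power_vals <- power_tf(d, beta = 1, p = 2)

# Cutoff used when threshold = TRUE, for the default maxl = 5.
thr <- infection_threshold(maxl = 5)
print(data.frame(d = d, root = root_vals, power = power_vals))
print(thr)
```

For `maxl = 5` the threshold is roughly 0.129. By construction `(1 - thr)^maxl` equals 0.5, and the `"root"` formula takes exactly this value at `d = d0/2`.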

## Details

`EAimp` uses the distances calculated in `EAdet` (actually the counterprobabilities, which are stored in a global data set) and starts an epidemic at each observation to be imputed until donors for the missing values are infected. Then a donor is selected randomly.
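The donor step described above can be pictured with a deliberately simplified sketch. It is not the package's implementation: the stochastic epidemic is replaced by a plain nearest-neighbour ordering, so the `kdon` closest eligible records stand in for the "infected" donors. It also assumes that every variable of the recipient is replaced. The function name `impute_from_donor` and the toy data are hypothetical.

```r
# Toy stand-in for the epidemic: candidate donors are reached in order of
# Euclidean distance from the record to be imputed.
impute_from_donor <- function(x, outind, recipient, kdon = 1) {
  x <- as.matrix(x)
  # Distance from the recipient to every record, summed over the variables
  # that are observed in both records.
  d <- sqrt(rowSums(sweep(x, 2, x[recipient, ])^2, na.rm = TRUE))
  # Eligible donors: complete records that are neither outliers nor the
  # recipient itself.
  eligible <- setdiff(which(!outind & complete.cases(x)), recipient)
  if (length(eligible) == 0) stop("no eligible donor")
  # The kdon closest eligible records play the role of the infected donors.
  reached <- eligible[order(d[eligible])][seq_len(min(kdon, length(eligible)))]
  # Pick one donor at random; sample.int() avoids sample()'s special
  # treatment of a length-one numeric vector.
  donor <- reached[sample.int(length(reached), 1)]
  x[recipient, ] <- x[donor, ]
  list(donor = donor, imputed = x)
}

# Three ordinary records and one flagged outlier (row 4).
set.seed(1)
x <- rbind(c(1, 1), c(1.1, 0.9), c(0.9, 1.2), c(10, -5))
outind <- c(FALSE, FALSE, FALSE, TRUE)
res <- impute_from_donor(x, outind, recipient = 4, kdon = 2)
print(res$donor)
print(res$imputed)
```

With `kdon = 2` the two non-outlying records closest to row 4 (rows 2 and 1) are the candidates, and one of them is copied into row 4.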

## Value

`EAimp` returns a list with components `parameters` and `imputed.data`.

`parameters` contains the following components:

| Component | Description |
|---|---|
| `sample.size` | Number of observations |
| `number.of.variables` | Number of variables |
| `n.complete.records` | Number of records without missing values |
| `n.usable.records` | Number of records with less than half of values missing (unusable observations are discarded) |
| `duration` | Duration of epidemic |
| `reach` | Transmission distance (`d0`) |
| `threshold` | Input parameter |
| `deterministic` | Input parameter |
| `computation.time` | Elapsed computation time |

`imputed.data` contains the imputed data.

## Author(s)

Beat Hulliger

## References

Béguin, C., and Hulliger, B. (2004). Multivariate outlier detection in incomplete survey data: The epidemic algorithm and transformed rank correlations. Journal of the Royal Statistical Society, Series A, 167(Part 2), 275-294.

## See Also

`EAdet` for outlier detection with the Epidemic Algorithm.
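
## Examples

```r
# Load the example data and its sampling weights
data(bushfirem, bushfire.weights)
# Detect outliers, then impute them using the detection results
det.res <- EAdet(bushfirem, bushfire.weights)
imp.res <- EAimp(bushfirem, bushfire.weights, outind = det.res$outind,
                 reach = det.res$output$max.min.di, kdon = 3)
# Print the summary component documented under Value
print(imp.res$parameters)
```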