In EAdet an epidemic is started at a center of the data. The epidemic spreads out and infects neighbouring points (probabilistically or deterministically). The last points infected are outliers. After running EAdet an imputation with EAimp may be run.
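The idea can be illustrated with a toy deterministic epidemic in base R. This is a simplified sketch, not the implementation used by EAdet: the data, the fixed value of `reach`, the choice of the point closest to the coordinate-wise median as the start, and the rule "infect everything within `reach` of an infected point" are all assumptions made for illustration.

```r
# Toy data: a tight cluster plus one distant point (hypothetical values)
x <- rbind(c(0, 0), c(1, 0), c(0, 1), c(1, 1), c(2, 1), c(10, 10))
d <- as.matrix(dist(x))   # pairwise Euclidean distances
reach <- 1.5              # hypothetical transmission distance

# Start the epidemic at the point closest to the coordinate-wise median
center <- apply(x, 2, median)
start <- which.min(rowSums(sweep(x, 2, center)^2))

infection.time <- rep(NA_integer_, nrow(x))
infection.time[start] <- 0L
step <- 0L
repeat {
  step <- step + 1L
  infected <- !is.na(infection.time)
  # Uninfected points within reach of at least one infected point
  exposed <- !infected & apply(d[, infected, drop = FALSE] <= reach, 1, any)
  if (!any(exposed)) break   # the epidemic has stopped spreading
  infection.time[exposed] <- step
}

# Points infected late, or never (NA), are outlier candidates
never.infected <- which(is.na(infection.time))
```

In this sketch the distant point is never reached, so it is the only outlier candidate; EAdet additionally handles sampling weights, missing values, and probabilistic transmission.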
data | a data frame or matrix with the data
weights | a vector of positive sampling weights
reach | if
transmission.function | form of the transmission function of distance
power | sets
distance.type | distance type in function
maxl | Maximum number of steps without infection
plotting | if
monitor | if
prob.quantile | If the mads fail, use this quantile of the absolute deviations instead
random.start | If
fix.start | Force epidemic to start at a specific observation
threshold | Infect all remaining points with infection probability above the threshold
deterministic | if
rm.missobs | Set
verbose | More output with
The form and parameters of the transmission function should be chosen such that the infection times span a range of at least 10. The default cutpoint for declaring outliers is the median infection time plus three times the mad of the infection times. A better cutpoint may be chosen by visual inspection of the cdf of the infection times.
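The default rule described above can be reproduced by hand as a rough check. The sketch below uses made-up infection times; in practice they would come from the infection.time component returned by EAdet. Note that stats::mad() rescales by 1.4826 by default and ignores sampling weights, so it is an assumption that this matches the cutpoint reported by EAdet exactly.

```r
# Hypothetical infection times (stand-in for the infection.time component)
infection.time <- c(1, 2, 2, 3, 3, 3, 4, 4, 5, 12)

# The infection times should span a range of at least 10
time.range <- diff(range(infection.time))

# Median plus three times the MAD (unweighted; scaling is an assumption)
cutpoint <- median(infection.time) + 3 * mad(infection.time)

# Observations infected later than the cutpoint are flagged as outliers
outliers <- which(infection.time > cutpoint)
```

For the visual inspection, `plot(ecdf(infection.time))` draws the empirical CDF; a long flat stretch before the last jumps suggests where a better cutpoint could be placed.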
EAdet calls the function EA.dist, which passes the counterprobabilities of infection (a vector of size n*(n-1)/2!) and three parameters (sample spatial median index, maximal distance to nearest neighbor, and transmission distance = reach) as arguments to EA.det. The distances vector may be too large to be passed as an argument; in that case the memory size must be increased. Former versions of the code used a global variable to store the distances in order to save memory.
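To judge in advance whether the vector of size n*(n-1)/2 will fit into memory, a back-of-the-envelope calculation may help. The helper names below are hypothetical, and the figure of 8 bytes per value assumes the vector is stored as R doubles.

```r
# Number of unordered pairs among n observations
pair.count <- function(n) n * (n - 1) / 2

# Approximate size in GiB, assuming 8 bytes per value (R doubles)
pair.size.gib <- function(n) pair.count(n) * 8 / 1024^3

n <- c(1000, 10000, 50000)
sizes <- data.frame(n = n, pairs = pair.count(n),
                    GiB = round(pair.size.gib(n), 2))
print(sizes)

# Sanity check against stats::dist(), which stores the same lower triangle
small <- matrix(rnorm(5 * 3), nrow = 5)
same.length <- length(dist(small)) == pair.count(5)
```

Because the size grows quadratically in n, doubling the number of observations roughly quadruples the memory needed for the vector.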
EAdet returns a list whose first component output is a sub-list with the following components:
sample.size | Number of observations
discarded.observations | Indices of discarded observations
missing.observations | Indices of completely missing observations
number.of.variables | Number of variables
n.complete.records | Number of records without missing values
n.usable.records | Number of records with less than half of values missing (unusable observations are discarded)
medians | Component-wise medians
mads | Component-wise mads
prob.quantile | Use this quantile if mads fail, i.e. if one of the mads is 0.
quantile.deviations | Quantile of absolute deviations.
start | Starting observation
transmission.function | Input parameter
power | Input parameter
maxl | Maximum number of steps without infection
min.nn.dist | maximal nearest neighbor distance
transmission.distance |
threshold | Input parameter
distance.type | Input parameter
deterministic | Input parameter
number.infected | Number of infected observations
cutpoint | Cutpoint of infection times for outlier definition
number.outliers | Number of outliers
outliers | Indices of outliers
duration | Duration of epidemic
computation.time | Elapsed computation time
initialisation.computation.time | Elapsed computation time for standardisation and calculation of distance matrix
The further components returned by EAdet are:
infected | Indicator of infection
infection.time | Time of infection
outind | Indicator of outliers
Beat Hulliger
Béguin, C., and Hulliger, B. (2004). Multivariate outlier detection in incomplete survey data: The epidemic algorithm and transformed rank correlations. Journal of the Royal Statistical Society, A 167(Part 2), 275-294.
EAimp for imputation with the Epidemic Algorithm.
data(bushfirem, bushfire.weights)
det.res <- EAdet(bushfirem, bushfire.weights)
print(det.res$output)