POEM | R Documentation |
POEM takes into account missing values, outlier indicators, error indicators and sampling weights.
POEM(
data,
weights,
outind,
errors,
missing.matrix,
alpha = 0.5,
beta = 0.5,
reweight.out = FALSE,
c = 5,
preliminary.mean.imputation = FALSE,
monitor = FALSE
)
data |
a data frame or matrix with the data. |
weights |
sampling weights. |
outind |
an indicator vector for the outliers, with 1 indicating an outlier. |
errors |
matrix of indicators for items which failed edits. |
missing.matrix |
the missingness matrix can be given as input. Otherwise, it will be recalculated. |
alpha |
scalar giving the weight attributed to an item that is failing. |
beta |
minimal overlap to accept a donor. |
reweight.out |
if TRUE, the outliers are redefined. |
c |
tuning constant when redefining the outliers (cutoff for Mahalanobis distance). |
preliminary.mean.imputation |
assume the problematic observation is at the mean of good observations. |
monitor |
if TRUE, verbose output. |
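The interplay of weights, outind, errors and alpha can be pictured with a small base R sketch. It is not the package's code: the toy data, the name failed, and the coding "1 = item failed an edit" are assumptions made for illustration, as is the idea that alpha simply multiplies the sampling weight of a failing item.

```r
# Toy data: 5 observations, 2 variables (values invented for illustration)
x <- matrix(c(1, 2, 3, 4, 100,
              2, 4, 6, 8, 10), ncol = 2)
w      <- c(1, 1, 2, 2, 1)        # sampling weights
outind <- c(0, 0, 0, 0, 1)        # 1 = outlier (last observation)
failed <- matrix(0, nrow = 5, ncol = 2)
failed[2, 1] <- 1                 # assumed coding: 1 = item failed an edit
alpha  <- 0.5

# Item-level weights: zero for outlying observations,
# sampling weight times alpha for items that failed an edit
item.w <- w * (1 - outind) * alpha^failed

# Weighted coordinate means based on the downweighted items
center <- colSums(item.w * x) / colSums(item.w)
center
```

With alpha = 1 failing items would count fully, and with alpha = 0 they would be ignored like outliers.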
POEM assumes that a multivariate outlier detection has been carried out
beforehand and that the result is summarized in the vector outind.
In addition, further observations may have been flagged as failing edit rules;
this information is given in the matrix errors. The mean and covariance
estimates are calculated from the good observations (no outliers, and errors
downweighted). Preliminary mean imputation is sometimes needed to avoid
a non-positive definite covariance estimate at this stage. It assumes that the
problematic values of an observation (erroneous, outlying or missing) can be
replaced by the mean of the remaining non-problematic observations. Note that
the algorithm imputes these problematic observations afterwards, so the final
covariance matrix computed from the imputed data is not the same as the working
covariance matrix (which may be based on preliminary mean imputation).
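The idea of preliminary mean imputation can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not POEM's internal code: the simulated data and the logical matrix problem (TRUE = missing, erroneous or outlying item) are hypothetical.

```r
set.seed(1)
n <- 20
x <- cbind(a = rnorm(n, 10), b = rnorm(n, 5), c = rnorm(n, 0))
w <- rep(1, n)                       # sampling weights

# Hypothetical indicator of problematic items
problem <- matrix(FALSE, n, 3)
problem[1:3, 1]  <- TRUE
problem[4, 2:3]  <- TRUE
x[problem] <- NA                     # treat them as unusable

# Weighted column means of the non-problematic values
good   <- 1 - problem                # 1 = usable value
x0     <- replace(x, problem, 0)     # avoid NA in the sums
center <- colSums(w * good * x0) / colSums(w * good)

# Preliminary mean imputation: fill each problematic item with its column mean
x.pre <- x
for (j in seq_len(ncol(x))) x.pre[problem[, j], j] <- center[j]

# Working covariance and a check that it is positive definite
work.cov <- cov.wt(x.pre, wt = w / sum(w))$cov
all(eigen(work.cov, symmetric = TRUE, only.values = TRUE)$values > 0)
```

Because the filled-in values sit exactly at the column means, this working covariance understates the spread of the affected variables, which is one reason it differs from the covariance of the finally imputed data.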
POEM returns a list whose first component output is a sub-list with the
following components:
preliminary.mean.imputation
Logical. TRUE if preliminary mean imputation should be used
completely.missing
Number of observations with no observed values
good.values
Weighted number of good values (not missing, not outlying, not erroneous)
nonoutliers.before
Number of nonoutliers before reweighting
weighted.nonoutliers.before
Weighted number of nonoutliers before reweighting
nonoutliers.after
Number of nonoutliers after reweighting
weighted.nonoutliers.after
Weighted number of nonoutliers after reweighting
old.center
Coordinate means after weighting, before imputation
old.variances
Coordinate variances after weighting, before imputation
new.center
Coordinate means after weighting, after imputation
new.variances
Coordinate variances after weighting, after imputation
covariance
Covariance (of standardised observations) before imputation
imputed.observations
Indices of observations with imputed values
donors
Indices of donors for imputed observations
new.outind
Indices of new outliers
The further component returned by POEM is:
imputed.data
Imputed data set
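The components nonoutliers.before, nonoutliers.after and new.outind relate to redefining the outliers with a Mahalanobis distance cutoff. The base R sketch below shows one way such a rule could look. It is an assumption for illustration only: the documentation does not state whether the tuning constant is compared with the distance or its square, or whether the distances are rescaled first, and the toy data are invented.

```r
# Toy data: 8 regular observations and one gross outlier (row 9)
x <- cbind(c(1, 2, 3, 4, 5, 6, 7, 8,  50),
           c(2, 1, 4, 3, 6, 5, 8, 7, -40))
outind <- c(rep(0, 8), 1)            # result of an earlier outlier detection
good   <- outind == 0

# Center and covariance from the observations currently regarded as good
center  <- colMeans(x[good, ])
cov.mat <- cov(x[good, ])

# mahalanobis() returns squared distances, so take the square root
md <- sqrt(mahalanobis(x, center, cov.mat))

# Assumed rule: an observation is an outlier if its distance exceeds cutoff
cutoff     <- 5                      # plays the role of the argument c
new.outind <- which(md > cutoff)
new.outind
```

Here only the planted outlier exceeds the cutoff, so the sets of nonoutliers before and after the redefinition coincide.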
Beat Hulliger
Béguin, C. and Hulliger, B. (2002). EUREDIT Workpackage x.2 D4-5.2.1-2.C: Develop and evaluate new methods for statistical outlier detection and outlier robust multivariate imputation. Technical report, EUREDIT 2002.
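# Load the example data and its sampling weights
data(bushfirem, bushfire.weights)
# Flag observations 31 to 38 as outliers (1 = outlier)
outliers <- rep(0, nrow(bushfirem))
outliers[31:38] <- 1
# Impute, using preliminary mean imputation for the working covariance
imp.res <- POEM(bushfirem, bushfire.weights, outliers,
                preliminary.mean.imputation = TRUE)
# Diagnostics first, then the covariance of the imputed data
print(imp.res$output)
var(imp.res$imputed.data)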