pMST: The pMST Algorithm


Description

The function determines a robust subsample and computes estimates of location and scatter on the subset.

Usage

pMST(data, N = floor((nrow(data) + ncol(data) + 1)/2), lmax = nrow(data) * 100)

Arguments

data

Data set to be analyzed: a matrix with at least two columns whose number of rows (observations, n) is greater than its number of columns (dimension, d).

N

Size of the (robust) subsample to be determined. Default is floor((n + d + 1)/2).

lmax

Numerical option: the maximum number of pruning steps; see Details.

Details

The function uses the minimum.spanning.tree function from the igraph package to compute the minimum spanning tree (MST) of the data. The MST is then pruned iteratively by deleting edges, starting with the longest edge, until a connected subset of sufficient size (N) remains. Location and scatter are estimated from this robust subsample.
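
The pruning idea can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper names mst_edges, largest_component and prune_sketch are hypothetical, the MST is built with a plain Prim's algorithm instead of igraph, the lmax limit is ignored, and the stopping rule (keep deleting the longest remaining edges while the largest connected component still has at least N points) is only one reading of the description above.

```r
# Hypothetical helpers for illustration only; none of these are part of restlos.

# MST of the Euclidean distance graph via Prim's algorithm.
# Returns an (n - 1) x 3 matrix with columns a, b (vertex indices) and len.
mst_edges <- function(x) {
  d <- as.matrix(dist(x))
  n <- nrow(d)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  best <- d[1, ]          # shortest known distance from the tree to each point
  from <- rep(1L, n)      # tree vertex that realises this distance
  edges <- matrix(0, n - 1, 3, dimnames = list(NULL, c("a", "b", "len")))
  for (k in seq_len(n - 1)) {
    cand <- which(!in_tree)
    j <- cand[which.min(best[cand])]
    edges[k, ] <- c(from[j], j, best[j])
    in_tree[j] <- TRUE
    closer <- !in_tree & d[j, ] < best
    best[closer] <- d[j, closer]
    from[closer] <- j
  }
  edges
}

# Indices of the largest connected component of a forest on n vertices.
largest_component <- function(n, edges) {
  label <- seq_len(n)
  for (i in seq_len(nrow(edges))) {
    a <- label[edges[i, "a"]]
    b <- label[edges[i, "b"]]
    label[label == b] <- a          # merge the two components
  }
  tab <- table(label)
  which(label == as.integer(names(tab)[which.max(tab)]))
}

# Delete the longest MST edges one by one and keep the largest component
# for as long as it still contains at least N observations.
prune_sketch <- function(x, N = floor((nrow(x) + ncol(x) + 1) / 2)) {
  e <- mst_edges(x)
  e <- e[order(e[, "len"], decreasing = TRUE), , drop = FALSE]
  keep <- seq_len(nrow(x))
  for (k in seq_len(nrow(e))) {
    cand <- largest_component(nrow(x), e[-seq_len(k), , drop = FALSE])
    if (length(cand) < N) break
    keep <- cand
  }
  keep
}

# Toy data: six clustered points followed by two distant outliers.
x <- cbind(c(0, 1, 2.1, 3.3, 4.6, 6.0, 50, 120),
           c(0, 0.1, 0, 0.1, 0, 0.1, 0, 0))
idx <- prune_sketch(x, N = 6)

# Plain moments of the subsample; pMST itself may rescale its estimates.
loc_sketch <- colMeans(x[idx, ])
cov_sketch <- cov(x[idx, ])
```

On this toy data the two outliers are attached to the cluster by the two longest MST edges, so they are cut off first; pruning stops once a further deletion would leave fewer than N connected points.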

Value

loc

Location estimate based on the robust subsample.

cov

Covariance estimate based on the robust subsample.

sample

Indices of the observations in the robust subsample.

data

The input data set.

Author(s)

Thomas Kirschstein <thomas.kirschstein@wiwi.uni-halle.de>

References

Kirschstein, T., Liebscher, S., and Becker, C. (2013): Robust estimation of location and scatter by pruning the minimum spanning tree, Journal of Multivariate Analysis, 120, 173-184, DOI: 10.1016/j.jmva.2013.05.004.

Liebscher, S., Kirschstein, T. (2015): Efficiency of the pMST and RDELA Location and Scatter Estimators, AStA-Advances in Statistical Analysis, 99(1), 63-82, DOI: 10.1007/s10182-014-0231-7.

Examples

# Determine subsample of minimal size
# sub <- pMST(halle)
# Determine subsample of size 900
# extsub <- pMST(halle, N=900)

restlos documentation built on May 2, 2019, 2:45 p.m.