knitr::opts_chunk$set( collapse = TRUE, out.width = "100%" )
fastMCD
performs robust mean and covariance estimation using the FAST-MCD algorithm described by Rousseeuw and van Driessen (1999).
The minimum covariance determinant (MCD) method obtains robust estimates of location and scale for multivariate data by identifying a subset of h observations with the lowest determinant in its covariance matrix. This method is highly robust and accurate but is infeasible for large datasets. The FAST-MCD algorithm, described by Rousseeuw and van Driessen (1999), approximates the MCD estimate with an iterative procedure applied to a set of initial subsamples. FAST-MCD is far faster than MCD and typically is accurate even for large datasets. Superior implementations exist in R already, however the fastMCD
package represents my attempt to independently implement FAST-MCD and understand the underlying algorithm.
devtools::install_github("frankp-0/fastMCD", build_vignettes = T)
Note that installation may take > 1 minute if vignettes are built.
The following examples demonstrate the usage of fastMCD
for small (350 observations) and large (5000 observations) datasets.
smallX <- MASS::mvrnorm(n = 350, mu = rep(0, 3), Sigma = diag(rep(1, 3))) fastMCD(smallX)
bigX <- MASS::mvrnorm(n = 5000, mu = rep(0, 3), Sigma = diag(rep(1, 3))) fastMCD(bigX)
Users may optionally specify their own argument for h = # of observations to subsample.
X <- MASS::mvrnorm(n = 350, mu = rep(0, 3), Sigma = diag(rep(1, 3))) fastMCD(bigX, h = 500)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.