scordis: Score distances (SD) in a PCA or PLS score space

scordis R Documentation

Description

scordis calculates score distances (SD) from a PCA or PLS model, i.e. the Mahalanobis distances between the projections of the row observations on the score space and the center of the score space.
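The definition above can be sketched with base R only. This is an illustration, not the rnirs implementation: prcomp stands in for pca, the object names (scores, sd_scores) are made up for the example, and the covariance estimator (divisor n - 1) is an assumption that may differ from what scordis uses internally.

```r
# Minimal sketch: score distances as Mahalanobis distances in the score space
set.seed(1)
X <- matrix(rnorm(8 * 6, mean = 10), ncol = 6, byrow = TRUE)

ncomp <- 3
fit <- prcomp(X, center = TRUE, scale. = FALSE)   # stand-in for rnirs::pca
scores <- fit$x[, seq_len(ncomp), drop = FALSE]   # projections on the score space

center <- colMeans(scores)   # center of the score space (about 0 for prcomp scores)
S <- cov(scores)             # covariance of the scores (assumed estimator)

# mahalanobis() returns squared distances, hence the square root
sd_scores <- sqrt(mahalanobis(scores, center = center, cov = S))
sd_scores
```

Observations whose projections lie far from the center, relative to the spread of the scores, get a large distance.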

lscordis does the same calculation for each local model (i.e. for each new observation to predict) generated by functions locw, lwplsr, etc.

Usage


scordis(fm, 
    ncomp = NULL, 
    robust = FALSE, alpha = .01)

lscordis(fm, 
    ncomp = NULL, 
    robust = FALSE, alpha = .01)

Arguments

fm

For scordis, output of functions pca, pls or plsr. For lscordis, output of functions locw, lwplsr, etc.

ncomp

Number of components to consider for the distance calculations. If NULL (default), the maximum number of components is considered.

robust

Logical. If TRUE, the moment estimation of the cutoff (see Details) is robustified. This is recommended in particular after a robust PCA or PLS on small data sets containing strong outliers. Defaults to FALSE.

alpha

Type I risk level used to define the cutoff for detecting extreme values (see the code).

Details

The cutoff for detecting extreme SD values is computed using a moment estimation of a Chi-squared distribution for the squared distance (see Pomerantsev 2008).
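A minimal sketch of such a moment estimation follows, assuming the squared distance is modeled as a scaled Chi-squared variable, (mu / nu) * Chi2(nu), so that matching the mean mu and variance s2 gives nu = 2 * mu^2 / s2. The helper sd_cutoff is hypothetical (it is not an rnirs function), and the robust branch (median and MAD in place of mean and variance) is only one plausible way to robustify the moments; the package code remains the reference.

```r
# Hypothetical helper, not part of rnirs: moment-based Chi-squared cutoff
sd_cutoff <- function(d, robust = FALSE, alpha = 0.01) {
    d2 <- d^2
    if (robust) {
        # Assumed robust plug-in estimates of location and scale
        mu <- median(d2)
        s2 <- mad(d2)^2
    } else {
        mu <- mean(d2)
        s2 <- var(d2)
    }
    nu <- 2 * mu^2 / s2                          # estimated degrees of freedom
    sqrt(mu / nu * qchisq(1 - alpha, df = nu))   # cutoff on the SD scale
}

# Simulated distances, for illustration only
set.seed(1)
d <- sqrt(rchisq(200, df = 3))
cutoff <- sd_cutoff(d, alpha = 0.01)
cutoff_rob <- sd_cutoff(d, robust = TRUE, alpha = 0.01)
```

A smaller alpha pushes the cutoff outward, so fewer observations are flagged.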

Column dstand in the output is a "standardized" SD defined as SD / cutoff. A value dstand > 1 may be considered as extreme.

The Winisi "GH" is also provided (considered as extreme if GH > 3).
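The two indicators can be illustrated on made-up numbers. The values of d and cutoff below are invented for the example, and GH is computed with the common convention GH = SD^2 / ncomp, which is an assumption here rather than something stated on this page.

```r
# Illustrative values only (not produced by scordis)
d <- c(0.8, 1.5, 2.1, 3.9)   # score distances
ncomp <- 3                   # number of components used for the distances
cutoff <- 3.2                # cutoff on the SD scale

dstand <- d / cutoff         # standardized SD, extreme if > 1
gh <- d^2 / ncomp            # assumed GH convention, extreme if > 3

res <- data.frame(d, dstand, gh,
                  extreme_sd = dstand > 1,
                  extreme_gh = gh > 3)
res
```

In this toy case both rules flag only the last observation, but the two criteria need not agree in general since they use different thresholds.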

Value

A list of outputs (see examples).

References

Hubert, M., Rousseeuw, P.J., Vanden Branden, K., 2005. ROBPCA: a new approach to robust principal components analysis. Technometrics 47, 64–79.

Pomerantsev, A.L., 2008. Acceptance areas for multivariate classification derived by projection methods. Journal of Chemometrics 22, 601–609. https://doi.org/10.1002/cem.1147

Examples


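library(rnirs)

# Simulate a small data set: n observations, p variables, two responses
n <- 8
p <- 6
set.seed(1)
X <- matrix(rnorm(n * p, mean = 10), ncol = p, byrow = TRUE)
y1 <- 100 * rnorm(n)
y2 <- 100 * rnorm(n)
Y <- cbind(y1, y2)
set.seed(NULL)  # re-initialize the random number generator

# Reference rows (1 to 6) and new rows (7 and 8)
Xr <- X[1:6, ] ; Yr <- Y[1:6, ]
Xu <- X[7:8, ] ; Yu <- Y[7:8, ]

# Model fitted on the reference observations only
fm <- pca(Xr, ncomp = 3)
#fm <- pls(Xr, Yr, ncomp = 3)
scordis(fm)

# Model fitted on Xr, with the new observations Xu also passed to the model
fm <- pca(Xr, Xu, ncomp = 3)
#fm <- pls(Xr, Yr, Xu, ncomp = 3)
#fm <- plsr(Xr, Yr, Xu, ncomp = 3)
scordis(fm)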


mlesnoff/rnirs documentation built on April 24, 2023, 4:17 a.m.