Parafac: Robust Parafac estimator for compositional data

Parafac R Documentation

Robust Parafac estimator for compositional data

Description

Compute a robust Parafac model for compositional data

Usage

Parafac(X, ncomp = 2, center = FALSE, 
    center.mode = c("A", "B", "C", "AB", "AC", "BC", "ABC"),
    scale=FALSE, scale.mode=c("B", "A", "C"), 
    const="none", conv = 1e-06, start="svd", maxit=10000, 
    optim=c("als", "atld", "int2"),
    robust = FALSE, coda.transform=c("none", "ilr", "clr"), 
    ncomp.rpca = 0, alpha = 0.75, robiter = 100, crit=0.975, trace = FALSE)

Arguments

X

3-way array of data

ncomp

Number of components

center

Whether to center the data

center.mode

If centering the data, on which mode to do this

scale

Whether to scale the data

scale.mode

If scaling the data, on which mode to do this

const

Optional constraints for each mode. Can be a three element character vector or a single character, one of "none" for no constraints (default), "orth" for orthogonality constraints, "nonneg" for nonnegativity constraints or "zerocor" for zero correlation between the extracted factors. For example, const="orth" means orthogonality constraints for all modes, while const=c("orth", "none", "none") sets the orthogonality constraint only for mode A.

conv

Convergence criterion, defaults to 1e-6

start

Initial values for the A, B and C components. Can be "svd" to start the algorithm from SVDs, "random" for a random starting point (orthonormalized component matrices, or nonnegative matrices in case of a nonnegativity constraint), or a list containing user-specified components.

maxit

Maximum number of iterations, default is maxit=10000.

optim

How to optimize the CP loss function. The default is ALS, i.e. optim="als". Other options are ATLD (optim="atld") and INT2 (optim="int2"). Please note that ATLD cannot be used with the robust option.

robust

Whether to apply a robust estimation

coda.transform

If the data are a composition, use an ilr or clr transformation. Default is non-compositional data, i.e. coda.transform="none"

ncomp.rpca

Number of components for robust PCA

alpha

Measures the fraction of outliers the algorithm should resist. Allowed values are between 0.5 and 1 and the default is 0.75

robiter

Maximal number of iterations for robust estimation

crit

Cut-off for identifying outliers, default crit=0.975

trace

Logical, provide trace output
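As a rough illustration of what the coda.transform options compute, the following base R sketch applies the clr and ilr transformations to a single composition. The helper names clr, ilr_basis and ilr and the example vector x are assumptions made for this illustration. In particular, the Helmert-type basis used for ilr is only one common choice and may differ from the basis the package uses internally.

```r
## Centred log-ratio: log of each part minus the mean log (parts must be > 0)
clr <- function(x) {
    lx <- log(x)
    lx - mean(lx)
}

## Orthonormal basis of the clr hyperplane (Helmert-type contrasts).
## Assumption: one common choice, not necessarily the package's basis.
ilr_basis <- function(D) {
    V <- matrix(0, nrow = D, ncol = D - 1)
    for (j in seq_len(D - 1)) {
        V[seq_len(j), j] <- 1 / j
        V[j + 1, j] <- -1
        V[, j] <- V[, j] * sqrt(j / (j + 1))
    }
    V
}

## Isometric log-ratio: coordinates of clr(x) in that basis (D - 1 values)
ilr <- function(x) {
    as.vector(crossprod(ilr_basis(length(x)), clr(x)))
}

x <- c(0.2, 0.3, 0.5)   # hypothetical 3-part composition
clr(x)                  # three values that sum to zero
ilr(x)                  # two coordinates
```

Both transformations depend only on the ratios between the parts, so rescaling x (for example from proportions to percentages) leaves the result unchanged.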

Details

The function can compute four versions of the Parafac model:

  1. Classical Parafac,

  2. Parafac for compositional data,

  3. Robust Parafac and

  4. Robust Parafac for compositional data.

This is controlled through the parameters robust=TRUE and coda.transform=c("none", "ilr").
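The four versions listed above could be requested as in the following sketch. It assumes that the rrcov3way package is installed and that the ulabor data set from the Examples section is suitable for all four calls. The list name fits and its element names are chosen for this illustration only.

```r
## Sketch: the four model versions, selected via `robust` and `coda.transform`.
## Runs only if the rrcov3way package is available.
if (requireNamespace("rrcov3way", quietly = TRUE)) {
    library(rrcov3way)
    data(ulabor)   # data set used in the Examples section

    fits <- list(
        classical     = Parafac(ulabor),
        compositional = Parafac(ulabor, coda.transform = "ilr"),
        robust        = Parafac(ulabor, robust = TRUE),
        robust_coda   = Parafac(ulabor, robust = TRUE, coda.transform = "ilr")
    )

    ## Compare the fit percentage (component `fp`) across the versions
    print(sapply(fits, function(f) f$fp))
}
```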

Value

An object of class "parafac" which is basically a list with components:

fit

Fit value

fp

Fit percentage

ss

Sum of squares

A

Orthogonal loading matrix for the A-mode

B

Orthogonal loading matrix for the B-mode

Bclr

Orthogonal loading matrix for the B-mode, clr transformed. Available only if coda.transform="ilr", otherwise NULL

C

Orthogonal loading matrix for the C-mode

Xhat

(Robustly) reconstructed array

const

Optional constraints (same as the input parameter)

iter

Number of iterations

rd

Residual distances

sd

Score distances

flag

Observations whose residual distance rd is larger than cutoff.rd or whose score distance sd is larger than cutoff.sd can be considered outliers and receive a flag equal to zero. Regular observations receive a flag equal to 1.

robust

The parameter robust, i.e. whether the robust method was used or not

coda.transform

The coda transformation that was used, one of "none", "ilr" or "clr".
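The flagging rule described for the flag component can be sketched in base R as follows. The distances and cut-off values below are hypothetical numbers chosen for illustration; they are not output of Parafac().

```r
## Hypothetical distances and cut-offs, for illustration only
rd <- c(0.8, 1.1, 3.9, 0.7, 1.0)   # residual distances
sd <- c(1.2, 0.9, 1.4, 4.2, 1.1)   # score distances
cutoff.rd <- 2.5
cutoff.sd <- 3.0

## 1 = regular observation, 0 = flagged as a potential outlier
flag <- as.integer(rd <= cutoff.rd & sd <= cutoff.sd)
flag

## Indices of the flagged observations
which(flag == 0)
```

An observation is flagged as soon as either distance exceeds its cut-off, so in this sketch the third observation (large rd) and the fourth (large sd) are both flagged.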

Author(s)

Valentin Todorov valentin.todorov@chello.at and Maria Anna Di Palma madipalma@unior.it and Michele Gallo mgallo@unior.it

References

Harshman, R.A. (1970). Foundations of Parafac procedure: models and conditions for an "explanatory" multi-mode factor analysis. UCLA Working Papers in Phonetics, 16: 1–84.

Engelen, S., Frosch, S. and Jorgensen, B.M. (2009). A fully robust PARAFAC method analyzing fluorescence data. Journal of Chemometrics, 23(3): 124–131.

Kroonenberg, P.M. (1983). Three-mode principal component analysis: Theory and applications (Vol. 2), DSWO press.

Rousseeuw, P.J. and Van Driessen, K. (1999). A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41(3): 212–223.

Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G. and Barceló-Vidal, C. (2003). Isometric logratio transformations for compositional data analysis. Mathematical Geology, 35(3): 279–300.

Examples

#############
##
## Example with the UNIDO Manufacturing value added data

data(va3way)
dim(va3way)

## Quick-and-dirty treatment of the zeros in the data set (if any)
va3way[va3way==0] <- 0.001

## Classical Parafac with the default settings
res <- Parafac(va3way)
res
print(res$fit)
print(res$A)

## Distance-distance plot
plot(res, which="dd", main="Distance-distance plot")

## Robust Parafac for compositional data (ilr transformation)
data(ulabor)
res <- Parafac(ulabor, robust=TRUE, coda.transform="ilr")
res

## Orthonormalized A-mode component plot
plot(res, which="comp", mode="A", main="Component plot, A-mode")

## Orthonormalized B-mode component plot
plot(res, which="comp", mode="B", main="Component plot, B-mode")

## Orthonormalized C-mode component plot
plot(res, which="comp", mode="C", main="Component plot, C-mode")



rrcov3way documentation built on July 9, 2023, 7:44 p.m.