cpd: N-way Canonical Polyadic Decomposition

Description Usage Arguments Details Value Warnings Note Author(s) References See Also Examples

Description

Fits Frank L. Hitchcock's Canonical Polyadic Decomposition (CPD) to N-way data arrays. Parameters are estimated via alternating least squares.

Usage

1
2
3
cpd(X, nfac, nstart = 10, maxit = 500, 
    ctol = 1e-4, parallel = FALSE, cl = NULL, 
    output = "best", verbose = TRUE)

Arguments

X

N-way data array. Missing data are allowed (see Note).

nfac

Number of factors.

nstart

Number of random starts.

maxit

Maximum number of iterations.

ctol

Convergence tolerance (R^2 change).

parallel

Logical indicating if parLapply should be used. See Examples.

cl

Cluster created by makeCluster. Only used when parallel=TRUE.

output

Output the best solution (default) or output all nstart solutions.

verbose

If TRUE, fitting progress is printed via txtProgressBar. Ignored if parallel=TRUE.

Details

This is an N-way extension of the parafac function without constraints. The form of the CPD for 3-way and 4-way data is given in the documentation for the parafac function. For N > 4, the CPD has the form

X[i.1, ..., i.N] = sum A.1[i.1,r] * ... * A.N[i.N,r] + E[i.1, ..., i.N]

where A.n are the n-th mode's weights for n = 1, ..., N, and E is the N-way residual array. The summation is for r = seq(1,R).

Value

A

List of length N containing the weights for each mode.

SSE

Sum of Squared Errors.

Rsq

R-squared value.

GCV

Generalized Cross-Validation.

edf

Effective degrees of freedom.

iter

Number of iterations.

cflag

Convergence flag. See Note.

Warnings

The algorithm can perform poorly if the number of factors nfac is set too large.

Note

Missing data should be specified as NA values in the input X. The missing data are randomly initialized and then iteratively imputed as a part of the algorithm.

Output cflag gives convergence information: cflag = 0 if algorithm converged normally and cflag = 1 if maximum iteration limit was reached before convergence.

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis. UCLA Working Papers in Phonetics, 16, 1-84.

Harshman, R. A., & Lundy, M. E. (1994). PARAFAC: Parallel factor analysis. Computational Statistics and Data Analysis, 18, 39-72.

Hitchcock, F. L. (1927). The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6, 164-189.

See Also

The parafac function provides a more flexible implemention for 3-way and 4-way arrays.

The fitted.cpd function creates the model-implied fitted values from a fit "cpd" object.

The resign.cpd function can be used to resign factors from a fit "cpd" object.

The rescale.cpd function can be used to rescale factors from a fit "cpd" object.

The reorder.cpd function can be used to reorder factors from a fit "cpd" object.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
##########   3-way example   ##########

# create random data array with CPD/Parafac structure
set.seed(3)
mydim <- c(50, 20, 5)
nf <- 3
Amat <- matrix(rnorm(mydim[1]*nf), nrow = mydim[1], ncol = nf)
Bmat <- matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf)
Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf)
Xmat <- tcrossprod(Amat, krprod(Cmat, Bmat))
Xmat <- array(Xmat, dim = mydim)
Emat <- array(rnorm(prod(mydim)), dim = mydim)
Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat))   # SNR = 1
X <- Xmat + Emat

# fit CPD model
set.seed(0)
cano <- cpd(X, nfac = nf, nstart = 1)
cano

# fit Parafac model
set.seed(0)
pfac <- parafac(X, nfac = nf, nstart = 1)
pfac


##########   4-way example   ##########

# create random data array with CPD/Parafac structure
set.seed(4)
mydim <- c(30,10,8,10)
nf <- 4
aseq <- seq(-3, 3, length.out = mydim[1])
Amat <- cbind(dnorm(aseq), dchisq(aseq+3.1, df=3),
              dt(aseq-2, df=4), dgamma(aseq+3.1, shape=3, rate=1))
Bmat <- svd(matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf), nv = 0)$u
Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf)
Cstruc <- Cmat > 0.5
Cmat <- Cmat * Cstruc
Dmat <- matrix(runif(mydim[4]*nf), nrow = mydim[4], ncol = nf)
Xmat <- tcrossprod(Amat, krprod(Dmat, krprod(Cmat, Bmat)))
Xmat <- array(Xmat, dim = mydim)
Emat <- array(rnorm(prod(mydim)), dim = mydim)
Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat))   # SNR = 1
X <- Xmat + Emat

# fit CPD model
set.seed(0)
cano <- cpd(X, nfac = nf, nstart = 1)
cano

# fit Parafac model
set.seed(0)
pfac <- parafac(X, nfac = nf, nstart = 1)
pfac


##########   5-way example   ##########

# create random data array with CPD/Parafac structure
set.seed(5)
mydim <- c(5, 6, 7, 8, 9)
nmode <- length(mydim)
nf <- 3
Amat <- vector("list", nmode)
for(n in 1:nmode) {
  Amat[[n]] <- matrix(rnorm(mydim[n] * nf), mydim[n], nf)
}
Zmat <- krprod(Amat[[3]], Amat[[2]])
for(n in 4:5) Zmat <- krprod(Amat[[n]], Zmat)
Xmat <- tcrossprod(Amat[[1]], Zmat)
Xmat <- array(Xmat, dim = mydim)
Emat <- array(rnorm(prod(mydim)), dim = mydim)
Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat))   # SNR = 1
X <- Xmat + Emat

# fit CPD model
set.seed(0)
cano <- cpd(X, nfac = nf, nstart = 1)
cano

multiway documentation built on May 2, 2019, 6:47 a.m.