Description Usage Arguments Details Value Warnings Note Author(s) References See Also Examples
Fits Richard A. Harshman's Parallel Factors (Parafac) model to 3-way or 4-way data arrays. Parameters are estimated via alternating least squares with optional constraints.
1 2 3 4 5 6 7 | parafac(X, nfac, nstart = 10, const = NULL, control = NULL,
Afixed = NULL, Bfixed = NULL, Cfixed = NULL, Dfixed = NULL,
Astart = NULL, Bstart = NULL, Cstart = NULL, Dstart = NULL,
Astruc = NULL, Bstruc = NULL, Cstruc = NULL, Dstruc = NULL,
Amodes = NULL, Bmodes = NULL, Cmodes = NULL, Dmodes = NULL,
maxit = 500, ctol = 1e-4, parallel = FALSE, cl = NULL,
output = c("best", "all"), verbose = TRUE, backfit = FALSE)
|
X |
Three-way data array with |
nfac |
Number of factors. |
nstart |
Number of random starts. |
const |
Character vector of length 3 or 4 giving the constraints for each mode (defaults to unconstrained). See |
control |
List of parameters controlling options for smoothness constraints. This is passed to |
Afixed |
Used to fit model with fixed Mode A weights. |
Bfixed |
Used to fit model with fixed Mode B weights. |
Cfixed |
Used to fit model with fixed Mode C weights. |
Dfixed |
Used to fit model with fixed Mode D weights. |
Astart |
Starting Mode A weights. Default uses random weights. |
Bstart |
Starting Mode B weights. Default uses random weights. |
Cstart |
Starting Mode C weights. Default uses random weights. |
Dstart |
Starting Mode D weights. Default uses random weights. |
Astruc |
Structure constraints for Mode A weights. See Note. |
Bstruc |
Structure constraints for Mode B weights. See Note. |
Cstruc |
Structure constraints for Mode C weights. See Note. |
Dstruc |
Structure constraints for Mode D weights. See Note. |
Amodes |
Mode ranges for Mode A weights (for unimodality constraints). See Note. |
Bmodes |
Mode ranges for Mode B weights (for unimodality constraints). See Note. |
Cmodes |
Mode ranges for Mode C weights (for unimodality constraints). See Note. |
Dmodes |
Mode ranges for Mode D weights (for unimodality constraints). See Note. |
maxit |
Maximum number of iterations. |
ctol |
Convergence tolerance (R^2 change). |
parallel |
Logical indicating if |
cl |
Cluster created by |
output |
Output the best solution (default) or output all |
verbose |
If |
backfit |
Should backfitting algorithm be used for |
Given a 3-way array X = array(x, dim = c(I,J,K))
, the 3-way Parafac model can be written as
X[i,j,k] = sum A[i,r]*B[j,r]*C[k,r] + E[i,j,k]
|
where A = matrix(a,I,R)
are the Mode A (first mode) weights, B = matrix(b,J,R)
are the Mode B (second mode) weights, C = matrix(c,K,R)
are the Mode C (third mode) weights, and E = array(e,dim=c(I,J,K))
is the 3-way residual array. The summation is for r = seq(1,R)
.
Given a 4-way array X = array(x, dim = c(I,J,K,L))
, the 4-way Parafac model can be written as
X[i,j,k,l] = sum A[i,r]*B[j,r]*C[k,r]*D[l,r] + E[i,j,k,l]
|
where D = matrix(d,L,R)
are the Mode D (fourth mode) weights, E = array(e,dim=c(I,J,K,L))
is the 4-way residual array, and the other terms can be interprered as previously described.
Weight matrices are estimated using an alternating least squares algorithm with optional constraints.
If output = "best"
, returns an object of class "parafac"
with the following elements:
A |
Mode A weight matrix. |
B |
Mode B weight matrix. |
C |
Mode C weight matrix. |
D |
Mode D weight matrix. |
SSE |
Sum of Squared Errors. |
Rsq |
R-squared value. |
GCV |
Generalized Cross-Validation. |
edf |
Effective degrees of freedom. |
iter |
Number of iterations. |
cflag |
Convergence flag. See Note. |
const |
See argument |
control |
See argument |
fixed |
Logical vector indicating whether 'fixed' weights were used for each mode. |
struc |
Logical vector indicating whether 'struc' constraints were used for each mode. |
Otherwise returns a list of length nstart
where each element is an object of class "parafac"
.
The algorithm can perform poorly if the number of factors nfac
is set too large.
Missing data should be specified as NA
values in the input X
. The missing data are randomly initialized and then iteratively imputed as a part of the algorithm.
Structure constraints should be specified with a matrix of logicals (TRUE/FALSE), such that FALSE elements indicate a weight should be constrained to be zero. Default uses unstructured weights, i.e., a matrix of all TRUE values.
When using unimodal constraints, the *modes
inputs can be used to specify the mode search range for each factor. These inputs should be matrices with dimension c(2,nfac)
where the first row gives the minimum mode value and the second row gives the maximum mode value (with respect to the indicies of the corresponding weight matrix).
Output cflag
gives convergence information: cflag = 0
if algorithm converged normally, cflag = 1
if maximum iteration limit was reached before convergence, and cflag = 2
if algorithm terminated abnormally due to a problem with the constraints.
Nathaniel E. Helwig <helwig@umn.edu>
Harshman, R. A. (1970). Foundations of the PARAFAC procedure: Models and conditions for an "explanatory" multimodal factor analysis. UCLA Working Papers in Phonetics, 16, 1-84.
Harshman, R. A., & Lundy, M. E. (1994). PARAFAC: Parallel factor analysis. Computational Statistics and Data Analysis, 18, 39-72.
Helwig, N. E. (2017). Estimating latent trends in multivariate longitudinal data via Parafac2 with functional and structural constraints. Biometrical Journal, 59(4), 783-803.
Helwig, N. E. (in prep). Constrained parallel factor analysis via the R package multiway.
Hitchcock, F. L. (1927). The expression of a tensor or a polyadic as a sum of products. Journal of Mathematics and Physics, 6, 164-189.
The cpd
function implements an N-way extension without constraints.
The fitted.parafac
function creates the model-implied fitted values from a fit "parafac"
object.
The resign.parafac
function can be used to resign factors from a fit "parafac"
object.
The rescale.parafac
function can be used to rescale factors from a fit "parafac"
object.
The reorder.parafac
function can be used to reorder factors from a fit "parafac"
object.
The cmls
function (from CMLS package) is called as a part of the alternating least squares algorithm.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 | ########## 3-way example ##########
# create random data array with Parafac structure
set.seed(3)
mydim <- c(50, 20, 5)
nf <- 3
Amat <- matrix(rnorm(mydim[1]*nf), nrow = mydim[1], ncol = nf)
Bmat <- matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf)
Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf)
Xmat <- tcrossprod(Amat, krprod(Cmat, Bmat))
Xmat <- array(Xmat, dim = mydim)
Emat <- array(rnorm(prod(mydim)), dim = mydim)
Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1
X <- Xmat + Emat
# fit Parafac model (unconstrained)
pfac <- parafac(X, nfac = nf, nstart = 1)
pfac
# fit Parafac model (non-negativity on Modes B and C)
pfacNN <- parafac(X, nfac = nf, nstart = 1,
const = c("uncons", "nonneg", "nonneg"))
pfacNN
# check solution
Xhat <- fitted(pfac)
sum((Xmat - Xhat)^2) / prod(mydim)
# reorder and resign factors
pfac$B[1:4,]
pfac <- reorder(pfac, c(3,1,2))
pfac$B[1:4,]
pfac <- resign(pfac, mode="B")
pfac$B[1:4,]
Xhat <- fitted(pfac)
sum((Xmat - Xhat)^2) / prod(mydim)
# rescale factors
colSums(pfac$B^2)
colSums(pfac$C^2)
pfac <- rescale(pfac, mode = "C", absorb = "B")
colSums(pfac$B^2)
colSums(pfac$C^2)
Xhat <- fitted(pfac)
sum((Xmat - Xhat)^2) / prod(mydim)
########## 4-way example ##########
# create random data array with Parafac structure
set.seed(4)
mydim <- c(30,10,8,10)
nf <- 4
aseq <- seq(-3, 3, length.out = mydim[1])
Amat <- cbind(dnorm(aseq), dchisq(aseq+3.1, df=3),
dt(aseq-2, df=4), dgamma(aseq+3.1, shape=3, rate=1))
Bmat <- svd(matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf), nv = 0)$u
Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf)
Cstruc <- Cmat > 0.5
Cmat <- Cmat * Cstruc
Dmat <- matrix(runif(mydim[4]*nf), nrow = mydim[4], ncol = nf)
Xmat <- tcrossprod(Amat, krprod(Dmat, krprod(Cmat, Bmat)))
Xmat <- array(Xmat, dim = mydim)
Emat <- array(rnorm(prod(mydim)), dim = mydim)
Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1
X <- Xmat + Emat
# fit Parafac model (unimodal and smooth A, orthogonal B,
# non-negative and structured C, non-negative D)
pfac <- parafac(X, nfac = nf, nstart = 1, Cstruc = Cstruc,
const = c("unismo", "orthog", "nonneg", "nonneg"))
pfac
# check solution
Xhat <- fitted(pfac)
sum((Xmat - Xhat)^2) / prod(mydim)
congru(Amat, pfac$A)
crossprod(pfac$B)
pfac$C
Cstruc
## Not run:
########## parallel computation ##########
# create random data array with Parafac structure
set.seed(3)
mydim <- c(50,20,5)
nf <- 3
Amat <- matrix(rnorm(mydim[1]*nf), nrow = mydim[1], ncol = nf)
Bmat <- matrix(runif(mydim[2]*nf), nrow = mydim[2], ncol = nf)
Cmat <- matrix(runif(mydim[3]*nf), nrow = mydim[3], ncol = nf)
Xmat <- tcrossprod(Amat, krprod(Cmat, Bmat))
Xmat <- array(Xmat, dim = mydim)
Emat <- array(rnorm(prod(mydim)), dim = mydim)
Emat <- nscale(Emat, 0, ssnew = sumsq(Xmat)) # SNR = 1
X <- Xmat + Emat
# fit Parafac model (10 random starts -- sequential computation)
set.seed(1)
system.time({pfac <- parafac(X, nfac = nf)})
pfac
# fit Parafac model (10 random starts -- parallel computation)
cl <- makeCluster(detectCores())
ce <- clusterEvalQ(cl, library(multiway))
clusterSetRNGStream(cl, 1)
system.time({pfac <- parafac(X, nfac = nf, parallel = TRUE, cl = cl)})
pfac
stopCluster(cl)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.