fit.dependency.model: Fit dependency model between two data sets.


Description

Fit a generative latent variable model (see the vignette for the model specification) on two data sets. The solutions can be regularized with priors, including constraints on the marginal covariance structures, the structure of W, and the latent dimensionality. Probabilistic versions of PCA, factor analysis and CCA are available as special cases.

Usage

fit.dependency.model(X, Y, zDimension = 1, marginalCovariances = "full",
                     epsilon = 1e-3,
                     priors = list(), matched = TRUE,
                     includeData = TRUE, calculateZ = TRUE, verbose = FALSE)
ppca(X, Y = NULL, zDimension = NULL, includeData = TRUE, calculateZ = TRUE)
pfa(X, Y = NULL, zDimension = NULL, includeData = TRUE, calculateZ = TRUE, priors = NULL)
pcca(X, Y, zDimension = NULL, includeData = TRUE, calculateZ = TRUE)

Arguments

X, Y

Data sets X and Y, given as 'variables x samples' matrices (variables in rows, samples in columns). The second data set (Y) is optional.

zDimension

Dimensionality of the shared latent variable.

marginalCovariances

Structure of marginal covariances, assuming multivariate Gaussian distributions for the dataset-specific effects. Options: "identical isotropic", "isotropic", "diagonal" and "full". The difference between the isotropic and identical isotropic options is that in the isotropic model phi$X != phi$Y in general, whereas in the identical isotropic model phi$X = phi$Y.
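The four options can be pictured with small hand-built matrices. The following is a minimal base R sketch, not package output: the dimension `p` and all numeric values are arbitrary assumptions chosen only to show the shape of each structure.

```r
# Illustration only: shapes of the four covariance structures for
# p = 3 variables in each data set. All numbers are arbitrary.
p <- 3

# "identical isotropic": a single shared variance, so phi$X equals phi$Y
sigma2 <- 0.5
phi_identical <- list(X = sigma2 * diag(p), Y = sigma2 * diag(p))

# "isotropic": one variance per data set, so phi$X != phi$Y in general
phi_isotropic <- list(X = 0.5 * diag(p), Y = 2.0 * diag(p))

# "diagonal": one variance per variable, no covariances between variables
phi_diagonal <- list(X = diag(c(0.5, 1.0, 1.5)), Y = diag(c(2.0, 0.1, 0.7)))

# "full": any symmetric positive definite matrix
A <- matrix(c(1.0, 0.3, 0.1,
              0.3, 1.0, 0.2,
              0.1, 0.2, 1.0), nrow = p)
phi_full <- list(X = A, Y = crossprod(A))
```

Moving from "identical isotropic" towards "full" increases the number of free covariance parameters, so the more constrained options act as stronger regularizers.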

epsilon

Convergence limit.

priors

Prior parameters for the model. A list, which can contain some of the following elements:

W

Rate parameter for exponential distribution (should be positive). Used to specify the prior for Wx and Wy in the dependency model. The exponential prior is used to produce non-negative solutions for W; small values of the rate parameter correspond to an uninformative prior distribution.

Nm.wxwy.mean

Mean of the matrix normal prior distribution for the transformation matrix T. Must be a matrix of size (variables in first data set) x (variables in second data set). If the value is 1, Nm.wxwy.mean is set to an identity matrix of the appropriate size.

Nm.wxwy.sigma

Variance parameter for the matrix normal prior distribution of the transformation matrix T. Describes the allowed scale of deviation of the transformation matrix T from the mean matrix Nm.wxwy.mean.

matched

Logical indicating whether the variables (dimensions) are matched between X and Y. Applicable only when dimX = dimY. Affects the results only when a prior on the relationship Wx ~ Wy is set, i.e. when priors$Nm.wxwy.sigma < Inf.

includeData

Logical indicating whether the original data is included in the model output. Set to FALSE to save memory.

calculateZ

Logical indicating whether an expectation of the latent variable Z is included in the model output. Otherwise the expectation can be calculated with getZ or z.expectation. Using FALSE speeds up the calculation of the dependency model.

verbose

Logical indicating whether to print intermediate messages that follow the progress of the procedure.

Details

The fit.dependency.model function fits the dependency model X ~ N(W$X * Z, phi$X); Y ~ N(W$Y * Z, phi$Y) with the possibility to tune the model structure and parameter priors.
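To make the model concrete, data can be simulated from it in base R. This is a minimal sketch under stated assumptions: the latent variable is drawn as Z ~ N(0, I) (the page defers the full specification to the vignette), the noise is isotropic for simplicity, and all dimensions, variances and variable names are illustrative choices.

```r
# Simulate from X ~ N(Wx Z, phi_x), Y ~ N(Wy Z, phi_y) in 'variables x samples' layout.
# Assumption: Z ~ N(0, I); dimensions and noise variances are arbitrary.
set.seed(1)
n  <- 200                      # number of samples (columns)
dx <- 4                        # variables in X
dy <- 3                        # variables in Y
dz <- 1                        # dimensionality of the shared latent variable

Z  <- matrix(rnorm(dz * n), nrow = dz)    # shared latent variable
Wx <- matrix(rnorm(dx * dz), nrow = dx)   # loading matrix for X
Wy <- matrix(rnorm(dy * dz), nrow = dy)   # loading matrix for Y

# Isotropic dataset-specific noise: phi_x = 0.1 * I, phi_y = 0.2 * I
X <- Wx %*% Z + sqrt(0.1) * matrix(rnorm(dx * n), nrow = dx)
Y <- Wy %*% Z + sqrt(0.2) * matrix(rnorm(dy * n), nrow = dy)

dim(X)  # dx rows (variables), n columns (samples)
dim(Y)  # dy rows (variables), n columns (samples)
```

Because both data sets share the same Z, any dependency between X and Y arises through the latent variable, while phi$X and phi$Y capture variation specific to each data set.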

In particular, the dataset-specific covariance structure phi can be defined; non-negative priors for W are possible; the relation between W$X and W$Y can be tuned. For a comprehensive set of examples, see the example scripts in the tests/ directory of this package.

Special cases of the model, obtained with particular prior assumptions, include probabilistic canonical correlation analysis (pcca; Bach & Jordan 2005), probabilistic principal component analysis (ppca; Tipping & Bishop 1999), probabilistic factor analysis (pfa; Rubin & Thayer 1982), and a regularized version of canonical correlation analysis (pSimCCA; Lahti et al. 2009).

The standard probabilistic PCA and factor analysis are methods for a single data set (X ~ N(WZ, phi)), with isotropic and diagonal covariance (phi) for pPCA and pFA, respectively. Analogous models for two data sets are obtained by concatenating the two data sets, and performing pPCA or pFA.

Such special cases are obtained with the following choices in the fit.dependency.model function:

pPCA

marginalCovariances = "identical isotropic" (Tipping & Bishop 1999)

pFA

marginalCovariances = "diagonal" (Rubin & Thayer 1982)

pCCA

marginalCovariances = "full" (Bach & Jordan 2005)

pSimCCA

marginalCovariances = "full", priors = list(Nm.wxwy.mean = I, Nm.wxwy.sigma = 0). This is the default method and corresponds to the case W$X = W$Y (Lahti et al. 2009)

pSimCCA with T prior

marginalCovariances = "isotropic", priors = list(Nm.wxwy.mean = 1, Nm.wxwy.sigma = 1) (Lahti et al. 2009)
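The first three choices listed above differ only in the marginalCovariances argument, so they can be written as a loop over the option strings. The sketch below is untested against the package and makes assumptions: that dmt is installed, that data(modelData) provides X and Y as in the Examples section, and that zDimension = 1 is a reasonable setting. The names `special_cases` and `models` are illustrative. The calls are guarded so the block does nothing when the package is absent.

```r
# Option strings as listed above, keyed by special case
special_cases <- c(pPCA = "identical isotropic",
                   pFA  = "diagonal",
                   pCCA = "full")

# Guarded: the fits run only if the dmt package is available
if (requireNamespace("dmt", quietly = TRUE)) {
  library(dmt)
  data(modelData)  # example data X, Y shipped with the package

  # Fit one model per covariance structure; priors are left at their defaults
  models <- lapply(special_cases, function(mc) {
    fit.dependency.model(X, Y, zDimension = 1, marginalCovariances = mc)
  })
}
```

The wrapper functions ppca, pfa and pcca shown under Usage are the more direct route to these special cases.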

To avoid computational singularities, the covariance matrix phi is regularised by adding a small constant to the diagonal.
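The effect of this kind of regularisation can be seen on a rank-deficient covariance matrix. The base R sketch below is a generic illustration, not the package's internal code: the constant `ridge` and its value are hypothetical, since the page does not state the constant actually used.

```r
# A sample covariance from fewer samples than variables is singular
set.seed(1)
X <- matrix(rnorm(5 * 3), nrow = 5)   # 5 variables x 3 samples
S <- cov(t(X))                        # 5 x 5 matrix, rank at most 2

# solve(S) would fail or be numerically meaningless here.
# Adding a small constant to the diagonal makes the matrix positive definite.
ridge <- 1e-3                         # hypothetical value, for illustration only
S_reg <- S + ridge * diag(nrow(S))
S_inv <- solve(S_reg)                 # inversion now succeeds
```

The constant shifts every eigenvalue of the matrix up by the same amount, so the smallest eigenvalue becomes strictly positive while the dominant structure is barely changed.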

Value

DependencyModel

Author(s)

Olli-Pekka Huovilainen ohuovila@gmail.com and Leo Lahti leo.lahti@iki.fi

References

Dependency Detection with Similarity Constraints, Lahti et al., 2009 Proc. MLSP'09 IEEE International Workshop on Machine Learning for Signal Processing, http://arxiv.org/abs/1101.5919

A Probabilistic Interpretation of Canonical Correlation Analysis, Bach Francis R. and Jordan Michael I. 2005 Technical Report 688. Department of Statistics, University of California, Berkeley. http://www.di.ens.fr/~fbach/probacca.pdf

Probabilistic Principal Component Analysis, Tipping Michael E. and Bishop Christopher M. 1999. Journal of the Royal Statistical Society, Series B, 61, Part 3, pp. 611–622. http://research.microsoft.com/en-us/um/people/cmbishop/downloads/Bishop-PPCA-JRSS.pdf

EM Algorithms for ML Factor Analysis, Rubin D. and Thayer D. 1982. Psychometrika, vol. 47, no. 1.

See Also

Output class for this function: DependencyModel. Special cases: ppca, pfa, pcca

Examples

data(modelData) # Load example data X, Y

# probabilistic CCA
model <- pcca(X, Y)

# dependency model with priors (W>=0; Wx = Wy; full marginal covariances)
model <- fit.dependency.model(X, Y, zDimension = 1,
                              priors = list(W = 1e-3, Nm.wxwy.sigma = 0),
                              marginalCovariances = "full")

# Getting the latent variable Z when it has been calculated with the model
#getZ(model)

dmt documentation built on May 1, 2019, 8:12 p.m.