vbpca: Regularized Variational Bayes Principal Component Analysis...
In davidevdt/bayespca: A Package for Estimating PCA with Variational Bayes Inference


View source: R/vbpca.functions.R

Estimation of regularized PCA with a Variational Bayes algorithm.

vbpca(X, D = 1, maxIter = 500, tolerance = 1e-05, verbose = FALSE, tau = 1,
     updatetau = FALSE, priorvar = 'invgamma', SVS = FALSE, priorInclusion = 0.5,
     global.var = FALSE, control = list(), suppressWarnings = FALSE)

## Default S3 method:
vbpca(X, D = 1, maxIter = 500, tolerance = 1e-05, verbose = FALSE, tau = 1,
     updatetau = FALSE, priorvar = 'invgamma', SVS = FALSE, priorInclusion = 0.5,
     global.var = FALSE, control = list(), suppressWarnings = FALSE)

## S3 method for class 'vbpca'
print(x, ...)

## S3 method for class 'vbpca'
summary(object, ...)

is.vbpca(object)

`X`	array_like; a real (I, J) data matrix (or data frame) to be reduced.
`D`	integer; the number of components to be computed.
`maxIter`	integer; maximum number of variational algorithm iterations.
`tolerance`	float; stopping criterion for the variational algorithm (relative differences between ELBO values).
`verbose`	bool; logical value which indicates whether the estimation process information should be printed.
`tau`	float; when `priorvar = 'fixed'`, the value used to fill the matrix of the prior precisions (inverse variances) of the elements of W. When `priorvar = 'invgamma'` or `priorvar = 'jeffrey'`, `tau` is the starting value of the precisions.
`updatetau`	bool; when `priorvar = 'fixed'`, it specifies whether the prior variances of the elements of W should be updated via Type-II maximum likelihood.
`priorvar`	character; type of hyperprior for the prior variances of the elements of W: no prior (`priorvar = 'fixed'`), Jeffreys prior (`priorvar = 'jeffrey'`), or Inverse Gamma prior (`priorvar = 'invgamma'`). See `vbpca_control` for the specification of the hyperparameters of the Inverse Gamma distribution.
`SVS`	bool; specifies whether Stochastic Variable Selection (a type of 'spike-and-slab' prior) should be included in the computations of the Variational Bayes algorithm.
`priorInclusion`	float or array_like; in SVS, the prior inclusion probabilities; these can be fixed, or random variables with Beta priors (see `vbpca_control` for further information). When not fixed, the value denotes the starting values of the prior probabilities. The argument can be specified as a scalar, or as a D-dimensional array, in which case the prior inclusion probabilities will be regarded as component-specific.
`global.var`	bool; it specifies whether `tau` should be updated globally (component-specific updates) or locally (element-specific updates).
`control`	list; other control parameters. See `vbpca_control` for further details.
`suppressWarnings`	bool; boolean argument which hides function warnings when `TRUE`.
`x, object`	vbpca object; an object of class `vbpca`, used as argument for the `print`, `is.vbpca` and `summary` functions.
`...`	not used.
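
How `maxIter` and `tolerance` interact can be sketched with a toy loop. This is an illustration only: the ELBO sequence is made up (`toy_elbo` is a hypothetical stand-in for one variational update), and the exact form of the relative-difference test used inside `vbpca` is an assumption.

```r
# Toy illustration of a relative-ELBO stopping rule (not the package's internals).
maxIter   <- 500
tolerance <- 1e-05

# Hypothetical ELBO trajectory: increases monotonically towards -100.
toy_elbo <- function(iter) -100 - 50 * exp(-0.5 * iter)

elbo_old  <- toy_elbo(0)
converged <- FALSE
for (iter in seq_len(maxIter)) {
  elbo_new <- toy_elbo(iter)
  # Relative change between consecutive ELBO values (assumed form of the criterion).
  rel_diff <- abs((elbo_new - elbo_old) / elbo_old)
  if (rel_diff < tolerance) {
    converged <- TRUE
    break
  }
  elbo_old <- elbo_new
}
cat("converged:", converged, "after", iter, "iterations\n")
```

If the loop reaches `maxIter` without the relative change dropping below `tolerance`, `converged` stays `FALSE`, which mirrors the meaning of the `converged` element in the returned object.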

The function allows performing PCA decomposition of an (I, J) input matrix X. For D principal components, the factorization occurs through:

X = X W P^T + E

where P is the (J, D) orthogonal loading matrix (P^T P = I), W is the (J, D) weight matrix, and E is an (I, J) residual matrix. The principal components are defined by X W. In this context, the focus of the inference is on the weight matrix W. The Variational Bayes algorithm treats the elements of W as latent variables, whereas P and sigma^2 (the variance of the residuals) are treated as fixed parameters.
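
As a reference point for the factorization, the sketch below uses classical (unregularized) PCA in base R, where the weights coincide with the loadings (W = P). It is not the Variational Bayes algorithm; the simulated data and all variable names are illustrative assumptions.

```r
# Classical PCA as a special case of X = X W P^T + E (base R only).
set.seed(1)
I <- 200; J <- 5; D <- 2

# Simulated, column-centered (I, J) data matrix.
X <- scale(matrix(rnorm(I * J), I, J) %*% matrix(rnorm(J * J), J, J),
           center = TRUE, scale = FALSE)

sv <- svd(X)
P  <- sv$v[, 1:D, drop = FALSE]  # (J, D) loading matrix
W  <- P                          # classical PCA: weights equal loadings

scores <- X %*% W                # principal components, (I, D)
E      <- X - scores %*% t(P)    # residual matrix, (I, J)

# P^T P should be the identity, up to floating-point error.
ortho_err <- max(abs(crossprod(P) - diag(D)))
# The residual sum of squares equals the sum of the discarded squared singular values.
resid_ss  <- sum(E^2)
```

In `vbpca` the weights W are instead estimated under a prior, so they generally differ from P.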

In order to regularize the elements of W, a Multivariate Normal (MVN) prior is assumed for the columns of W. The multivariate normals have the 0-vector as mean, and a diagonal covariance matrix with variance tau (note that the `tau` argument and the returned `Tau` element are expressed as precisions, i.e. inverse variances). Different specifications of tau (either fixed, updated via Type-II maximum likelihood, or random with Jeffreys or Inverse Gamma priors) allow achieving different levels of regularization on the elements of W. Furthermore, tau can be updated with local information, or by sharing information with the other elements of the same component of W (global.var = TRUE). The latter option can be useful when deciding how many components should be used during the estimation stage. When Inverse Gamma priors are specified, the shape hyperparameter (alphatau) is regarded as fixed, while the scale hyperparameter (betatau) can be fixed or random; in turn, a random betatau can be updated with local, component-specific, or global hyperpriors. See vbpca_control for further details on hyperparameter specification.
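
The different treatments of tau map onto argument combinations roughly as sketched below. Only arguments listed in the usage section are used; the toy data, the list names, and the idea of comparing the shape of `Tau` across fits are illustrative assumptions, and whether every combination is accepted has not been verified here (errors are reported rather than stopping the run).

```r
# Argument sets for different treatments of tau (names of the list entries are arbitrary).
tau_specs <- list(
  fixed           = list(priorvar = 'fixed', tau = 1, updatetau = FALSE),
  type2_ml        = list(priorvar = 'fixed', tau = 1, updatetau = TRUE),
  jeffrey         = list(priorvar = 'jeffrey'),
  invgamma_local  = list(priorvar = 'invgamma', global.var = FALSE),
  invgamma_global = list(priorvar = 'invgamma', global.var = TRUE)
)

# Small toy data matrix, for illustration only.
set.seed(1)
X_toy <- matrix(rnorm(100 * 4), nrow = 100, ncol = 4)

# Fit each specification only if the package is installed.
if (requireNamespace("bayespca", quietly = TRUE)) {
  fits <- lapply(tau_specs, function(args) {
    tryCatch(
      do.call(bayespca::vbpca, c(list(X = X_toy, D = 2), args)),
      error = function(e) conditionMessage(e)
    )
  })
  # Inspect the shape of Tau for the fits that succeeded.
  print(lapply(Filter(is.list, fits), function(f) dim(as.matrix(f$Tau))))
}
```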

When SVS = TRUE, a spike-and-slab prior allows for variable selection on the elements of W. In particular, a mixture prior is imposed on the elements of W; priorInclusion controls the mixing proportions of this prior. Variables not included in the model are assumed to be more likely to come from a Normal distribution with variance tau scaled by a factor v0 (see vbpca_control for the specification of the factor). Similar to tau, priorInclusion can be fixed, or treated as a random variable with Beta priors. Furthermore, priorInclusion can refer to prior probabilities of the whole model (across all components) when specified as a scalar, or to component-specific prior probabilities when specified as a D-dimensional array.
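
The shape of such a mixture prior can be illustrated by simulating from it in base R. This is a generic spike-and-slab sketch, not the package's internal code: `slab_var` follows the variance wording of the paragraph above, and the values of `slab_var` and `v0` are hypothetical.

```r
# Draw weights from a two-component (spike-and-slab) Normal mixture prior.
set.seed(42)
n_weights      <- 1e4
priorInclusion <- 0.5    # prior probability that a weight is included (slab)
slab_var       <- 1      # variance of the slab component (hypothetical value)
v0             <- 1e-3   # scaling factor for the spike variance (hypothetical value)

# Latent inclusion indicators: TRUE = slab, FALSE = spike.
included <- rbinom(n_weights, size = 1, prob = priorInclusion) == 1

# Included weights come from the slab, excluded ones from the much narrower spike.
w <- ifelse(included,
            rnorm(n_weights, mean = 0, sd = sqrt(slab_var)),
            rnorm(n_weights, mean = 0, sd = sqrt(v0 * slab_var)))

prior_summary <- c(share_included = mean(included),
                   var_slab       = var(w[included]),
                   var_spike      = var(w[!included]))
print(prior_summary)
```

Weights drawn from the spike are concentrated near zero, which is what lets the posterior inclusion probabilities separate relevant from irrelevant elements of W.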

vbpca returns a 'vbpca' object, which is a list containing the following elements:

`muW`	array_like; posterior means of the weight matrix; (J, D) dimensional array.
`P`	array_like; point estimate of the (orthogonal) loading matrix; (J, D) dimensional array.
`Tau`	array_like; the point estimates (or posterior means) of the inverse prior variances; depending on the specification of `tau`, it can be a D-dimensional vector or a (J, D) dimensional array.
`sigma2`	float; point estimate of the variance of the residuals.
`HPDI`	list; a list containing the highest posterior density intervals of the elements of W.
`priorAlpha`	array_like; the values used for the `alphatau` hyperparameters of the Inverse Gamma priors.
`priorBeta`	array_like; a (J, D) or D dimensional array (or a scalar), with the values used for the scale hyperparameters of the Inverse Gamma priors. When `betatau` is a random variable, its posterior means are returned.
`priorInclusion`	array_like; scalar or D dimensional array containing the prior inclusion probabilities used (or estimated) by the model.
`inclusionProbabilities`	array_like; a (J, D) dimensional array containing the estimated posterior inclusion probabilities of the elements of W.
`elbo`	float; evidence lower bound of the model.
`converged`	bool; boolean denoting whether the Variational Bayes algorithm converged within the required number of iterations.
`time`	array_like; computation time of the algorithm.
`priorvar`	character; type of prior variance specified as input by the user.
`global.var`	bool; `global.var` specified as input by the user.
`hypertype`	character; hyperprior type specified as input (in the `control` list) by the user.
`SVS`	bool; boolean denoting whether stochastic variable selection was activated, as required by the user.
`plot`	traceplot of the evidence lower bounds computed across the various iterations of the algorithm.

D. Vidotto <d.vidotto@uvt.nl>

[1] C. M. Bishop. 'Variational PCA'. In Proc. Ninth Int. Conf. on Artificial Neural Networks. ICANN, 1999.
[2] E. I. George, R. E. McCulloch (1993). 'Variable Selection via Gibbs Sampling'. Journal of the American Statistical Association (88), 881-889.

vbpca_control

# Load the package (assumes bayespca is installed)
library(bayespca)

# Create a synthetic dataset (seed fixed for reproducibility)
set.seed(123)
I <- 1e+3
X1 <- rnorm(I, 0, 50)
X2 <- rnorm(I, 0, 30)
X3 <- rnorm(I, 0, 10)

# Eight observed variables: noisy copies of three underlying signals
X <- cbind(X1, X1, X1, X2, X2, X2, X3, X3)
X <- X + matrix(rnorm(length(X), 0, 1), ncol = ncol(X), nrow = I)

# Estimate the Bayesian PCA model, with Inverse Gamma priors for tau
# and SVS with Beta priors for priorInclusion
ctrl <- vbpca_control(alphatau = 1, betatau = 1e-2, beta1pi = 1, beta2pi = 1)
mod <- vbpca(X, D = 3, priorvar = 'invgamma', SVS = TRUE, control = ctrl)
summary(mod)
mod
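
The elements of the returned list can be post-processed with small helpers such as the ones sketched here. The helper names, the 0.5 threshold, and the stand-in object `fake_fit` are assumptions made for illustration; with a fitted model, `mod` would be passed instead. Any centering or scaling applied internally by `vbpca` is not reproduced here.

```r
# Component scores X W, using the posterior means muW (hypothetical helper).
component_scores <- function(X, fit) {
  stopifnot(!is.null(fit$muW), ncol(X) == nrow(fit$muW))
  as.matrix(X) %*% fit$muW
}

# Elements of W whose posterior inclusion probability exceeds a threshold
# (hypothetical helper; the 0.5 cut-off is an arbitrary choice).
selected_weights <- function(fit, threshold = 0.5) {
  stopifnot(!is.null(fit$inclusionProbabilities))
  fit$inclusionProbabilities > threshold
}

# Stand-in object carrying two of the documented elements, so the sketch
# runs without the package; replace it with a fitted 'vbpca' object.
fake_fit <- list(
  muW = matrix(c(1, 0, 0,
                 0, 1, 0), nrow = 3),
  inclusionProbabilities = matrix(c(0.9, 0.1, 0.2,
                                    0.3, 0.8, 0.4), nrow = 3)
)
X_demo <- matrix(1:6, nrow = 2)   # (2, 3) toy data matrix

demo_scores   <- component_scores(X_demo, fake_fit)
demo_selected <- selected_weights(fake_fit)
print(demo_scores)
print(demo_selected)
```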