sp_msfa: Posterior sampling for the Bayesian MSFA model for sparse...


View source: R/Bayesian.R

Description

The function implements the Gibbs sampler described in De Vito et al. (2020). The code is suited to small and moderate-size data sets, so it can readily reproduce some of the results of the paper, but not those for large data sets, which would require much longer computation times. The outputlevel argument matters in practice. With outputlevel = 1 (the default), all the MCMC chains are saved, which produces a rather bulky output. With outputlevel = 2, only the chains for the loading matrix of common factors are saved. With outputlevel = 3, no chain is saved, and the output also reports the posterior means of the crossproduct of the loading matrices.
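To get a feel for how bulky the outputlevel = 1 output can become, the following sketch estimates the memory taken by the saved chains. It is only a back-of-the-envelope calculation: the helper chain_size_mb and its arguments are hypothetical, the array dimensions are taken from the Value section of this page, and n_draws stands for the number of saved draws, (nrun - burn)/thin.

```r
# Rough size, in megabytes, of the chains kept at outputlevel = 1.
# Hypothetical helper, not part of the MSFA package.
chain_size_mb <- function(p, k, j_s, n_s, n_draws) {
  per_draw <- p * k +        # Phi: p x k
    sum(p * j_s) +           # Lambda: p x j_s[s] for each study
    length(j_s) * p +        # psi: p x 1 for each study
    sum(n_s * k) +           # f_s: n_s[s] x k for each study
    sum(n_s * j_s)           # l_s: n_s[s] x j_s[s] for each study
  per_draw * n_draws * 8 / 1024^2   # 8 bytes per double
}

# Illustrative sizes: 1000 variables, two studies, 10000 saved draws
size_mb <- chain_size_mb(p = 1000, k = 5, j_s = c(3, 3),
                         n_s = c(100, 150), n_draws = 10000)
size_mb
```

With these illustrative sizes the estimate is a little over 1 GB, which is why a higher outputlevel may be preferable when only posterior summaries are needed.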

Usage

sp_msfa(
  X_s,
  k,
  j_s,
  trace = TRUE,
  nprint = 1000,
  outputlevel = 1,
  control = list(...),
  ...
)

Arguments

X_s

List of length S, where S is the number of studies considered. Each element of the list is a data matrix, with the same number of columns P for all the studies. Standardization is carried out by the function.

k

Number of common factors.

j_s

Number of study-specific factors. A vector of positive integers of length S.

trace

If TRUE, trace information is printed every nprint iterations of the Gibbs sampler. Default is TRUE.

nprint

Frequency of tracing information. Default is every 1000 iterations.

outputlevel

Level of detail of the output. See Description. Default is 1.

control

A list of hyperparameters for the prior distributions and for controlling the Gibbs sampling. See sp_msfa_control.

...

Arguments to be used to form the default control argument if it is not supplied directly.
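As an illustration of the input shapes described above, here is a minimal sketch that builds X_s, k and j_s from simulated data. The sizes and the seed are arbitrary assumptions, and the final call is left as a comment because it requires the MSFA package to be installed.

```r
set.seed(1)
S <- 3                  # number of studies (illustrative)
P <- 10                 # number of variables, shared by all studies
n_s <- c(40, 55, 30)    # sample sizes may differ across studies

# One n_s[s] x P data matrix per study
X_s <- lapply(n_s, function(n) matrix(rnorm(n * P), nrow = n, ncol = P))

k <- 2                  # number of common factors
j_s <- c(1, 2, 1)       # study-specific factors, one entry per study

# Sanity checks matching the argument descriptions
stopifnot(length(X_s) == S,
          all(vapply(X_s, ncol, integer(1)) == P),
          length(j_s) == S,
          all(j_s > 0))

# With the MSFA package installed, the sampler would then be called as:
# fit <- MSFA::sp_msfa(X_s, k = k, j_s = j_s, trace = FALSE)
```

The checks fail early if the studies do not share the same number of columns or if j_s does not have one entry per study.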

Value

A list containing the posterior samples for the model parameters. If outputlevel = 1, the components of the list are:

Phi

Common factor loadings. An array of dimension p x k x (nrun - burn)/thin.

Lambda

Study-specific factor loadings. A list of arrays of dimension p x j_s[s] x (nrun - burn)/thin.

psi

Study-specific uniquenesses. A list of arrays of dimension p x 1 x (nrun - burn)/thin.

f_s

Study-specific latent factors associated with the common factor loadings. A list of arrays of dimension nrow(X_s[[s]]) x k x (nrun - burn)/thin.

l_s

Study-specific latent factors associated with the study-specific factor loadings. A list of arrays of dimension nrow(X_s[[s]]) x j_s[s] x (nrun - burn)/thin.

When outputlevel > 1, the arrays are instead replaced by posterior means. The matrix SigmaPhi or the list of matrices SigmaLambda, containing the posterior means of the crossproducts of the loading matrices, will be returned when outputlevel is different from 1.
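The array layout above lends itself to simple posterior summaries. The sketch below uses a simulated array as a stand-in for the Phi component, so it runs without the package; the sampler settings nrun, burn and thin are hypothetical values, and reading the crossproduct as Phi multiplied by its transpose (a p x p matrix) is an assumption.

```r
set.seed(2)
p <- 6
k <- 2
nrun <- 1000; burn <- 500; thin <- 5   # hypothetical sampler settings
n_draws <- (nrun - burn) / thin        # number of saved draws

# Stand-in for the Phi component: a p x k x n_draws array of draws
Phi <- array(rnorm(p * k * n_draws), dim = c(p, k, n_draws))

# Posterior mean of the loadings, averaging over the third dimension
Phi_mean <- apply(Phi, c(1, 2), mean)

# Posterior mean of Phi %*% t(Phi), accumulated draw by draw
SigmaPhi_hat <- matrix(0, p, p)
for (i in seq_len(n_draws)) {
  SigmaPhi_hat <- SigmaPhi_hat + tcrossprod(Phi[, , i]) / n_draws
}

dim(Phi_mean)       # p x k
dim(SigmaPhi_hat)   # p x p
```

Averaging the crossproduct over draws is not the same as taking the crossproduct of Phi_mean, which is one reason to summarise the crossproduct directly.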

References

De Vito, R., Bellio, R., Trippa, L. and Parmigiani, G. (2020). Bayesian Multi-study Factor Analysis for High-throughput Biological Data. Submitted manuscript.


rdevito/MSFA documentation built on March 18, 2020, 2:57 p.m.