Description Usage Arguments Details Value References Examples
This function employs the PX-CAVI algorithm proposed in Ning (2021). The method uses the sparse spiked-covariance model and the spike and slab prior (see below). Two different slab densities can be used: independent Laplace densities and a multivariate normal density. In Ning (2021), it recommends choosing the multivariate normal distribution. The algorithm allows the user to decide whether she/he wants to center and scale their data. The user is also allowed to change the default values of the parameters of each prior.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
dat |
Data an n*p matrix. |
r |
Rank. |
lambda |
Tuning parameter for the density g. |
slab.prior |
The density g, the default is "MVN", the multivariate normal distribution. Another choice is "Laplace". |
max.iter |
The maximum number of iterations for running the algorithm. |
eps |
The convergence threshold; the default is 10^{-4}. |
jointly.row.sparse |
The default is true, which means that the jointly row sparsity assumption is used; one could not use this assumptio by changing it to false. |
center.scale |
The default if false. If true, then the input date will be centered and scaled. |
sig2.true |
The default is false, σ^2 will be estimated; if sig2 is known and its value is given, then σ^2 will not be estimated. |
threshold |
The threshold to determine whether γ_j is 0 or 1; the default value is 0.5. |
theta.int |
The initial value of theta mean; if not provided, the algorithm will estimate it using PCA. |
theta.var.int |
The initial value of theta.var; if not provided, the algorithm will set it to be 1e-3*diag(r). |
kappa.para1 |
The value of α_1 of π(κ); default is 1. |
kappa.para2 |
The value of α_2 of π(κ); default is p+1. |
sigma.a |
The value of σ_a of π(σ^2); default is 1. |
sigma.b |
The value of σ_b of π(σ^2); default is 2. |
The model is
X_i = θ w_i + σ ε_i
where w_i \sim N(0, I_r), ε \sim N(0,I_p).
The spike and slab prior is given by
π(θ, \boldsymbol γ|λ_1, r) \propto ∏_{j=1}^p ≤ft(γ_j \int_{A \in V_{r,r}} g(θ_j|λ_1, A, r) π(A) d A+ (1-γ_j) δ_0(θ_j)\right)
g(θ_j|λ_1, A, r) = C(λ_1)^r \exp(-λ_1 \|β_j\|_q^m)
γ_j| κ \sim Bernoulli(κ)
κ \sim Beta(α_1, α_2)
σ^2 \sim InvGamma(σ_a, σ_b)
where V_{r,r} = \{A \in R^{r \times r}: A'A = I_r\} and δ_0 is the Dirac measure at zero. The density g can be chosen to be the product of independent Laplace distribution (i.e., q = 1, m =1) or the multivariate normal distribution (i.e., q = 2, m = 2).
iter |
The number of iterations to reach convergence. |
selection |
A vector (if r = 1 or with the jointly row-sparsity assumption) or a matrix (if otherwise) containing the estimated value for \boldsymbol γ. |
loadings |
The loadings matrix. |
uncertainty |
The covariance of each non-zero rows in the loadings matrix. |
scores |
Score functions for the r principal components. |
sig2 |
Variance of the noise. |
obj.fn |
A vector contains the value of the objective function of each iteration. It can be used to check whether the algorithm converges |
Ning, B. (2021). Spike and slab Bayesian sparse principal component analysis. arXiv:2102.00305.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | #In this example, the first 20 rows in the loadings matrix are nonzero, the rank is 2
set.seed(2021)
library(MASS)
library(pracma)
n <- 200
p <- 1000
s <- 20
r <- 2
sig2 <- 0.1
# generate eigenvectors
U.s <- randortho(s, type = c("orthonormal"))
if (r == 1) {
U <- rep(0, p)
U[1:s] <- as.vector(U.s[, 1:r])
} else {
U <- matrix(0, p, r)
U[1:s, ] <- U.s[, 1:r]
}
s.star <- rep(0, p)
s.star[1:s] <- 1
eigenvalue <- seq(20, 10, length.out = r)
# generate Sigma
if (r == 1) {
theta.true <- U * sqrt(eigenvalue)
Sigma <- tcrossprod(theta.true) + sig2*diag(p)
} else {
theta.true <- U %*% sqrt(diag(eigenvalue))
Sigma <- tcrossprod(theta.true) + sig2 * diag(p)
}
# generate n*p dataset
X <- t(mvrnorm(n, mu = rep(0, p), Sigma = Sigma))
result <- VBsparsePCA(dat = t(X), r = 2, jointly.row.sparse = TRUE, center.scale = FALSE)
loadings <- result$loadings
scores <- result$scores
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.