ZIPPCAlnm: ZIPPCAlnm

Description Usage Arguments Value Examples

View source: R/ZIPPCAlnm.R

Description

Zero-Inflated Probabilistic PCA framework with logistical normal multinomial distribution (ZIPPCA-LNM), that extends probabilistic PCA from the Gaussian setting to multivariate abundance data, and an empirical Bayes approach was proposed for inferring microbial compositions. An efficient VA algorithm, classification VA, has been developed for fitting this model.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
ZIPPCAlnm(
  X,
  V = NULL,
  n.factors = 2,
  rank = FALSE,
  trace = FALSE,
  maxit = 100,
  parallel = TRUE,
  sd.errors = FALSE,
  level = 0.95
)

Arguments

X

count matrix of observations.

V

vector of the sample covariate.

n.factors

the rank or number of factors, after dimensional reduction. Defaults to 2.

rank

FALSE, "BIC" or "CV". Indicating whether the rank or number of factors, is chosen from 1 to 5. Options are "BIC" (Bayesian information criterion), and "CV" (Cross-validation). BIC is recommended. Defaults to FALSE.

trace

logical, defaults to FALSE. if TRUE each current iteration step information will be printed.

maxit

maximum number of iterations within optim and constrOptim function, defaults to 100.

parallel

logical, if TRUE, use parallel toolbox to accelerate.

sd.errors

logical, defaults to FALSE. if TRUE the sandwich estimators will be calculated.

level

the confidence level of variational confidence interval. Defaults to 0.95.

Value

VLB

variational lower bound of log likelihood

lvs

list of latent variables

  • pi the probabilities of excess zeros

  • factor_scores coordinates or factor scores in low-dimensional subspace

params

list of model parameters

  • factor_coefs_j coefficients of latent variables fator scores or factor loadings

  • factor_coefs_0 taxon-specific intercepts

  • gamma coeffcients of sample covariate

Q

the underlying composition of microbiome data

bic

the number of the rank selection, chosen by BIC type information criterion

cv

the number of the rank selection, chosen by cross-validation

sd

standard deviation of the sandwich estimators

conf

variational confidence intervals of model parameters

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
n.n = 50
n.w = 100
n.factors = 2
set.seed(1)
si <- diag(n.factors)
me <- c(0,0)
f <- matrix(0,nrow = n.n, ncol = n.factors)
for(i in 1:n.n){
 f[i,] <- mvrnorm(1,me,si)
}
betaj <- matrix(0,nrow = n.w, ncol = n.factors)
for(j in 1:n.w){
  betaj[j,] <- runif(n.factors,-3.5,3.5)
}
beta0 <- rep(2,n.w)
l <- matrix(beta0,n.n,n.w,byrow=TRUE)+f %*%  t(betaj)
eta_j <- rep(0.25,n.w)
z <- matrix(0,n.n,n.w,byrow = TRUE)
for(i in 1:n.n){
  z[i,] <- rbinom(n.w, size=1, prob=eta_j)
}
sum <- rowSums(exp(l))
Qn <- exp(l)/sum
sum <- matrix(rowSums((1-z)*exp(l)),n.n,n.w)
Qn_z <- matrix(I(z==0)*(exp(l)/sum)+I(z==1)*0,n.n,n.w)
X <- matrix(0,n.n,n.w,byrow = TRUE)
for(i in 1:n.n){
  X[i,] <- rmultinom(1, size = runif(1,800,1000), prob = Qn_z[i,])
}
zerorow <- which(rowSums(X)==0)
if(length(zerorow) >0 ){
  X <- X[-zerorow,];Qn <- Qn[-zerorow,];Qn_z <- Qn_z[-zerorow,];
}
zerocol <- which(colSums(X)==0)
if(length(zerocol) >0 ){
  X <- X[,-zerocol];Qn <- Qn[,-zerocol];Qn_z <- Qn_z[,-zerocol];
}
result <- ZIPPCAlnm::ZIPPCAlnm(X)

YanyZeng/ZIPPCAlnm documentation built on May 9, 2021, 3:55 p.m.