# BCSub: A Bayesian semiparametric factor analysis model for subtype... In BCSub: A Bayesian Semiparametric Factor Analysis Model for Subtype Identification (Clustering)

## Description

A Bayesian semiparametric factor analysis model for subtype identification (Clustering).

## Usage

 `1` ```BCSub(A = NULL, iter = 1000, seq = 200:1000, M = 5) ```

## Arguments

 `A` Data matrix with rows being subjects and columns being genes. `iter` Total number of iterations (including burn-in period). `seq` Posterior samples used for inference of cluster structure. `M` Number of factors.

## Value

returns a list with following objects.

 `CL` Inferred cluster strucutre based on the posterior samples. `E` A matrix with each column being the cluster structre at each iteration.

## References

A Bayesian Semiparametric Factor Analysis Model for Subtype Identification. Jiehuan Sun, Joshua L. Warren, and Hongyu Zhao.

## Examples

 ``` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45``` ```set.seed(1) n = 100 ## number of subjects G = 200 ## number of genes SNR = 0 ## ratio of noise genes ## loading matrix with four factors lam = matrix(0,G,4) lam[1:(G/4),1] = runif(G/4,-3,3) lam[(G/4+1):(G/2),2] = runif(G/4,-3,3) lam[(G/2+1):(3*G/4),3] = runif(G/4,-3,3) lam[(3*G/4+1):(G),4] = runif(G/4,-3,3) ## generate low-rank covariance matrix sigma <- lam%*%t(lam) + diag(rep(1,G)) sigma <- cov2cor(sigma) ## true cluster structure ## e.true = c(rep(1,n/2),rep(2,n/2)) ## generate data matrix ## mu1 = rep(1,G) mu1[sample(1:G,SNR*G)] = 0 mu2 <- rep(0,G) A = rbind(mvrnorm(n/2,mu1,sigma),mvrnorm(n/2,mu2,sigma)) ## factor analysis to decide the number of factors ## Not run: ev = eigen(cor(A)) ap = parallel(subject=nrow(A),var=ncol(A),rep=100,cent=.05) nS = nScree(x=ev\$values, aparallel=ap\$eigen\$qevpea) M = nS\$Components[1,3] ## number of factors ## End(Not run) M = 4 ## run BCSub for clustering iters = 1000 ## total number of iterations seq = 600:1000 ## posterior samples used for inference system.time(res <- BCSub(A,iter=iters,seq=seq,M=M)) res\$CL ## inferred cluster structure ## calculate and plot similarity matrix sim = calSim(t(res\$E[,seq])) ## plot similarity matrix x <- rep(1:n,times=n) y <- rep(1:n,each=n) z <- as.vector(sim) levelplot(z~x*y,col.regions=rev(gray.colors(n^2)), xlab = "Subject ID",ylab = "Subject ID") ```

BCSub documentation built on May 2, 2019, 2:49 a.m.