BCSub: A Bayesian semiparametric factor analysis model for subtype...


Description

A Bayesian semiparametric factor analysis model for subtype identification (Clustering).

Usage

BCSub(A = NULL, iter = 1000, seq = 200:1000, M = 5)

Arguments

A

Data matrix with rows being subjects and columns being genes.

iter

Total number of iterations (including burn-in period).

seq

Posterior samples used for inference of cluster structure.

M

Number of factors.
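
As a rough illustration of how iter and seq relate, the sketch below builds a seq that keeps every iteration after a chosen burn-in. It assumes that seq holds iteration indices in 1:iter, which is how the Examples section uses it. The helper make_seq is hypothetical and not part of BCSub.

```r
## Hypothetical helper (not part of BCSub): keep every iteration
## after the first `burn_in` iterations.
make_seq <- function(iter, burn_in) {
  stopifnot(burn_in >= 0, burn_in < iter)
  (burn_in + 1):iter
}

## Reproduces the indices of the default seq = 200:1000 when iter = 1000.
seq_default <- make_seq(iter = 1000, burn_in = 199)
range(seq_default)
length(seq_default)
```

The result could then be passed as BCSub(A, iter = 1000, seq = seq_default, M = M); a longer burn-in simply shortens the retained range.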

Value

Returns a list with the following objects.

CL

Inferred cluster structure based on the posterior samples.

E

A matrix with each column being the cluster structure at each iteration.
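
To make the layout of E concrete, the sketch below uses a small made-up matrix with subjects in rows and iterations in columns, and computes how often each pair of subjects shares a cluster label. This is a base-R illustration only; the function co_cluster is hypothetical, and whether calSim computes exactly this quantity is an assumption.

```r
## Toy stand-in for res$E: 4 subjects (rows) x 3 iterations (columns).
E <- cbind(c(1, 1, 2, 2),
           c(1, 1, 1, 2),
           c(2, 2, 1, 1))

## Hypothetical helper: fraction of iterations in which subjects i and j
## carry the same cluster label.
co_cluster <- function(E) {
  n <- nrow(E)
  sim <- matrix(0, n, n)
  for (t in seq_len(ncol(E))) {
    sim <- sim + outer(E[, t], E[, t], "==")
  }
  sim / ncol(E)
}

sim_toy <- co_cluster(E)
sim_toy
```

Because only label equality within a column matters, the measure is unaffected when cluster labels are permuted between iterations (as in the third column of the toy matrix).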

References

A Bayesian Semiparametric Factor Analysis Model for Subtype Identification. Jiehuan Sun, Joshua L. Warren, and Hongyu Zhao.

Examples

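library(BCSub)    ## BCSub, calSim
library(MASS)     ## mvrnorm
library(nFactors) ## parallel, nScree
library(lattice)  ## levelplot
set.seed(1)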
n = 100 ## number of subjects
G = 200 ## number of genes
SNR = 0 ## ratio of noise genes
## loading matrix with four factors
lam = matrix(0,G,4)
lam[1:(G/4),1] = runif(G/4,-3,3)
lam[(G/4+1):(G/2),2] = runif(G/4,-3,3)
lam[(G/2+1):(3*G/4),3] = runif(G/4,-3,3)
lam[(3*G/4+1):(G),4] = runif(G/4,-3,3)
## generate low-rank covariance matrix
sigma <- lam%*%t(lam) + diag(rep(1,G))
sigma <- cov2cor(sigma)
## true cluster structure ##
e.true = c(rep(1,n/2),rep(2,n/2))

## generate data matrix ##
mu1 = rep(1,G)
mu1[sample(1:G,SNR*G)] = 0
mu2 <- rep(0,G)
A = rbind(mvrnorm(n/2,mu1,sigma),mvrnorm(n/2,mu2,sigma))

## factor analysis to decide the number of factors
## Not run: 
ev = eigen(cor(A))
ap = parallel(subject=nrow(A),var=ncol(A),rep=100,cent=.05)
nS = nScree(x=ev$values, aparallel=ap$eigen$qevpea)
M = nS$Components[1,3] ## number of factors

## End(Not run)
M = 4
## run BCSub for clustering
iters = 1000 ## total number of iterations
seq = 600:1000 ## posterior samples used for inference
system.time(res <- BCSub(A,iter=iters,seq=seq,M=M))
res$CL ## inferred cluster structure

## calculate similarity matrix
sim = calSim(t(res$E[,seq]))

## plot similarity matrix
x <- rep(1:n,times=n)
y <- rep(1:n,each=n)
z <- as.vector(sim)
levelplot(z~x*y,col.regions=rev(gray.colors(n^2)), xlab = "Subject ID",ylab = "Subject ID")
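
The example defines e.true but does not compare it with the inferred clusters. One possible check, sketched here on made-up label vectors so that it runs without fitting the model, is a cross-tabulation plus a pairwise agreement score (the Rand index). The function rand_index is a hypothetical base-R helper, not part of BCSub.

```r
## Hypothetical helper: share of subject pairs on which two labelings
## agree (both put the pair together, or both put it apart).
rand_index <- function(a, b) {
  stopifnot(length(a) == length(b))
  same_a <- outer(a, a, "==")
  same_b <- outer(b, b, "==")
  agree <- same_a == same_b
  mean(agree[upper.tri(agree)])
}

## Made-up labels standing in for e.true and res$CL.
truth <- c(1, 1, 1, 2, 2, 2)
est   <- c(2, 2, 2, 1, 1, 3)

table(truth, est)        ## cross-tabulation of true vs. inferred labels
rand_index(truth, est)   ## 1 would mean identical partitions
```

With the real fit, the same calls would be table(e.true, res$CL) and rand_index(e.true, res$CL). The score ignores the label values themselves, so it does not matter which cluster is called 1 or 2.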

BCSub documentation built on May 2, 2019, 2:49 a.m.