fit.CLMM.2: Clustering of Linear Mixed Models Method

Description Usage Arguments Details Value Author(s) References See Also Examples

Description

Fit a CLMM model for replicated time course data allowing for two sets of time points among biological or technical replicates. Missing value are allowed.

Usage

1
  fit.CLMM.2(data.y1,data.x1,data.z1,data.y2,data.x2,data.z2,n.clst, n.run = 1)

Arguments

data.y1

a three dimensional array of gene expression data, data.y1[j, i, t] for gene j, sample i, and time point t. Missing values should be indicated by "NA". And at least one case not missing in each pair of observations.

data.x1

a three dimensional array of fixed effects (common for all genes), data.x1[i, t, p] for sample i, time point t, and covariate p.

data.z1

a three dimensional array of random effects (common for all genes), data.z1[i, t, q] for sample i, time point t, and covariate p.

data.y2

a three dimensional array of gene expression data, data.y2[j, i, t] for gene j, sample i, and time point t. Missing values should be indicated by "NA". And at least one case not missing in each pair of observations.

data.x2

a three dimensional array of fixed effects (common for all genes), data.x2[i, t, p] for sample i, time point t, and covariate p.

data.z2

a three dimensional array of random effects (common for all genes), data.z2[i, t, q] for sample i, time point t, and covariate p.

n.clst

an integer, number of clusters.

n.run

an integer used to get the starting value for the EM algorithm.

Details

This function implements the Clustering of Linear Mixed Models Method of Qin and Self (2006).

Value

u.hat

a matrix containing the cluster membership for the genes.

b.hat

an array containing the estimated random effects.

theta.hat

a list comprised of five components: zeta.hat, pi.hat, D.hat, sigma2.hat and llh. They are described as below:

zeta.hat

a matrix of the estimated fixed effects with one row for each cluster.

pi.hat

a vector of the relative frequency for each cluster.

D.hat

the estimated random effects variances for the clusters.

sigma2.hat

the estimated measurement error variances for the clusters.

llh

log likelihood for the model.

Author(s)

Li-Xuan Qin qinl@mskcc.org

References

See Also

fit.CLM, fit.CLMM, fit.CLMM.2

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
## Not run: 
#test data
 data(YeastCellCycle2)
 data.y1 <- YeastCellCycle2$normalizedData.WT
 data.x1 <- YeastCellCycle2$designMatrix.WT
 data.y2 <- YeastCellCycle2$normalizedData.SM
 data.x2 <- YeastCellCycle2$designMatrix.SM
 n.clst  <- 8
 fit1	<- fit.CLMM.2(data.y1=data.y1,data.x1=data.x1,data.z1=data.x1,
                      data.y2=data.y2,data.x2=data.x2,data.z2=data.x2,
					  n.clst=n.clst)
 fit1.u  <- apply(fit1$u.hat, MARGIN=1, FUN=order, decreasing=TRUE)[1,]
 zeta.fitted <- fit1$theta.hat$zeta.hat
 profile.WT  <- YeastCellCycle2$designMatrix.WT[1,,] %*% t(zeta.fitted)
 profile.SM  <- YeastCellCycle2$designMatrix.SM[1,,] %*% t(zeta.fitted)
# remove bad time points for WTs and SMs
 n.time  <- 25
 time.WT <- (1:n.time)[-22]
 time.SM <- (1:n.time)[-c(6,9,12)]
 n.rpl.WT<- length(time.WT)
 n.rpl.SM<- length(time.SM)
 n.gene.short<-dim(YeastCellCycle2$normalizedData.WT)[1]
# gene-specific profile: observed profiles averaged over replicates
 data.WT.mean  <- matrix(0, nrow=n.gene.short, ncol=n.rpl.WT)
 data.SM.mean	<- matrix(0, nrow=n.gene.short, ncol=n.rpl.SM)
 for(j in 1:n.gene.short){
  data.WT.mean[j,] <- (YeastCellCycle2$normalizedData.WT[j,1,]+
                       YeastCellCycle2$normalizedData.WT[j,2,])/2
  data.SM.mean[j,] <- (YeastCellCycle2$normalizedData.SM[j,1,]+
                       YeastCellCycle2$normalizedData.SM[j,2,])/2
 }
# plot observed profiles by cluster -- wild type
 col.panel=rainbow(8)
 par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
 for(k in 1:n.clst){
  plot(5*(time.WT-1), profile.WT[,k], type="l", col=col.panel[k], ylim=c(-2,2),
       xlab="Time", ylab="Expression Value", main=paste("WT: cluster",k))
  i.plot <- (1:n.gene.short)[fit1.u==k]
  for(j in i.plot) lines(5*(time.WT-1), data.WT.mean[j,], lty=1)
  lines(5*(time.WT-1), profile.WT[,k], col=col.panel[k], lwd=2)
  text(125, -1.9, pos=2, paste(length(i.plot)," genes"))
}
# plot observed profiles by cluster -- single mutant
 par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
 for(k in 1:n.clst){
  plot(5*(time.SM-1), profile.SM[,k], type="l", col=col.panel[k], ylim=c(-2,2),
       xlab="Time", ylab="Expression Value", main=paste("SM: cluster",k))
  i.plot <- (1:n.gene.short)[fit1.u==k]
  for(j in i.plot) lines(5*(time.SM-1), data.SM.mean[j,], lty=1)
  lines(5*(time.SM-1), profile.SM[,k], col=col.panel[k], lwd=2)
  text(125, -1.9, pos=2, paste(length(i.plot)," genes"))
}
# plot fitted profiles by cluster
 par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
 for(k in 1:n.clst){ 
  plot(5*(time.WT-1), profile.WT[,k], type="l", ylim=c(-2,2), 
       xlab="Time", ylab="Expression Value", lwd=2)
  title(paste("Cluster", k))
  lines(5*(time.SM-1), profile.SM[,k], lty=3, lwd=2)
  if(k==1) legend(60, 2, c("WT", "SM"), lty=1:2, cex=0.8)
 }

## End(Not run)

LXQin/CORM documentation built on May 7, 2019, 7:58 a.m.