fit.CLMM.2: Clustering of Linear Mixed Models Method
In LXQin/CORM: The Clustering of Regression Models Method

Description Usage Arguments Details Value Author(s) References See Also Examples

Fit a CLMM model for replicated time course data allowing for two sets of time points among biological or technical replicates. Missing value are allowed.

1	fit.CLMM.2(data.y1,data.x1,data.z1,data.y2,data.x2,data.z2,n.clst, n.run = 1)

`data.y1`	a three dimensional array of gene expression data, data.y1[j, i, t] for gene j, sample i, and time point t. Missing values should be indicated by "NA". And at least one case not missing in each pair of observations.
`data.x1`	a three dimensional array of fixed effects (common for all genes), data.x1[i, t, p] for sample i, time point t, and covariate p.
`data.z1`	a three dimensional array of random effects (common for all genes), data.z1[i, t, q] for sample i, time point t, and covariate p.
`data.y2`	a three dimensional array of gene expression data, data.y2[j, i, t] for gene j, sample i, and time point t. Missing values should be indicated by "NA". And at least one case not missing in each pair of observations.
`data.x2`	a three dimensional array of fixed effects (common for all genes), data.x2[i, t, p] for sample i, time point t, and covariate p.
`data.z2`	a three dimensional array of random effects (common for all genes), data.z2[i, t, q] for sample i, time point t, and covariate p.
`n.clst`	an integer, number of clusters.
`n.run`	an integer used to get the starting value for the EM algorithm.

This function implements the Clustering of Linear Mixed Models Method of Qin and Self (2006).

`u.hat`	a matrix containing the cluster membership for the genes.
`b.hat`	an array containing the estimated random effects.
`theta.hat`	a list comprised of five components: zeta.hat, pi.hat, D.hat, sigma2.hat and llh. They are described as below:
`zeta.hat`	a matrix of the estimated fixed effects with one row for each cluster.
`pi.hat`	a vector of the relative frequency for each cluster.
`D.hat`	the estimated random effects variances for the clusters.
`sigma2.hat`	the estimated measurement error variances for the clusters.
`llh`	log likelihood for the model.

Li-Xuan Qin qinl@mskcc.org

Li-Xuan Qin and Steven G. Self (2006). The clustering of regression models method with applications in gene expression data. Biometrics 62, 526-533.
Li-Xuan Qin, Linda Breeden and Steven G. Self (2014). Finding gene clusters for a replicated time course study. BMC Res Notes 7:60.

fit.CLM, fit.CLMM, fit.CLMM.2

## Not run: 
#test data
 data(YeastCellCycle2)
 data.y1 <- YeastCellCycle2$normalizedData.WT
 data.x1 <- YeastCellCycle2$designMatrix.WT
 data.y2 <- YeastCellCycle2$normalizedData.SM
 data.x2 <- YeastCellCycle2$designMatrix.SM
 n.clst  <- 8
 fit1	<- fit.CLMM.2(data.y1=data.y1,data.x1=data.x1,data.z1=data.x1,
                      data.y2=data.y2,data.x2=data.x2,data.z2=data.x2,
					  n.clst=n.clst)
 fit1.u  <- apply(fit1$u.hat, MARGIN=1, FUN=order, decreasing=TRUE)[1,]
 zeta.fitted <- fit1$theta.hat$zeta.hat
 profile.WT  <- YeastCellCycle2$designMatrix.WT[1,,] %*% t(zeta.fitted)
 profile.SM  <- YeastCellCycle2$designMatrix.SM[1,,] %*% t(zeta.fitted)
# remove bad time points for WTs and SMs
 n.time  <- 25
 time.WT <- (1:n.time)[-22]
 time.SM <- (1:n.time)[-c(6,9,12)]
 n.rpl.WT<- length(time.WT)
 n.rpl.SM<- length(time.SM)
 n.gene.short<-dim(YeastCellCycle2$normalizedData.WT)[1]
# gene-specific profile: observed profiles averaged over replicates
 data.WT.mean  <- matrix(0, nrow=n.gene.short, ncol=n.rpl.WT)
 data.SM.mean	<- matrix(0, nrow=n.gene.short, ncol=n.rpl.SM)
 for(j in 1:n.gene.short){
  data.WT.mean[j,] <- (YeastCellCycle2$normalizedData.WT[j,1,]+
                       YeastCellCycle2$normalizedData.WT[j,2,])/2
  data.SM.mean[j,] <- (YeastCellCycle2$normalizedData.SM[j,1,]+
                       YeastCellCycle2$normalizedData.SM[j,2,])/2
 }
# plot observed profiles by cluster -- wild type
 col.panel=rainbow(8)
 par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
 for(k in 1:n.clst){
  plot(5*(time.WT-1), profile.WT[,k], type="l", col=col.panel[k], ylim=c(-2,2),
       xlab="Time", ylab="Expression Value", main=paste("WT: cluster",k))
  i.plot <- (1:n.gene.short)[fit1.u==k]
  for(j in i.plot) lines(5*(time.WT-1), data.WT.mean[j,], lty=1)
  lines(5*(time.WT-1), profile.WT[,k], col=col.panel[k], lwd=2)
  text(125, -1.9, pos=2, paste(length(i.plot)," genes"))
}
# plot observed profiles by cluster -- single mutant
 par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
 for(k in 1:n.clst){
  plot(5*(time.SM-1), profile.SM[,k], type="l", col=col.panel[k], ylim=c(-2,2),
       xlab="Time", ylab="Expression Value", main=paste("SM: cluster",k))
  i.plot <- (1:n.gene.short)[fit1.u==k]
  for(j in i.plot) lines(5*(time.SM-1), data.SM.mean[j,], lty=1)
  lines(5*(time.SM-1), profile.SM[,k], col=col.panel[k], lwd=2)
  text(125, -1.9, pos=2, paste(length(i.plot)," genes"))
}
# plot fitted profiles by cluster
 par(mai=c(0.3,0.25,0.2,0.05),mfrow=c(3,3))
 for(k in 1:n.clst){ 
  plot(5*(time.WT-1), profile.WT[,k], type="l", ylim=c(-2,2), 
       xlab="Time", ylab="Expression Value", lwd=2)
  title(paste("Cluster", k))
  lines(5*(time.SM-1), profile.SM[,k], lty=3, lwd=2)
  if(k==1) legend(60, 2, c("WT", "SM"), lty=1:2, cex=0.8)
 }

## End(Not run)