fit.CLM: Clustering of Linear Models Method


View source: R/function_fit_clm_1u_sigmaK_simple_new.R

Description

Fit a CLM model for cross-sectional data.

Usage

  fit.CLM(data.y, data.x, n.clst, n.start = 1)

Arguments

data.y

matrix of gene expression data, data.y[j, i] for sample i and gene j.

data.x

matrix of sample covariates, data.x[i, p] for sample i and covariate p.

n.clst

an integer, the number of clusters.

n.start

an integer used to get the starting value for the EM algorithm; defaults to 1.
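
The orientation of the two matrices is easy to get wrong, so here is a minimal sketch of the expected layout. It uses made-up toy data; the sizes, names, and the choice of an intercept plus a 0/1 group indicator are assumptions for illustration, not taken from the package.

```r
# Toy inputs illustrating the orientation described above (all values are made up).
set.seed(1)
n.gene   <- 6   # hypothetical number of genes (rows of data.y)
n.sample <- 4   # hypothetical number of samples (columns of data.y)

# data.y[j, i]: expression of gene j in sample i
data.y <- matrix(rnorm(n.gene * n.sample), nrow = n.gene, ncol = n.sample,
                 dimnames = list(paste0("gene", 1:n.gene), paste0("s", 1:n.sample)))

# data.x[i, p]: covariate p for sample i (assumed here: intercept and a 0/1 group indicator)
data.x <- cbind(intercept = 1, group = c(0, 0, 1, 1))
rownames(data.x) <- colnames(data.y)

# The samples must line up: columns of data.y correspond to rows of data.x.
stopifnot(ncol(data.y) == nrow(data.x))
```

Matrices shaped like this could then be passed as fit.CLM(data.y, data.x, n.clst), although a toy data set this small is only meant to show the layout.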

Details

This function implements the Clustering of Linear Models Method of Qin and Self (2006). This method clusters genes based on the estimated regression parameters that model the relation between gene expression and sample covariates.
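
In rough terms, the method can be read as a finite mixture of linear regressions. The sketch below is inferred from the components listed under Value and is not a statement of the exact model in the paper. It assumes that gene j belongs to cluster k with probability \pi_k, that x_i denotes row i of data.x, and that the variance is indexed by cluster:

```latex
y_{ji} \mid (\text{gene } j \in \text{cluster } k) = x_i^{\top} \zeta_k + \varepsilon_{ji},
\qquad \varepsilon_{ji} \sim N(0, \sigma_k^2),
\qquad \sum_{k=1}^{K} \pi_k = 1 .
```

Under this reading, \zeta_k, \pi_k and \sigma_k^2 correspond to the rows of zeta.hat and the entries of pi.hat and sigma2.hat, and K corresponds to n.clst.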

Value

u.hat

a matrix containing the cluster membership probability for each gene, whose row names are genes and column names are clusters.

theta.hat

a list with four components, zeta.hat, pi.hat, sigma2.hat and llh, described below:

zeta.hat

a matrix of estimated regression parameters, with one row for each cluster.

pi.hat

a vector with the relative frequency for each cluster.

sigma2.hat

a vector of variance parameters.

llh

log likelihood for the model.
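
The membership probabilities in u.hat are often reduced to a single cluster label per gene. The following is a minimal sketch on a made-up matrix; the values and gene names are hypothetical and are not output of fit.CLM.

```r
# Hypothetical membership matrix shaped like u.hat:
# one row per gene, one column per cluster, each row summing to 1.
u.hat <- rbind(gene1 = c(0.90, 0.05, 0.05),
               gene2 = c(0.10, 0.70, 0.20),
               gene3 = c(0.20, 0.30, 0.50),
               gene4 = c(0.80, 0.10, 0.10))
colnames(u.hat) <- paste0("cluster", 1:3)

# Hard assignment: index of the most probable cluster for each gene.
hard <- apply(u.hat, MARGIN = 1, FUN = which.max)

# Number of genes assigned to each cluster (empty clusters are kept as 0).
sizes <- table(factor(hard, levels = seq_len(ncol(u.hat))))
```

With a fitted object, the same call would be applied to the u.hat component of the result. The code in the Examples section obtains the same labels with order().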

Author(s)

Li-Xuan Qin

References

See Also

fit.CLMM, fit.CLMM.2

Examples

#Example 1
 #test data
  data(BreastCancer)
  data.y <- BreastCancer$normalizedData
  data.x <- BreastCancer$designMatrix
 #fit the model
  n.clst <- 9
  fit1   <- fit.CLM(data.y, data.x, n.clst)
  fit1.u <- apply(fit1$u.hat, MARGIN=1, FUN=order, decreasing=TRUE)[1,]
 #display the results
  index.IDC <- which(data.x[,2]==0)
  index.ILC <- which(data.x[,2]==1)
  mean.IDC  <- apply(data.y[,index.IDC], MARGIN=1, FUN=mean, na.rm=TRUE)
  mean.ILC  <- apply(data.y[,index.ILC], MARGIN=1, FUN=mean, na.rm=TRUE)

  color  <- rainbow(n.clst)
  par(mai=c(1,1,0.5,0.1),cex.axis=0.8, cex.lab=1,mgp=c(1.5,0.5,0))
  plot((mean.IDC+mean.ILC)/2, 
       (mean.IDC-mean.ILC), 
       xlab="(IDC mean + ILC mean)/2",
       ylab="IDC mean - ILC mean",
       pch=paste(fit1.u),
       col=color[fit1.u],
       main=paste("K=",n.clst))
 
## Not run: 
#Example 2
 #test data
  data(miRTargetGenes)
  data.y <- miRTargetGenes$normalizedData
  data.x <- miRTargetGenes$designMatrix
 #fit the model
  n.clst <- 9
  n.start<- 20
  fit2     <- fit.CLM(data.y, data.x, n.clst, n.start)
  fit2.u   <- apply(fit2$u.hat, MARGIN=1, FUN=order, decreasing=TRUE)[1,]
  fit2.u.o <- factor(fit2.u, levels=c(1,5,6,7,4,8,2,9,3), labels=1:9)
  library(limma)
  plot.y   <- lmFit(data.y, data.x)$coef %*% cbind(c(1,0,0,0),c(1,0,1,0),c(1,1,0,0),c(1,1,1,1))
  plot.x   <- 1:4
 #display the results
  color    <- rainbow(n.clst)
  par(mfrow=c(3,4),mai=c(0.35, 0.4, 0.4, 0.2), mgp=c(1.6,0.4,0), tck=-0.01, las=2)
  for(k in 1:n.clst){
   plot(plot.x, plot.y[1,], type="n", xaxt="n", ylim=range(plot.y), 
        xlab="", ylab="gene expression")
   axis(1, plot.x, c("Normal \n","Normal \n +miRNA","Tumor \n","Tumor \n +miRNA"), 
        las=1, cex.axis=1, mgp=c(1.5,1.2,0))
   title(paste("cluster", k))
   abline(h=0, lty=2)
   for(j in which(fit2.u.o==k)) points(plot.x, plot.y[j,], type="b", col=color[k])
  }

## End(Not run)

CORM documentation built on May 1, 2019, 8:09 p.m.