View source: R/function_fit_clm_1u_sigmaK_simple_new.R
Fit a CLM model for cross-sectional data.
fit.CLM(data.y, data.x, n.clst, n.start = 1)
data.y: matrix of gene expression data; data.y[j, i] is the value for gene j in sample i.
data.x: matrix of sample covariates; data.x[i, p] is the value of covariate p for sample i.
n.clst: an integer, the number of clusters.
n.start: an integer used to get the starting value for the EM algorithm.
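Because data.y holds samples in columns and data.x holds samples in rows, the two matrices must agree on the number of samples. The sketch below checks this before a fit; the helper name check.CLM.input and the toy matrices are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): verify that both inputs
# describe the same number of samples before calling fit.CLM().
check.CLM.input <- function(data.y, data.x) {
  stopifnot(is.matrix(data.y), is.matrix(data.x))
  # columns of data.y and rows of data.x both index samples
  if (ncol(data.y) != nrow(data.x)) {
    stop("data.y has ", ncol(data.y), " samples (columns) but data.x has ",
         nrow(data.x), " samples (rows)")
  }
  invisible(TRUE)
}

# Toy inputs (assumed for illustration): 5 genes, 6 samples,
# an intercept column plus one binary covariate
set.seed(1)
toy.y <- matrix(rnorm(5 * 6), nrow = 5)
toy.x <- cbind(1, rep(c(0, 1), each = 3))
check.CLM.input(toy.y, toy.x)
```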
This function implements the Clustering of Linear Models Method of Qin and Self (2006). This method clusters genes based on the estimated regression parameters that model the relation between gene expression and sample covariates.
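To make "regression parameters that model the relation between gene expression and sample covariates" concrete, the sketch below computes ordinary least-squares coefficients gene by gene on simulated data. This is only an illustration of the quantities involved, not the package's estimation procedure: fit.CLM estimates parameters per cluster with an EM algorithm. All object names here (data.x.toy, data.y.toy, beta.hat) are assumptions made for the example.

```r
set.seed(1)
n.gene <- 6
n.sample <- 8

# Toy design matrix: intercept plus a two-group indicator (samples in rows)
data.x.toy <- cbind(intercept = 1, group = rep(c(0, 1), each = n.sample / 2))

# Toy expression matrix: genes in rows, samples in columns
data.y.toy <- matrix(rnorm(n.gene * n.sample), nrow = n.gene,
                     dimnames = list(paste0("gene", 1:n.gene), NULL))

# Ordinary least squares for all genes at once:
# solve (X'X) B = X'Y', then transpose so each row holds one gene's coefficients
beta.hat <- t(solve(crossprod(data.x.toy),
                    crossprod(data.x.toy, t(data.y.toy))))
dim(beta.hat)  # one row per gene, one column per covariate
```

Genes whose rows of beta.hat look alike respond to the covariates in a similar way, which is the kind of similarity the method clusters on.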
u.hat: a matrix containing the cluster membership probability for each gene; rows are genes and columns are clusters.
theta.hat: a list with four components:
  zeta.hat: a matrix of estimated regression parameters, with one row for each cluster.
  pi.hat: a vector with the relative frequency of each cluster.
  sigma2.hat: a vector of variance parameters.
  llh: the log-likelihood of the fitted model.
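Since u.hat holds membership probabilities, a hard assignment can be read off by taking the most probable cluster for each gene. A minimal sketch follows, using a made-up matrix u.hat.mock in place of a real fit's u.hat.

```r
# Hypothetical membership matrix standing in for fit$u.hat (each row sums to 1)
u.hat.mock <- matrix(c(0.7, 0.2, 0.1,
                       0.1, 0.8, 0.1,
                       0.3, 0.3, 0.4,
                       0.6, 0.3, 0.1),
                     ncol = 3, byrow = TRUE,
                     dimnames = list(paste0("gene", 1:4),
                                     paste0("cluster", 1:3)))

# Index of the most probable cluster for each gene
hard <- apply(u.hat.mock, MARGIN = 1, FUN = which.max)

# Cluster sizes, keeping clusters that received no genes
table(factor(hard, levels = seq_len(ncol(u.hat.mock))))
```

This gives the same assignment as the apply(..., FUN=order, decreasing=TRUE)[1,] idiom used in the examples, without sorting each row in full.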
Li-Xuan Qin qinl@mskcc.org
Li-Xuan Qin and Steven G. Self (2006). The clustering of regression models method with applications in gene expression data. Biometrics 62, 526-533.
Li-Xuan Qin (2008). An integrative analysis of microRNA and mRNA expression - a case study. Cancer Informatics 6, 369-379.
#Example 1
#test data
data(BreastCancer)
data.y <- BreastCancer$normalizedData
data.x <- BreastCancer$designMatrix
#fit the model
n.clst <- 9
fit1 <- fit.CLM(data.y, data.x, n.clst)
fit1.u <- apply(fit1$u.hat, MARGIN=1, FUN=order, decreasing=TRUE)[1,]
#display the results
index.IDC <- which(data.x[,2]==0)
index.ILC <- which(data.x[,2]==1)
mean.IDC <- apply(data.y[,index.IDC], MARGIN=1, FUN=mean, na.rm=TRUE)
mean.ILC <- apply(data.y[,index.ILC], MARGIN=1, FUN=mean, na.rm=TRUE)
color <- rainbow(n.clst)
par(mai=c(1,1,0.5,0.1),cex.axis=0.8, cex.lab=1,mgp=c(1.5,0.5,0))
plot((mean.IDC+mean.ILC)/2,
     (mean.IDC-mean.ILC),
     xlab="(IDC mean + ILC mean)/2",
     ylab="IDC mean - ILC mean",
     pch=paste(fit1.u),
     col=color[fit1.u],
     main=paste("K=",n.clst))
## Not run:
#Example 2
#test data
data(miRTargetGenes)
data.y <- miRTargetGenes$normalizedData
data.x <- miRTargetGenes$designMatrix
#fit the model
n.clst <- 9
n.start <- 20
fit2 <- fit.CLM(data.y, data.x, n.clst, n.start)
fit2.u <- apply(fit2$u.hat, MARGIN=1, FUN=order, decreasing=TRUE)[1,]
fit2.u.o <- factor(fit2.u, levels=c(1,5,6,7,4,8,2,9,3), labels=1:9)
library(limma)
plot.y <- lmFit(data.y, data.x)$coef %*% cbind(c(1,0,0,0),c(1,0,1,0),c(1,1,0,0),c(1,1,1,1))
plot.x <- 1:4
#display the results
color <- rainbow(n.clst)
par(mfrow=c(3,4),mai=c(0.35, 0.4, 0.4, 0.2), mgp=c(1.6,0.4,0), tck=-0.01, las=2)
for(k in 1:n.clst){
  plot(plot.x, plot.y[1,], type="n", xaxt="n", ylim=range(plot.y),
       xlab="", ylab="gene expression")
  axis(1, plot.x, c("Normal \n","Normal \n +miRNA","Tumor \n","Tumor \n +miRNA"),
       las=1, cex.axis=1, mgp=c(1.5,1.2,0))
  title(paste("cluster", k))
  abline(h=0, lty=2)
  for(j in which(fit2.u.o==k)) points(plot.x, plot.y[j,], type="b", col=color[k])
}
## End(Not run)