\newpage

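# Load the BBRIC count file: 8 annotation columns, then one (count, rpkm) column pair per library
count <- file.path(opt$count_dir, basename(opt$count))
bbric.df <- read.table(count, header = TRUE, sep = "\t", check.names = FALSE)
nbcoltotal <- ncol(bbric.df)
nblib <- (nbcoltotal - 8) / 2
# Raw counts are in columns 9, 11, ...; RPKM values are in columns 10, 12, ...
rawvalues <- bbric.df[, seq(9, nbcoltotal, by = 2)]
# fixed = TRUE matches the literal suffix (an unescaped "." is a regex wildcard)
colnames(rawvalues) <- gsub(".count", "", colnames(rawvalues), fixed = TRUE)
rpkmvalues <- bbric.df[, seq(10, nbcoltotal, by = 2)]
colnames(rpkmvalues) <- gsub(".rpkm", "", colnames(rpkmvalues), fixed = TRUE)
# log(x + 1) keeps zero values finite
lograwvalues <- log(rawvalues + 1)
logrpkmvalues <- log(rpkmvalues + 1)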

# Load the design file: column 1 = library name, column 2 = modality (factor level)
if (!is.null(opt$design)) {
    design <- file.path(opt$design_dir, basename(opt$design))
    design.df <- read.table(design, header = TRUE, sep = "\t", check.names = FALSE)
    # as.character(): index the color vectors by name, never by factor code or position
    lib_names <- as.character(design.df[, 1])
    Factor <- as.character(design.df[, 2])
    moda_names <- unique(Factor)
    testcol <- colorRampPalette(c('blue', 'red', 'green', 'orange'))
    # One color per modality, named by modality
    colorModa <- setNames(testcol(length(moda_names)), moda_names)
    # One color per library, named by library, in design-file order
    colorLib <- setNames(colorModa[Factor], lib_names)
}
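
The column arithmetic above relies on a specific file layout: 8 annotation columns followed by one `.count` and one `.rpkm` column per library. As a minimal sketch of that assumption, the made-up table below (the annotation names `a1` to `a8`, the library names `lib1` and `lib2`, and all values are hypothetical) goes through the same selection steps:

```r
# Toy table: 8 annotation columns (hypothetical names), then (count, rpkm) pairs.
toy <- data.frame(
  matrix("x", nrow = 3, ncol = 8, dimnames = list(NULL, paste0("a", 1:8))),
  lib1.count = c(0, 12, 5),  lib1.rpkm = c(0.0, 3.1, 1.2),
  lib2.count = c(15, 9, 0),  lib2.rpkm = c(4.0, 2.2, 0.0),
  check.names = FALSE
)

n_col <- ncol(toy)            # 8 annotation columns + 2 columns per library
n_lib <- (n_col - 8) / 2      # number of libraries

# Odd positions from 9 hold the counts, even positions from 10 hold the RPKM.
toy_counts <- toy[, seq(9, n_col, by = 2)]
toy_rpkm   <- toy[, seq(10, n_col, by = 2)]

# Strip the literal suffixes so both tables share the library names.
colnames(toy_counts) <- sub(".count", "", colnames(toy_counts), fixed = TRUE)
colnames(toy_rpkm)   <- sub(".rpkm", "", colnames(toy_rpkm), fixed = TRUE)

print(colnames(toy_counts))
print(colnames(toy_rpkm))
```

If the file had a different number of annotation columns, `n_lib` would no longer be an integer, which is a cheap sanity check to run before going further.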

Cluster dendrogram representing the hierarchical clustering of libraries (log raw values)

Input Data = RNASeq count expression file.

After a log(x + 1) transformation, the Pearson correlation between libraries is computed, and the resulting distance (1 - correlation) is submitted to an Ascending Hierarchical Clustering (AHC) (Ward).

The aim of the AHC is to cluster the libraries based on their count values.

Expected results:

Application:

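# Distance between libraries = 1 - Pearson correlation
dist.lame.cor <- 1 - cor(lograwvalues, use = "pairwise.complete.obs")
# Ward linkage, as stated in the description (hclust() defaults to complete linkage)
hc.lame.cor <- hclust(as.dist(dist.lame.cor), method = "ward.D2")
dendCol <- as.dendrogram(hc.lame.cor)

if (!is.null(opt$design)) {
    # labels_colors() is provided by the dendextend package.
    # Assumes the design rows are in the same order as the count columns.
    labels_colors(dendCol) <- colorModa[Factor][order.dendrogram(dendCol)]
}
plot(dendCol, main = "Cluster Dendrogram (log raw values)")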
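
The clustering steps can be tried on simulated data. The sketch below is only an illustration: the four library names, the simulated values, and the choice of `k = 2` are hypothetical. Ward linkage is requested explicitly with `method = "ward.D2"`, because `hclust()` falls back to complete linkage when `method` is omitted.

```r
set.seed(1)

# Two hypothetical conditions, each with two replicate libraries (50 features).
base_a <- rnorm(50)
base_b <- rnorm(50)
toy_log <- cbind(
  A1 = base_a + rnorm(50, sd = 0.1),
  A2 = base_a + rnorm(50, sd = 0.1),
  B1 = base_b + rnorm(50, sd = 0.1),
  B2 = base_b + rnorm(50, sd = 0.1)
)

# Distance between libraries = 1 - Pearson correlation.
toy_dist <- as.dist(1 - cor(toy_log, method = "pearson"))

# Ascending hierarchical clustering with Ward linkage.
toy_hc <- hclust(toy_dist, method = "ward.D2")

# Cut the tree into two groups to check which libraries cluster together.
groups <- cutree(toy_hc, k = 2)
print(groups)
```

Replicates of the same condition are highly correlated, so their distance is close to 0 and they merge first; libraries from different conditions are nearly uncorrelated, so their distance is close to 1.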

\newpage

Cluster dendrogram representing the hierarchical clustering of libraries (log rpkm values)

Input Data = RNASeq count expression file.

After a log(x + 1) transformation, the Pearson correlation between libraries is computed, and the resulting distance (1 - correlation) is submitted to an Ascending Hierarchical Clustering (AHC) (Ward).

The aim of the AHC is to cluster the libraries based on their RPKM values.

Expected results:

Application:

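# Distance between libraries = 1 - Pearson correlation
dist.lame.cor <- 1 - cor(logrpkmvalues, use = "pairwise.complete.obs")
# Ward linkage, as stated in the description (hclust() defaults to complete linkage)
hc.lame.cor <- hclust(as.dist(dist.lame.cor), method = "ward.D2")
dendCol <- as.dendrogram(hc.lame.cor)

if (!is.null(opt$design)) {
    # labels_colors() is provided by the dendextend package.
    # Assumes the design rows are in the same order as the RPKM columns.
    labels_colors(dendCol) <- colorModa[Factor][order.dendrogram(dendCol)]
}
plot(dendCol, main = "Cluster Dendrogram (log rpkm values)")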

\newpage

Barplot representing, for each library, the number of features (genes / RNA) with at least 10 mapped reads

For each library, the number of features with at least 10 mapped reads is displayed. The last bar gives the number of features that reach this threshold in at least one library.

The threshold of 10 reads is empirical: we consider it the minimal number of reads required to call a feature expressed.

Expected results:

Application:

# TRUE where a feature has at least 10 reads in a library
sup10 <- rawvalues[, 1:nblib, drop = FALSE] > 9
# Number of such features per library (NA counts are ignored)
vecteursup10 <- unname(colSums(sup10, na.rm = TRUE))
# Features that reach the threshold in at least one library
vecteurunion <- which(rowSums(sup10, na.rm = TRUE) > 0)
vecteursup10 <- append(vecteursup10, length(vecteurunion))
# Total number of features, repeated once per bar
vecteurall <- rep(nrow(rawvalues), nblib + 1)

if (!is.null(opt$design)) {
    barplot(vecteursup10, names.arg=c(colnames(rawvalues[1:nblib]), "At least in 1 lib"), las=2, cex.names=0.7, col=c(colorLib, "black"), ylim=c(0,max(vecteurall)))
} else {
    barplot(vecteursup10, names.arg=c(colnames(rawvalues[1:nblib]), "At least in 1 lib"), las=2, cex.names=0.7, ylim=c(0,max(vecteurall)))
}

todisplay = data.frame(c(colnames(rawvalues[1:nblib]), "At least in 1 lib"),vecteursup10, vecteurall)
names(todisplay) = c('lib', 'nb features with at least 10 reads', 'nb features total')
todisplay[,1:3]
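
To make the counting rule concrete, here is a minimal sketch on a made-up count table (the library names `lib1` and `lib2` and the values are hypothetical). It computes the per-library totals and the "at least in 1 lib" total with vectorized comparisons:

```r
# Four hypothetical features, two hypothetical libraries.
toy_counts <- data.frame(
  lib1 = c(0, 12, 5, 30),
  lib2 = c(15, 9, 0, 10)
)

# Logical matrix: TRUE where a feature has at least 10 reads.
expressed <- toy_counts >= 10

# Per-library number of features passing the threshold.
per_lib <- colSums(expressed)

# Number of features passing the threshold in at least one library.
in_any <- sum(rowSums(expressed) > 0)

print(per_lib)
print(in_any)
```

Here each library has 2 features at or above the threshold, but they are not the same ones, so 3 of the 4 features pass in at least one library.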

\newpage

Box-plot (log raw values)

Input Data = RNASeq count expression file.

For each library, a box-plot of the log-transformed count values, log(count + 1), is displayed.

Expected results:

Application:

if (!is.null(opt$design)) {
    boxplot(lograwvalues, las=2, cex.axis=0.7, main="BoxPlot of log (raw counts)", col=colorLib)
} else {
    boxplot(lograwvalues, las=2, cex.axis=0.7, main="BoxPlot of log (raw counts)")
}
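
As a minimal sketch of what each box summarizes, the following computes the box-plot statistics on a made-up count table without drawing anything (the library names `lib1` and `lib2` and the values are hypothetical). The `+ 1` offset keeps features with zero reads in the plot, since `log(0)` is `-Inf`.

```r
# Five hypothetical features, two hypothetical libraries.
toy_counts <- data.frame(
  lib1 = c(0, 3, 12, 40, 150),
  lib2 = c(1, 0, 25, 60, 900)
)
toy_log <- log(toy_counts + 1)

# plot = FALSE returns the statistics instead of drawing the boxes.
box_stats <- boxplot(toy_log, plot = FALSE)$stats
rownames(box_stats) <- c("lower whisker", "lower hinge", "median",
                         "upper hinge", "upper whisker")
colnames(box_stats) <- colnames(toy_log)

print(box_stats)
```

Comparing the medians and hinges across columns is a quick way to spot a library whose overall count distribution is shifted relative to the others.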

\newpage

Box-plot (log rpkm values)

Input Data = RNASeq count expression file.

For each library, a box-plot of the log-transformed RPKM values, log(rpkm + 1), is displayed.

Expected results:

Application:

if (!is.null(opt$design)) {
    boxplot(logrpkmvalues, las=2, cex.axis=0.7, main="BoxPlot of log (rpkm values)", col=colorLib)
} else {
    boxplot(logrpkmvalues, las=2, cex.axis=0.7, main="BoxPlot of log (rpkm values)")
}

\newpage

Summary of raw values

summary(rawvalues)

\newpage

Summary of rpkm values

summary(rpkmvalues)


jos4uke/qc4rnaseq documentation built on May 19, 2019, 8:45 p.m.