Related resources: Griffith lab course
library(airway)
data(gse)
gse  # uses gencode gene names
# strip the version suffix to obtain plain Ensembl (ENS) gene IDs
rownames(gse) = gsub("\\..*", "", rownames(gse))
library(org.Hs.eg.db)
ss = select(org.Hs.eg.db, keys=rownames(gse), keytype="ENSEMBL", columns=c("GENENAME", "SYMBOL"))
# drop duplicated or missing Ensembl IDs; logical indexing is safe even if none are found
dr = duplicated(ss[,1]) | is.na(ss[,1])
ss = ss[!dr,]
rownames(ss) = ss[,1]
ok = intersect(rownames(ss), rownames(gse))
gse = gse[ok,]
rownames(gse) = ss[rownames(gse),]$SYMBOL
# remove rows whose symbol is missing
gse = gse[!is.na(rownames(gse)),]
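The identifier cleanup above can be tried on a toy vector without loading any annotation package. The following is a minimal sketch: the version numbers are illustrative, and the lookup table `toy_map` with its symbols is a hypothetical placeholder, not real annotation.

```r
# Example versioned identifiers in the style of GENCODE row names
# (version numbers are illustrative)
ids <- c("ENSG00000000003.15", "ENSG00000000005.6", "ENSG00000000419.14")

# Drop the version suffix: everything from the first "." onward
stripped <- gsub("\\..*", "", ids)
print(stripped)

# Hypothetical lookup table; the symbols are placeholders, not real annotation
toy_map <- data.frame(
  ENSEMBL = c("ENSG00000000003", "ENSG00000000005", "ENSG00000000005", NA),
  SYMBOL  = c("geneA", "geneB", "geneB_alt", "geneC"),
  stringsAsFactors = FALSE
)

# Logical filtering is safe even when nothing needs to be dropped,
# whereas x[-which(cond), ] returns zero rows when cond is never TRUE
drop <- duplicated(toy_map$ENSEMBL) | is.na(toy_map$ENSEMBL)
clean <- toy_map[!drop, ]
rownames(clean) <- clean$ENSEMBL

# Keep only identifiers present in both the data and the lookup table
ok <- intersect(stripped, rownames(clean))
symbols <- clean[ok, "SYMBOL"]
print(symbols)
```

The third identifier has no entry in the toy table, so it is dropped by the `intersect` step, just as unmapped genes are dropped from `gse`.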
We will work with a reduction of the information in the experiment: the sums of the columns of the assay matrix. We want to present this information so that it can be taken in quickly and in context. We could just use a table
data.frame(colsum=colSums(assay(gse)), donor=gse$donor, trt=gse$condition)
but the numbers are unwieldy, and patterns are hard to extract from the text.
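One way to make such a table easier to scan is to express the totals in millions and sort the rows. The sketch below uses a made-up data frame `toy` (the values and donor labels are invented for illustration and are not the values from `gse`).

```r
# Made-up totals for illustration only; not the values from gse
toy <- data.frame(
  colsum = c(21100805, 19298584, 26145537, 15688246),
  donor  = c("D1", "D1", "D2", "D2"),
  trt    = c("Untreated", "Dexamethasone", "Untreated", "Dexamethasone")
)

# Express totals in millions, rounded to one decimal place
toy$millions <- round(toy$colsum / 1e6, 1)

# Order rows from largest to smallest total
toy <- toy[order(toy$colsum, decreasing = TRUE), ]
print(toy[, c("donor", "trt", "millions")])
```

Even in this form the table remains text, so a plot is usually the quicker route to seeing patterns.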
Questions
What is the interpretation of colSums(assay(gse))?
Execute the following. [When we go live, the results of code chunks in problems will be suppressed.]
plot(colSums(assay(gse)))
text(1:8, colSums(assay(gse)),
     labels=paste0(gse$donor, "\n", gse$names, "\n", substr(gse$condition,1,3)), cex=.6)
Which of the following is not true?
The difference in average read counts for cells N052611 and N061011 was less than 120000 reads
Try
plot(colSums(assay(gse)), pch=" ", xlim=c(0,9), axes=FALSE, xlab=" ")
text(colSums(assay(gse)), labels=paste0(gse$donor, "\n", gse$condition),
     col=as.numeric(gse$donor), font=2)
axis(2)
Two of the plotted data elements have defects. What graphical parameters can be used to achieve a better rendering?
Easiest answer: ylim
plot(colSums(assay(gse)), pch=" ", xlim=c(0,9), axes=FALSE, xlab=" ", ylim=c(1.25e7,3.2e7))
text(colSums(assay(gse)), labels=paste0(gse$donor, "\n", gse$condition),
     col=as.numeric(gse$donor), font=2)
axis(2)
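Rather than hard-coding the limits, they can be derived from the data. The sketch below is one possible approach, using `extendrange()` from grDevices on made-up totals (the vector `totals` and the labels are hypothetical stand-ins for `colSums(assay(gse))` and the donor labels).

```r
# Made-up column totals standing in for colSums(assay(gse))
totals <- c(1.5e7, 2.1e7, 3.1e7, 1.9e7)

# Widen the data range by 10% on each side so that labels at the
# extremes are not clipped by the plot region
yl <- extendrange(totals, f = 0.10)
print(yl)

pdf(NULL)  # draw to a null device so no file is written
plot(totals, pch = " ", xlim = c(0, 5), axes = FALSE, xlab = " ", ylim = yl)
text(totals, labels = c("a", "b", "c", "d"), font = 2)
axis(2)
dev.off()
```

The fraction `f` is a judgment call: multi-line labels need more padding than single-line ones.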
Here we'll work with some RNA-seq data developed in the Cancer Genome Atlas.
suppressMessages({
  library(curatedTCGAData)
  ex = curatedTCGAData(c("PRAD", "ACC"), assay="RNAseq2GeneNorm", dry.run=FALSE)
})
ex
cs = lapply(experiments(ex), function(x) colSums(assay(x)))
plot(density(cs[[2]]))
lines(density(cs[[1]]), lty=2)
# lty pairs each legend label with its line type (solid first, then dashed)
legend(2.3e7, 4e-7, legend=c("type 1", "type 2"), lty=c(1,2))
Question: What should you substitute for 'type 1' in the legend so that the tumor type on which the solid density trace is based appears in the legend? Answer should be either 'PRAD' or 'ACC'.
Question: Which axis records the depth of sequencing for the various experiments? x or y?
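When two density traces are overlaid, a legend is only informative if each label is paired with its line type. The following is a minimal sketch on simulated totals; the group names `groupA` and `groupB` and all values are hypothetical and unrelated to the TCGA data.

```r
set.seed(1)
# Simulated per-sample totals for two hypothetical groups
cs_toy <- list(groupA = rnorm(50, mean = 2.0e7, sd = 1e6),
               groupB = rnorm(80, mean = 2.2e7, sd = 1e6))

dA <- density(cs_toy$groupA)
dB <- density(cs_toy$groupB)

pdf(NULL)  # draw to a null device so no file is written
# Set axis limits from both traces so neither is cut off
plot(dB, xlim = range(dA$x, dB$x), ylim = range(dA$y, dB$y),
     main = "toy densities")
lines(dA, lty = 2)
# The order of the labels must match the order of the line types
legend("topright", legend = c("groupB", "groupA"), lty = c(1, 2))
dev.off()
```

Using a keyword position such as `"topright"` avoids having to guess legend coordinates on the data scale.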
suppressPackageStartupMessages({
  library(graph)
  library(SingleR)  # in recent Bioconductor releases MonacoImmuneData() is provided by celldex
})
mon = MonacoImmuneData()
labs = unique(c(mon$label.main, mon$label.fine))
myg = new("graphNEL", nodes=labs, edgemode="directed")
myg2 = addEdge(mon$label.main, mon$label.fine, myg)
myg2
Given a set of nodes, we want to extract the subgraph made up of those nodes and the nodes they point to.
sliceGraph = function(g, n) {
  a = lapply(n, function(x) adj(g, x))
  nn = unique(c(n, unlist(a)))
  subGraph(nn, g)
}
myg3 = sliceGraph(myg2, unique(mon$label.main)[1:4])
library(igraph)
ig = igraph.from.graphNEL(myg3)
plot(ig, vertex.size=0, label.cex=.05, label.dist=1)
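The slicing logic can be illustrated in base R without the graph package. The sketch below represents a directed graph as a named adjacency list; `adj_toy`, `slice_toy`, and the node names are hypothetical, and this is an analogue of `sliceGraph`, not its implementation.

```r
# Toy directed graph as a named adjacency list; node names are hypothetical
adj_toy <- list(
  A  = c("a1", "a2"),
  B  = c("b1"),
  C  = character(0),
  a1 = character(0),
  a2 = character(0),
  b1 = character(0)
)

# Return the selected nodes plus every node they point to,
# keeping only the edges whose endpoints are both retained
slice_toy <- function(adjlist, n) {
  keep <- unique(c(n, unlist(adjlist[n], use.names = FALSE)))
  lapply(adjlist[keep], function(targets) targets[targets %in% keep])
}

s <- slice_toy(adj_toy, c("A", "C"))
print(names(s))
print(s$A)
```

Selecting `A` and `C` retains `a1` and `a2` as neighbours of `A`, while `B` and `b1` are left out.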