knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(vic3PCD) suppressMessages(library(DESeq2)) suppressMessages(library(dplyr)) suppressMessages(library(RColorBrewer)) suppressMessages(library(ggplot2)) suppressMessages(library(kableExtra))
When we get sequencing data our first question is there a difference in gene expression between Control and Treatment.
Before we dive into specic genes and probable molecular mechanisms we must evaluate the effect of the experimental conditions on gene expression.
In this step we look at expression counts from a sample perspective. Basically, we ask a question how different our samples are?
In this analysis we have 2 types of samples:
In Differential expression analysis we using: - 4 Control samples: P74-3, EP155, DZ-66 and P74-3+DZ-66 - 1 Test sample: P74-3+EP155
The SE object contains count table and associated metadata.
se <- readRDS(system.file("extdata", "se_vic3_2020.RData", package = "vic3PCD"))
To create your own metadata table and add it to SE object Create a data frame where row names correspond to sample names as in SE object.
tbl <- as.data.frame(colData(se)) kable(tbl, escape = F, linesep = "", booktabs = T, longtable = F, caption = "Summarized Experiment Metadata") %>% kable_styling(latex_options = c("scale_down", "repeat_header"), bootstrap_options = c("condensed"), font_size = 12, full_width = F)
Now that we have counts and metadata we proceed with Differential expression analysis.
dds <- DESeqDataSet(se, design = ~ condition) # Collapse replicas dds <- collapseReplicates(dds, dds$exset) # To make sure we have right category used as reference in the analysis dds$condition <- relevel(dds$condition, ref = "Control") # DESeq analysis dds <- DESeq(dds, test = "Wald", sfType = "poscounts", useT = FALSE, minReplicatesForReplace = 7) # Filter genes with more than 10 aligned reads keep <- rowSums(counts(dds)) >= 10 dds <- dds[keep,] # Normised counts rld <- rlog(dds, blind = F)
PCA allows to get an overall view on gene expression between samples. If PC1 and PC2 combined variance is over 50% and clusters are clearly separated we can proceed to following analysis.
# Manual color clr = c("dodgerblue3", "coral3") names(clr) <- unique(colData(rld)[,"condition"]) vic3_plotPCA(counts = assay(rld), group_vector = colData(rld)[,"condition"]) + scale_colour_manual(values=clr, name = "Condition:") + geom_text(label = colData(rld)[,"exset"], size = 3, colour = "black", fontface = "italic", hjust = 0, nudge_x = 3) + xlim(-20,60) + ylim(-25,25)
Raw counts distribution
vic3_boxplot(counts = assay(rld))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.