knitr::opts_chunk$set(collapse = TRUE, 
                      comment = "#>")

Make sure you did all the previous steps explained in the README file. Install the qpcrviia7 package and the other required packages. Make sure your data file and annotation files are correct. You can use this example as a template to make your own script for analysis. Rmarkdown makes it possible to run the analysis in chunks to understand the individual steps better and can knit everything together in an html file.

Load libraries

library(qpcrviia7)
library(readxl)
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggthemes)
library(RColorBrewer)
library(Hmisc)

Prepare files

Import data

Data needs to be imported into the R session, we will use our own functions for this matter. read_qpcr() reads the raw excel file obtained from the Viia7 after qPCR and read_annotation reads our annotation files.

Import raw data with read_qpcr()

qpcr <- read_qpcr("data/2017-10-05 FAP-C2C12-Muscle qPCR validation MC.xls")

Import annotation with read_annotation()

ann_samples <- read_annotation("data/Annotations_samples_124839_technical_replicates.xlsx")
ann_genes <- read_annotation("data/Annotations_primers_124839_technical_replicates.xlsx")

Quality control

Technical replicates

Before merging our data and annotation files we first check the quality of our technical replicates.

List bad technical replicates with list_bad_tech_rep()

This function prints a list of all technical replicates above a certain standard deviation. The standard deviation is put to 0.4 but you change it inside the function under argument threshold.

list_bad_tech_rep(qpcr, threshold = 0.4)

Merge technical replicates with qc_tech_rep()

After setting the right threshold we merge all our technical replicates together by taking the average with qc_tech_rep(). The standard deviation is put to 0.4 but you change it inside the function under argument threshold.

qpcr <- qc_tech_rep(qpcr, threshold = 0.4)
#remove Avg from col names for further functions
colnames(qpcr) <- c("Target", "Sample", "CT", "SD_CT", "Tm1", "SD_Tm1", "Tm2", "SD_Tm2")

Merge and tidy files with join_and_clean_qpcr()

After loading in the files we can see that they are still untidy. We will merge all files into one big dataframe and change the undetermined values in the CT column to NA so that our data is ready for analysis.

join_and_clean_qpcr(qpcr)

CT

To start we plot all CT values to check the distribution and spot eventual outliers.

par(mfrow=c(1,1))
hist(qpcr$CT, main = "CT values")

Samples

We plot the CT values per Sample to check the distribution per Sample and to spot outliers. You should also be able to see the blanc sample or if the blanc was totally not contaminated you won't be able to see it. Bad Samples can be removed with remove_sample(). Check the minimal level of CT values to select it later in the function set_min_max_CT(). The value of the blanc sample can be used to set the maximum level of CT.

ggplot(qpcr, aes(Sample, CT)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90)) +
  ggtitle("CT values per sample")

Primers

We plot the CT values per Gene to check the distribution per Gene and to spot outliers. Bad Genes can be removed with remove_primer().

ggplot(qpcr, aes(Gene, CT)) +
  geom_boxplot() +
  theme(axis.text.x = element_text(angle = 90)) +
  ggtitle("CT values per primer")

Remove blanc sample with remove_blanc()

After we checked the blanc we don't need it anymore in further analysis so it needs to be removed. We use the function remove_blanc() for this.

qpcr <- remove_blanc(qpcr)
hist(qpcr$CT, main = "CT after removal blanc")

Separate endogenous from housekeeping genes with split_genes()

Now we need to seperate our data into a set of housekeeping genes and endogenous genes. Housekeeping genes will be saved in a dataframe called hkg and the endogenous genes in a dataframe called endog. We need to this to be able to run the qc_hkg() function on the housekeeping genes.

split_genes(qpcr)

Set base CT Value with set_min_max_CT()

Undetermined CT values are also biological information, with set_min_max_CT() we put the lowest CT values at 10 and the highest at 40. Every CT that is below 10 or undetermined will become 40.

endog <- set_min_max_CT(endog, CT = "CT", 10, 40)
hist(endog$CT, main = "CT after set baseline")

Melt curve analysis

Print a list of all genes with double meltcurve peaks with list_double_meltcurves()

list_double_meltcurves(endog)

Remove bad meltcurves with remove_bad_meltcurves()

remove_bad_meltcurves()

Housekeeping genes

Quality control per housekeeping gene with qc_hkg()

Now we will perform some quality control measures on our housekeeping genes. We recommend to use three housekeeping genes that show stable expression accross samples. the qc_hkg() function will check the distrubtion of CT values between samples of all housekeeping genes seperately. Here, we can see if the variability of any of the housekeeping genes is too high.

qc_hkg(hkg)

Remove irregular housekeeping genes with remove_primer()

In this example Rpl13a shows a binomial expression throughout the dataset. This is because we compare cell lines in vitro with skeletal muscle. Therefore, we remove this housekeeping gene with remove_primer()

hkg <- remove_primer(hkg, "Rpl13a")

Take geometric average over all housekeeping genes with calculate_mena_hkg()

When we assessed the quality of our housekeeping genes seperately it's time to take the average of these genes.

mean_hkg <- calculate_mean_hkg(hkg)

Quality control over average of housekeeping genes with qc_mean_hkg()

After taking the average we plot quality control measures again to check for outliers.

qc_mean_hkg(mean_hkg)

Remove outliers with remove_outliers()

Outliers are defined by boxplots, every sample that is outside 1.5 * IQR is regarded as an outliers and can skew analysis of that data. Therefore, we remove these with the function remove_outliers()

mean_hkg$CT_avg_hkg <- remove_outliers(mean_hkg$CT_avg_hkg)

Rerun remove_outliers() when new outliers pop up

In this example new outliers pop up after a first outlier removal. It is possible to rerun qc_mean_hkg() and remove_outliers() to remove additional outliers.

qc_mean_hkg(mean_hkg)
mean_hkg$CT_avg_hkg <- remove_outliers(mean_hkg$CT_avg_hkg)
shapiro.test(mean_hkg$CT_avg_hkg)

Delta CT

When quality control is done we can normalize over housekeeping genes. So far this package can calculate Delta CT and Relative expression compared to housekeeping genes. Functions calculating fold change will be incorporated soon.

Calculate Delta CT with calculate_DCT()

The function calculate_DCT calculates Delta CT and relative expression. Make sure to select your annotation columns that are important for plotting! Fill them in under the cols argument of calculate_DCT(). In this example we selected Cell_type, "Condition" and "Mouse".

calculate_DCT(endog, mean_hkg, cols = c("Cell_type", "Condition", "Mouse"))

Plot Delta CT

All genes with plot_scatter()

Make a summary plot of the Delta CT values of all genes and samples together. Therefore, you need to fill in the x.var, y.var and col.var variable in function plot_scatter(). As x.var we use Cell_type to plot the different cell types over the x axis, y.var is DCt to plot Delta CT values on the y axis and col.var is Condition this is because we are comparing between healthy and dystrohpic cells. The function automatically makes a plot per Gene in the data set.

plot_scatter(endog, x.var = "Cell_type", y.var = "DCt", col.var = "Condition", theme = 8) 

Select a gene with plot_scatter_per_gene()

You can also plot a specific gene with plot_scatter_per_gene(), you can select the gene under the gene argument.

plot_scatter_per_gene(endog, gene = "Dcn", x.var = "Cell_type", y.var = "DCt", col.var = "Condition", theme = 10) 

Plot relative expression

We can also plot the relative expression towards the housekeeping genes. We generate the same plots as under Delta CT. The only difference is that we use rel_expr as argument for y.var.

All genes with plot_scatter()

plot_scatter(endog, x.var = "Cell_type", y.var = "rel_expr", col.var = "Condition", theme = 8) +
  labs(title = "Scatterplot all genes", y = "Relative expression", x = "Cell type")

Select a gene with plot_scatter_per_gene()

plot_scatter_per_gene(endog, gene = "Dcn", x.var = "Cell_type", y.var = "rel_expr", col.var = "Condition", theme = 10) 

Session info

sessionInfo()


SCIL-leuven/qpcrviia7 documentation built on May 21, 2019, 2:31 a.m.