knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Here, you will learn how to scale, inspect and visualize the gene expression data that you have parsed in the previous vignette (Load and Normalize the qPCR Output).
library(fluidgr) library(magrittr) library(dplyr) library(stringr) library(ggplot2) library(forcats) library(scales) library(devtools)
We provide the fluidigm_expr
object, that contains normalized gene expression measurement.
This object is exactly the same at the one produced in the vignette (Load and Normalize the qPCR Output).
You can load it by just typing fluidigm_expr
:
fluidigm_expr
When we explore medium sized gene expression datasets, such as the one produced by qPCR, we often just concentrate on statistical testing and statistical relevance. But in this kind of datasets, you can observe many gene expression patterns and behaviour by Exploratory Data Analysis, when you iteratively scale and plot your data.
Scaling and plotting are tightly connected, because plots can help you figure out how scaling transforms your expression data; and appropriate scaling can help you detect useful expression patterns in your plots.
Make the variables in your data explicit.
Often medium sized qPCR gene expression measurememtns are multivatiate. If you want to visualizalize them effectively, make all variables explicit. Also, keep in mind which variable you suspect that determines the response in your dataset and eventually what is the question that you are trying to anwser.
For example in the sample dataset we are measuring the expression of four genes in 4 developmental stages in 5 species.
The column target_name
records the "gene" variable, while the other two variables, developmental stage and species, are hidden in the sample_name
variable. We can make them explicit.
The sample_name
variable is encoded as "J-N1R1":
# make variable explicit fluidigm_expr <- fluidigm_expr %>% # make species explicit mutate(species = str_split_fixed(string = sample_name, pattern = "-", n = 2)[, 1]) %>% # make stage explicit mutate(stage = str_sub(string = sample_name, start = 3, end = 4)) %>% # make replicate explicit mutate(replicate = str_sub(string = sample_name, start = 5, end = 6))
When you want want to explore gene expression, scaling is not always necessary, and often you can use, explore, visualize and communicate directly the normalized expression values; stored in the expression
column of the fluidigm_expr
dataset.
Indeed scaling is not always necessary, but, together with exploratory plots, scaling can help you detect patterns in your data.
Exploratory plots can help understand the effect that scaling has on expression data. This is why we explain scaling and visualisation of expression data together.
scale_fluidigm()
functionFluidgr provides the scale_fluidigm()
function, that scales expression values:
scaled_expression
),scaled_01_expression
),according to the groups provided in the argument .group
.
This function takes as input the output of normalize()
and returns the same dataset with the two extra column scaled_expression
.
scaled_dat <- fluidigm_expr %>% scale_fluidigm(.group = target_name) # inspect the output scaled_dat %>% select(scaled_expression, scaled_01_expression) %>% summary()
You can write your own scaling function, becaus scale_fluidigm()
function is just a soft wrapper around dplyr's group_by()
and mutate()
. If you want to try different scaling methods, you can replicate and tweak the output of scale_fluidigm()
with:
fluidigm_expr %>% group_by(target_name) %>% mutate(scaled_expression = scale(expression))
This should provide to you a good basis and enough freedom to explore and visualize your data as you prefer.
Fluidigmr does not provide plotting function, because all the functions that you need to plot this kind of data are already available in ggplot2. Indeed the dataset output of is already in a tidy/gathered format.
Here we provide some ideas of how you can explore, plot and visualize your Fluidigm data.
A lineplot can be informative directly on normalized data, without scaling.
You can use facets to display multivariate data.
p <- fluidigm_expr %>% ggplot(aes(x = stage, y = expression)) + # plot each expression value into a point geom_point(size = 2, col = "#DA614D", alpha = .8) + # it might be complicated to add a line that connect means # across stages, # because stages are encoded as a categorical variable, # we must turn them into numeric stat_summary(aes(x = stage %>% as.factor(.) %>% as.numeric(.)), fun.y = mean, geom="line", size = 1.2, alpha = .7, linejoin = "round") + # implement facetting with independent y values # for each genes, since expression values are # not comparable across genes facet_grid(target_name ~ species, scales = "free_y") + theme_bw() p %>% print()
Some patterns in your data might be more evident if you plot them in log scale.
p_log <- p + scale_y_log10() p_log %>% print()
Also using heatmap with ggplot2
p_heat <- fluidigm_expr %>% ggplot(aes(x = stage, y = target_name, fill = expression)) + geom_tile() + facet_grid(. ~ species) + scale_fill_viridis_c() p_heat %>% print()
Sometime heatmap don't convey muich invormation if you don't scale your data.
For example we can scale every gene on z-score to make them comparable.
scaled_gene <- fluidigm_expr %>% group_by(target_name) %>% mutate(scaled_expression = expression %>% rescale(from = range(.), to = c(0,1)))
Check if the heatmap has improved
remeber to map the colours to the new scaled_expression
values
p_heat <- scaled_gene %>% ggplot(aes(x = stage, y = species, fill = scaled_expression)) + geom_tile() + facet_grid(. ~ target_name) + scale_fill_viridis_c() p_heat %>% print()
We can try to set more groups.
scaled_dat <- fluidigm_expr %>% group_by(target_name, species) %>% mutate(scaled_expression = expression %>% rescale(from = range(.), to = c(0,1)))
Check if the heatmap has improved
remeber to map the colours to the new scaled_expression
values
p_heat <- scaled_dat %>% ggplot(aes(x = stage, y = species, fill = scaled_expression)) + geom_tile() + facet_grid(. ~ target_name) + scale_fill_viridis_c() p_heat %>% print()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.