Overview of the tidySummarizedExperiment package

Lifecycle:maturing

Brings SummarizedExperiment to the tidyverse!

website: stemangiola.github.io/tidySummarizedExperiment/

Please also have a look at

library(knitr)
knitr::opts_chunk$set(warning=FALSE, message=FALSE)

Introduction

tidySummarizedExperiment provides a bridge between Bioconductor SummarizedExperiment [@morgan2020summarized] and the tidyverse [@wickham2019welcome]. It creates an invisible layer that enables viewing the Bioconductor SummarizedExperiment object as a tidyverse tibble, and provides SummarizedExperiment-compatible dplyr, tidyr, ggplot and plotly functions. This allows users to get the best of both Bioconductor and tidyverse worlds.

Functions/utilities available

SummarizedExperiment-compatible Functions | Description ------------ | ------------- all | After all tidySummarizedExperiment is a SummarizedExperiment object, just better

tidyverse Packages | Description ------------ | ------------- dplyr | Almost all dplyr APIs like for any tibble tidyr | Almost all tidyr APIs like for any tibble ggplot2 | ggplot like for any tibble plotly | plot_ly like for any tibble

Utilities | Description ------------ | ------------- tidy | Add tidySummarizedExperiment invisible layer over a SummarizedExperiment object as_tibble | Convert cell-wise information to a tbl_df

Installation

if (!requireNamespace("BiocManager", quietly=TRUE)) {
      install.packages("BiocManager")
  }

BiocManager::install("tidySummarizedExperiment")

From Github (development)

devtools::install_github("stemangiola/tidySummarizedExperiment")

Load libraries used in the examples.

library(ggplot2)
library(tidySummarizedExperiment)

Create tidySummarizedExperiment, the best of both worlds!

This is a SummarizedExperiment object but it is evaluated as a tibble. So it is fully compatible both with SummarizedExperiment and tidyverse APIs.

pasilla_tidy <- tidySummarizedExperiment::pasilla %>%
    tidy()

It looks like a tibble

pasilla_tidy

But it is a SummarizedExperiment object after all

Assays(pasilla_tidy)

Tidyverse commands

We can use tidyverse commands to explore the tidy SummarizedExperiment object.

We can use slice to choose rows by position, for example to choose the first row.

pasilla_tidy %>%
    slice(1)

We can use filter to choose rows by criteria.

pasilla_tidy %>%
    filter(condition == "untreated")

We can use select to choose columns.

pasilla_tidy %>%
    select(sample)

We can use count to count how many rows we have for each sample.

pasilla_tidy %>%
    count(sample)

We can use distinct to see what distinct sample information we have.

pasilla_tidy %>%
    distinct(sample, condition, type)

We could use rename to rename a column. For example, to modify the type column name.

pasilla_tidy %>%
    rename(sequencing=type)

We could use mutate to create a column. For example, we could create a new type column that contains single and paired instead of single_end and paired_end.

pasilla_tidy %>%
    mutate(type=gsub("_end", "", type))

We could use unite to combine multiple columns into a single column.

pasilla_tidy %>%
    unite("group", c(condition, type))

We can also combine commands with the tidyverse pipe %>%.

For example, we could combine group_by and summarise to get the total counts for each sample.

pasilla_tidy %>%
    group_by(sample) %>%
    summarise(total_counts=sum(counts))

We could combine group_by, mutate and filter to get the transcripts with mean count > 0.

pasilla_tidy %>%
    group_by(transcript) %>%
    mutate(mean_count=mean(counts)) %>%
    filter(mean_count > 0)

Plotting

my_theme <-
    list(
        scale_fill_brewer(palette="Set1"),
        scale_color_brewer(palette="Set1"),
        theme_bw() +
            theme(
                panel.border=element_blank(),
                axis.line=element_line(),
                panel.grid.major=element_line(size=0.2),
                panel.grid.minor=element_line(size=0.1),
                text=element_text(size=12),
                legend.position="bottom",
                aspect.ratio=1,
                strip.background=element_blank(),
                axis.title.x=element_text(margin=margin(t=10, r=10, b=10, l=10)),
                axis.title.y=element_text(margin=margin(t=10, r=10, b=10, l=10))
            )
    )

We can treat pasilla_tidy as a normal tibble for plotting.

Here we plot the distribution of counts per sample.

pasilla_tidy %>%
    tidySummarizedExperiment::ggplot(aes(counts + 1, group=sample, color=`type`)) +
    geom_density() +
    scale_x_log10() +
    my_theme

Session Info

sessionInfo()

References



Try the tidySummarizedExperiment package in your browser

Any scripts or data that you put into this service are public.

tidySummarizedExperiment documentation built on Nov. 8, 2020, 8:22 p.m.