knitr::opts_chunk$set(tidy = FALSE, cache = TRUE, dev = "png",
                      message = FALSE, error = FALSE, warning = TRUE)

Introduction

The PBMExperiment class is the core structure defined in the upbm package for storing raw and normalized universal PBM data (see vignette("upbm-classes")). While this structure is useful for organizing and analyzing the data, tabular data is often much easier to work with when computing quick summary statistics or performing exploratory analysis.

suppressPackageStartupMessages(library("upbm"))

Since exploratory analysis of data stored in PBMExperiment objects is a fairly common task, we have defined a method for converting PBMExperiment assay data to tabular format. This is implemented as an extension to the broom::tidy function originally defined in the broom package.

In this vignette, we demonstrate the various uses of the broom::tidy function with PBMExperiment objects using the example HOXC9 dataset from the upbmData package.

HOXC9 Dataset

For details on the example HOXC9 dataset, see the quick start vignette in this package or the upbmData package vignette. Here, we will just use Alexa488 scans.

data(hoxc9alexa, package = "upbmData")
hoxc9alexa

Tidy Data

"Tidy data" has become a popular and powerful framework for organizing data during interactive analysis. In the tidy data framework, data is organized as a data.frame with each row corresponding to an individual observation or sample. Not only does the tidy data framework help keep data organized, but it also unlocks the powerful data parsing and visualization functions in the Tidyverse collection of packages.

To keep track of various probe and sample metadata compactly, uPBM data are not organized as tidy data. Instead, they are stored as PBMExperiment and PBMDesign objects which extend core Bioconductor data structures (see vignette("upbm-classes")). However, when performing interactive analysis, it can be useful to extract tidy data from the PBMExperiment objects.

The data for a single assay in PBMExperiment and SummarizedExperiment objects can be returned by passing the objects to broom::tidy.

broom::tidy(hoxc9alexa)

By default, the first assay in the object is returned as a wide tibble with columns corresponding to individual samples. Notice that the rowData are also included as columns in the tibble. Additionally, note that the tibble has far fewer rows than the original PBMExperiment object.

The default behavior of broom::tidy is to perform any probe filtering and sequence trimming defined in the PBMDesign associated with the PBMExperiment object. In this case, probe sequences were trimmed to 36 nucleotides and all background and control probes were excluded. This filtering and trimming can be turned off by specifying process = FALSE.
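To make "filtering" and "trimming" concrete, here is a minimal base R sketch on a toy probe table. It is not how upbm implements these steps: the column names (probe_id, sequence, is_control), the rule for identifying control probes, and the choice to keep the first 36 positions are all assumptions made for illustration. The actual rules come from the PBMDesign attached to the object.

```r
# Toy probe table. Column names and the control flag are hypothetical,
# not the columns used by upbm.
probes <- data.frame(
  probe_id   = c("p1", "p2", "ctrl1"),
  sequence   = c(strrep("ACGT", 15), strrep("TTGCA", 12), strrep("A", 60)),
  is_control = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Filtering: drop rows flagged as control probes.
kept <- probes[!probes$is_control, ]

# Trimming: keep 36 nucleotides of each remaining sequence
# (assumed here to be positions 1 through 36).
kept$sequence <- substr(kept$sequence, 1, 36)

nrow(kept)            # number of probes remaining after filtering
nchar(kept$sequence)  # length of each trimmed sequence
```

Filtering removes rows and trimming shortens sequences, which is why the processed output has fewer rows than the unprocessed one.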

broom::tidy(hoxc9alexa, process = FALSE)

The assay can also be specified.

broom::tidy(hoxc9alexa, assay = "back")

While returning a wide tibble maintains the original shape of the assay data, with tidy data we often prefer a "long" format in which each row corresponds to a single observation. We can return a long tibble by specifying long = TRUE.

broom::tidy(hoxc9alexa, assay = "back", long = TRUE)

When long = TRUE, the column names are placed in a cname column of the tibble and the assay values are included in a column matching the assay name (here, back). In addition to the column name, assay values, and rowData, in long format, colData is also included in the output.
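The relationship between the wide and long layouts can be sketched on a toy table with tidyr::pivot_longer. This is only an illustration of the reshaping, not the code used inside upbm. The probe_id, s1, and s2 names are made up, and only the cname and back output column names mirror the description above.

```r
library(tidyr)

# Toy wide table: one row per probe, one column per sample.
# "probe_id", "s1", and "s2" are hypothetical names for illustration.
wide <- data.frame(
  probe_id = c("p1", "p2", "p3"),
  s1 = c(10, 20, 30),
  s2 = c(11, 21, 31)
)

# Stack the sample columns: sample names go into "cname" and the
# values go into a column named after the assay ("back").
long <- pivot_longer(wide, cols = c(s1, s2),
                     names_to = "cname", values_to = "back")

# A quick per-sample summary, which the long layout makes straightforward.
med <- aggregate(back ~ cname, data = long, FUN = median)
med
```

Each probe now contributes one row per sample, so per-sample summaries and faceted plots can be written against a single value column.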

Tidying of multiple assays is also supported.

broom::tidy(hoxc9alexa, assay = c("fore", "back"))

When multiple assays are specified, the data will be returned as a long tibble.

While we have described how to call broom::tidy with PBMExperiment objects, more generally, the function can also be applied to any SummarizedExperiment object.

se <- as(hoxc9alexa, "SummarizedExperiment")
broom::tidy(se, long = TRUE)

Notice that when calling broom::tidy on the SummarizedExperiment object, background probes are not filtered. Similarly, probe sequences are not trimmed. These features are unique to PBMExperiment objects and are lost when converting the data to a SummarizedExperiment object.



pkimes/upbm documentation built on Oct. 17, 2020, 9:10 a.m.