knitr::opts_chunk$set(tidy = FALSE, cache = TRUE, dev = "png",
                      message = FALSE, error = FALSE, warning = TRUE)

Introduction

For an introduction to the upbm package, please see the quick start vignette (vignette("upbm-quickstart")). Here, we provide details on the two classes introduced in the package, PBMDesign and PBMExperiment. The PBMDesign class acts as a container for protein binding microarray (PBM) design information, while the PBMExperiment class serves as the container for PBM experimental data.

suppressPackageStartupMessages(library("upbm"))

PBMDesign objects for standard universial PBM (uPBM) designs are included in the upbmAux package. Additionally, example PBMExperiment objects are included in the upbmData package containing data from uPBM experiments performed using the HOXC9 transcription factor. We use these objects to help illustrate the simple structure of each class.

suppressPackageStartupMessages(library("upbmAux"))
suppressPackageStartupMessages(library("upbmData"))

PBMDesign

The PBMDesign class is used to organize the probe information for PBM experiments. PBMDesign objects are composed of 3 components (slots):

We can take a look at the pbm_8x60k_v1 design object from the upbmAux package to get a better sense of these components.

data(pbm_8x60k_v1, package = "upbmAux")
pbm_8x60k_v1

From the object description, we can see that the design contains r nrow(design(pbm_8x60k_v1)) rows and r ncol(design(pbm_8x60k_v1)) columns. Each row is a probe in the PBM array. Two columns, "probeID" and "Sequence" must be defined for the object to be a valid PBMDesign. The other two columns, "Row" and "Column" are optional information which are needed to perform spatial adjustment with upbm::spatiallyAdjust.

head(design(pbm_8x60k_v1))

In addition to dsDNA probes which measure protein binding, PBMs typically include control probes designed by Agilent or the user. These probes should be excluded for certain normalization and downstream analysis steps (e.g. Cy3 normalization and spatial adjustment).

The probeFilter slot is a list of functions which are used to distinguish target probes from control probes. The list must be named with names matching columns in the design shown above. We can get and set the probeFilter slot by calling upbm::probeFilter.

names(probeFilter(pbm_8x60k_v1))

Here, the PBMDesign object includes a single probe filtering rule associated with the "probeID" column.

probeFilter(pbm_8x60k_v1)$probeID

The rule must be a function that takes the specified column ("probeID") as input and returns a logical vector of the same length. Only rows which return TRUE for all rules are kept for downstream analysis. In the pbm_8x60k_v1 design, the single filter checks the probe IDs and returns TRUE for the subset prefixed with dBr_, corresponding to the subset of de Bruijn sequence probes on the array. Filtering can be performed using upbm::pbmFilterProbes.

pbm_8x60k_filtered <- pbmFilterProbes(pbm_8x60k_v1)
pbm_8x60k_filtered

Notice that the returned PBMDesign has fewer probes than the original design. We can explicitly verify that this smaller number of probes matches the number of probes containing the dBr prefix in the "probeID" column.

table(grepl("^dBr", design(pbm_8x60k_v1)$probeID))

After filtering has been performed, the probeFilter slot of the returned PBMDesign is cleared to prevent iterative application of the filtering critera.

Finally, the probeTrim slot is an optional numeric vector of length 2. Again, this can be accessed and set using the upbm::probeTrim function.

probeTrim(pbm_8x60k_filtered)

Revisiting the design of the probes, notice that the probe sequences all end in a common 24nt primer sequence.

head(design(pbm_8x60k_filtered)$Sequence)

The probeTrim slot specifies the start and end positions in the sequences that should be kept to map K-mers to probes. By default, this is set to (1, 36) in the uPBM designs in upbmAux to exclude the common 24nt primer sequence at the end of all probes. While the values may be modified to include some or all of the primer sequence, this has not been extensively evaluated. Probe trimming can be performed using upbm::pbmTrimProbes.

pbm_8x60k_trimmed <- pbmTrimProbes(pbm_8x60k_filtered)
head(design(pbm_8x60k_trimmed))

Modifying Objects

All of the slots described above can be modified or updated. For example, we can add an additional (redundant) rule to the probeFilter list of the example PBMDesign.

probeFilter(pbm_8x60k_v1) <- c(probeFilter(pbm_8x60k_v1),
                               Sequence = function(x) { x != "#N/A" })

We can also easily change the trimming boundaries.

probeTrim(pbm_8x60k_filtered) <- c(1, 50)
pbm_8x60k_filtered

New Objects

A new PBMDesign object can be created by passing a data.frame of probe designs to the upbm::PBMDesign constructor. As stated above, at minimum, the data.frame must contain two columns, "probeID" and "Sequence".

new_design <- data.frame(probeID = LETTERS[1:3],
                         Sequence = rep("AAA", 3))
pbmDesign <- PBMDesign(new_design)
pbmDesign

The probeFilter and probeTrim slots can also be specified directly to the upbm::PBMDesign constructor or modified after constructing the PBMDesign object, as shown above.

PBMExperiment

Probe-level PBM data is stored using the PBMExperiment class, an extension of the SummarizedExperiment class with additional slots to support working with PBMDesign objects.

An example PBMExperiment object containing data from a series of HOXC9 experiments is included in the upbmData package. We will only look at the PBMExperiment containing Alexa488 scans. However, the probe-level data from Cy3 scans are similarly stored and included as a separate PBMExperiment object in the package.

data(hoxc9alexa, package = "upbmData")
hoxc9alexa

A SummarizedExperiment object is made up of three parts: 1. assays (primary data, e.g. probe intensities), 2. column metadata, and 3. row metadata. Here, rows correspond to probes and columns correspond to scans. For more details on the SummarizedExperiment class, see the corresponding package vignette.

In addition to these three components, PBMExperiment objects also include probeFilter, probeTrim and probeCols slots. The probeFilter and probeTrim slots of the PBMExperiment class are identical to the slots described above as part of the PBMDesign class. The final probeCols slot is a character vector specifying the columns in the row metadata corresponding to the design of a PBMDesign object. Rather than storing the design as a separate slot in PBMExperiment objects, the design is kept in the row metadata and the corresponding columns are recorded in probeCols.

As with PBMDesign objects, these slots can be both accessed and modified. The hoxc9alexa object was constructed with the pbm_8x60k_v1 probe design introduced above. We can see that the probeTrim and probeFilter values match what we saw above.

probeFilter(hoxc9alexa)
probeTrim(hoxc9alexa)

Additionally, we can see that the probeCols columns of the row metadata match the design slot of pbm_8x60k_v1.

probeCols(hoxc9alexa)
head(rowData(hoxc9alexa)[, probeCols(hoxc9alexa)])

Finally, the corresponding PBMDesign object of a PBMExperiment can be extracted by passing the PBMExperiment object to upbm::PBMDesign.

PBMDesign(hoxc9alexa)

We can check that this is identical to the PBMDesign from above.

identical(PBMDesign(hoxc9alexa), pbm_8x60k_v1)

The PBMDesign associated values can be quickly modified or added using the associated setter function.

PBMDesign(hoxc9alexa) <- pbm_8x60k_v1

This can be useful when the design information needs to be added after the PBMExperiment object has been created. However, this should be performed with caution as all prior design associated values will be overwritten.

As with PBMDesign, calls to upbm::pbmFilterProbes and upbm::pbmTrimProbes can be used to filter and trim probes in the PBMExperiment object.

pbmFilterProbes(hoxc9alexa)

It is also important to note that whenever broom::tidy is called to convert assay data to a tidy format, upbm::pbmFilterProbes and upbm::pbmTrimProbes run unless process = FALSE is specified.

nrow(broom::tidy(hoxc9alexa))
nrow(broom::tidy(hoxc9alexa, process = FALSE))

gpr2PBMExperiment

The easiest way to create a new PBMExperiment object is by reading scan data from GPR files using upbm::gpr2PBMExperiment. Given a table of GPR file metadata, including paths to the files, the function will return the parsed GPR data as a PBMExperiment object.

In the case of the example dataset, the table used to read the GPR files is also included in the upbmData package.

data(hoxc9table, package = "upbmData")
head(hoxc9table, 5)

At a minimum, the full path to each GPR file must be specified in a column named gpr. The data.frame should also include any relevant metadata about the scan and sample (e.g. scan parameters or properties of the assayed protein).

Both Alexa488 and Cy3 scans for the example dataset in the upbmData package are provided in hoxc9table. We will subset to just the Alexa488 scans.

hoxc9table_alexa <- dplyr::filter(hoxc9table, type == "Alexa")

In addition to the sample table the corresponding PBMDesign should also be specified when calling upbm::gpr2PBMExperiment. While not run here (since raw GPR files are not included with the package), the following call was made to generate the PBMExperiment object included in the upbmData package.

hoxc9alexa <- gpr2PBMExperiment(scans = hoxc9table_alexa,
                                probes = pbm_8x60k_v1)

We can see that metadata included in the sample table are stored in the column metadata of the returned PBMExperiment object.

colData(hoxc9alexa)[1:5, ]

For PBMExperiment objects created from GPR files, probe-level foreground intensities are stored in the "fore" assay. By default, (but optionally) a second assay, "back" (background) is also read in from the GPRs with the background intensities for each probe and sample.

assay(hoxc9alexa, "fore")[1:5, 1:5]
assay(hoxc9alexa, "back")[1:5, 1:5]

New Objects

Alternatively, a new PBMExperiment object can be constructed from existing SummarizedExperiment and PBMDesign objects. To demonstrate this, we can coerce hoxc9alexa to a SummarizedExperiment object, stripping away all PBMExperiment-specific information.

se <- as(hoxc9alexa, "SummarizedExperiment")
se

Notice that the probe annotations are still kept in the row metadata. We can additionally remove this if we would like.

rowData(se) <- NULL
se

With the SummarizedExperiment object, we can now create a new PBMExperiment object using the PBMExperiment(..) constructor. As with gpr2PBMExperiment(..), a PBMDesign can be specified as an optional parameter.

PBMExperiment(se)

PBMExperiment(se, pbmDesign = pbm_8x60k_v1)


pkimes/upbm documentation built on Oct. 17, 2020, 9:10 a.m.