suppressPackageStartupMessages({
    library(RaggedExperiment)
    library(GenomicRanges)
})

Introduction

The RaggedExperiment package provides a flexible data representation for copy number, mutation, and other ragged array schema for genomic location data. It aims to provide a framework for a set of samples that have differing numbers of genomic ranges.

The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset. The row and column dimensions of the RaggedExperiment correspond to the number of ranges in the entire dataset and the number of samples represented in the data, respectively.

Installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("RaggedExperiment")

Loading the package:

library(RaggedExperiment)

Constructing a RaggedExperiment object

We start with a couple of GRanges objects, each representing an individual sample:

sample1 <- GRanges(
    c(GENEA = "chr1:1-10:-", GENEB = "chr2:15-18:+"),
    score = 3:4)
sample2 <- GRanges(
    c(GENEC = "chr1:1-10:-", GENED = "chr2:11-18:+"),
    score = 1:2)

Include column data colData to describe the samples:

colDat <- DataFrame(id = 1:2)

Using GRanges objects

ragexp <- RaggedExperiment(sample1 = sample1,
                           sample2 = sample2,
                           colData = colDat)
ragexp
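As a quick check of the dimension rule stated in the introduction, the following sketch rebuilds the two samples and compares dim() with the total number of ranges and the number of samples. The names n_ranges, n_samples, and dims are illustrative only.

```r
library(GenomicRanges)
library(RaggedExperiment)

# Rebuild the two example samples so that this block runs on its own
sample1 <- GRanges(
    c(GENEA = "chr1:1-10:-", GENEB = "chr2:15-18:+"),
    score = 3:4)
sample2 <- GRanges(
    c(GENEC = "chr1:1-10:-", GENED = "chr2:11-18:+"),
    score = 1:2)
grl <- GRangesList(sample1 = sample1, sample2 = sample2)
ragexp <- RaggedExperiment(grl)

n_ranges <- sum(lengths(grl))   # total number of ranges over all samples
n_samples <- length(grl)        # number of samples
dims <- dim(ragexp)             # rows = ranges, columns = samples
dims
```

Here the two samples contribute two ranges each, so the object has four rows and two columns.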

Using a GRangesList instance

grl <- GRangesList(sample1 = sample1, sample2 = sample2)
RaggedExperiment(grl, colData = colDat)

Using a list of GRanges

rangeList <- list(sample1 = sample1, sample2 = sample2)
RaggedExperiment(rangeList, colData = colDat)

Using a List of GRanges with metadata

Note: In cases where a SimpleGenomicRangesList is provided along with accompanying metadata (accessed by mcols), the metadata is used as the colData for the RaggedExperiment.

grList <- List(sample1 = sample1, sample2 = sample2)
mcols(grList) <- colDat
RaggedExperiment(grList)

Accessors

Range data

rowRanges(ragexp)

Dimension names

dimnames(ragexp)

colData

colData(ragexp)

Subsetting

by dimension

Subsetting a RaggedExperiment is akin to subsetting a matrix object. Rows correspond to genomic ranges and columns to samples or specimens. It is possible to subset using integer, character, and logical indices.
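A minimal sketch of the three index types follows, assuming the example object built from sample1 and sample2 earlier in this vignette. The result names (by_integer, by_character, by_logical) are illustrative.

```r
library(GenomicRanges)
library(RaggedExperiment)

# Rebuild the example object so that this block runs on its own
sample1 <- GRanges(
    c(GENEA = "chr1:1-10:-", GENEB = "chr2:15-18:+"),
    score = 3:4)
sample2 <- GRanges(
    c(GENEC = "chr1:1-10:-", GENED = "chr2:11-18:+"),
    score = 1:2)
ragexp <- RaggedExperiment(sample1 = sample1, sample2 = sample2)

by_integer <- ragexp[1:2, ]                          # first two ranges
by_character <- ragexp[, "sample2"]                  # one sample, by name
by_logical <- ragexp[c(TRUE, FALSE, TRUE, FALSE), ]  # ranges 1 and 3
```

As with a matrix, the first index selects rows (ranges) and the second selects columns (samples).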

by genomic ranges

The overlapsAny and subsetByOverlaps functions are available for RaggedExperiment objects. Please see the corresponding documentation in RaggedExperiment and GenomicRanges.
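The sketch below shows both functions on the example data. The query range chr1:1-5:- is an assumption chosen for illustration; it overlaps the chr1 range found in each sample.

```r
library(GenomicRanges)
library(RaggedExperiment)

# Rebuild the example object so that this block runs on its own
sample1 <- GRanges(
    c(GENEA = "chr1:1-10:-", GENEB = "chr2:15-18:+"),
    score = 3:4)
sample2 <- GRanges(
    c(GENEC = "chr1:1-10:-", GENED = "chr2:11-18:+"),
    score = 1:2)
ragexp <- RaggedExperiment(sample1 = sample1, sample2 = sample2)

query <- GRanges("chr1:1-5:-")           # illustrative query region
hits <- overlapsAny(ragexp, query)       # one logical value per range (row)
sub <- subsetByOverlaps(ragexp, query)   # keep only the overlapping ranges
```

The logical vector returned by overlapsAny can also be used directly as a row index, as in ragexp[hits, ].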

*Assay functions

The RaggedExperiment package provides several functions for representing ranged data in a rectangular matrix via the *Assay methods.

sparseAssay

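The most straightforward matrix representation of a RaggedExperiment has one row per range and one column per sample, so its length equals the product of the number of ranges and the number of samples. Entries for ranges that do not belong to a given sample are NA.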

dim(ragexp)
Reduce(`*`, dim(ragexp))
sparseAssay(ragexp)
length(sparseAssay(ragexp))

compactAssay

Samples with identical ranges are placed in the same row. Non-disjoint ranges are not collapsed.

compactAssay(ragexp)
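To make the difference from sparseAssay concrete, the following sketch compares the two matrices for the example data. Both samples contain the range chr1:1-10:-, so that range should occupy a single row in the compact form. The names sparse and compact are illustrative.

```r
library(GenomicRanges)
library(RaggedExperiment)

# Rebuild the example object so that this block runs on its own
sample1 <- GRanges(
    c(GENEA = "chr1:1-10:-", GENEB = "chr2:15-18:+"),
    score = 3:4)
sample2 <- GRanges(
    c(GENEC = "chr1:1-10:-", GENED = "chr2:11-18:+"),
    score = 1:2)
ragexp <- RaggedExperiment(sample1 = sample1, sample2 = sample2)

sparse <- sparseAssay(ragexp)    # one row per range in the object
compact <- compactAssay(ragexp)  # one row per distinct range

nrow(sparse)
nrow(compact)
```

The overlapping but non-identical ranges chr2:15-18:+ and chr2:11-18:+ remain in separate rows, which is what "not collapsed" means here.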

disjoinAssay

This function returns a matrix of disjoint ranges across all samples. Elements of the matrix are summarized by applying the simplify functional argument to assay values of overlapping ranges.

disjoinAssay(ragexp, simplify = mean)
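The rows of this matrix can be understood by disjoining the pooled ranges by hand. The sketch below uses disjoin() from GenomicRanges on the two example samples. It is only an illustration of the idea and makes no claim about how disjoinAssay is implemented internally.

```r
library(GenomicRanges)
library(RaggedExperiment)

# Rebuild the example object so that this block runs on its own
sample1 <- GRanges(
    c(GENEA = "chr1:1-10:-", GENEB = "chr2:15-18:+"),
    score = 3:4)
sample2 <- GRanges(
    c(GENEC = "chr1:1-10:-", GENED = "chr2:11-18:+"),
    score = 1:2)
ragexp <- RaggedExperiment(sample1 = sample1, sample2 = sample2)

# Pool the ranges of both samples and split them into non-overlapping pieces
pieces <- disjoin(c(sample1, sample2))
pieces

# The disjoin assay has one row for each of these pieces
dj <- disjoinAssay(ragexp, simplify = mean)
nrow(dj)
```

Because chr2:15-18:+ lies inside chr2:11-18:+, the chr2 ranges are split into 11-14 and 15-18, giving three disjoint ranges in total.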

qreduceAssay

The qreduceAssay function takes a query parameter, a set of ranges that defines the rows of the resulting matrix. The returned matrix has dimensions length(query) by ncol(x). Each element contains the assay values for the i-th query range and the j-th sample, summarized according to the simplify functional argument.

First we define our summary function that calculates a weighted average score per query range. Note that there are three arguments to this function. Please see the documentation ?qreduceAssay for more details.

weightedmean <- function(scores, ranges, qranges)
    sum(scores * width(ranges)) / sum(width(ranges))
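To see what this summary computes, the function can be called directly on hand-built inputs. The scores, ranges, and query below are hypothetical values chosen so that the arithmetic is easy to follow; they are not taken from a qreduceAssay call.

```r
library(GenomicRanges)

weightedmean <- function(scores, ranges, qranges)
    sum(scores * width(ranges)) / sum(width(ranges))

# Hypothetical inputs: two ranges of width 4 and 8 with scores 4 and 2
ranges <- GRanges(c("chr2:15-18:+", "chr2:11-18:+"))
scores <- c(4, 2)
query <- GRanges("chr2:11-18:+")

# (4 * 4 + 2 * 8) / (4 + 8) = 32 / 12
result <- weightedmean(scores, ranges, query)
result
```

Note that this version weights each score by the full width of its range and does not use the qranges argument.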

A call to qreduceAssay involves the RaggedExperiment, the GRanges query and the simplify functional argument.

qreduceAssay(ragexp,
             query = GRanges(c("chr1:1-10:-", "chr2:11-18:+")), 
             simplify = weightedmean)

Coercion

The RaggedExperiment package provides a family of parallel functions for coercing to the SummarizedExperiment class. Each function mirrors one of the *Assay functions, and the assay index (i) selects which assay to convert.

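The function names are sparseSummarizedExperiment, compactSummarizedExperiment, disjoinSummarizedExperiment, and qreduceSummarizedExperiment.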

See the documentation for details.
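A minimal sketch of two of these coercions on the example data follows. It assumes that sparseSummarizedExperiment can be called with its defaults and that disjoinSummarizedExperiment accepts a simplify argument in the same way as disjoinAssay. Argument order may differ between package versions, so simplify is passed by name.

```r
library(GenomicRanges)
library(RaggedExperiment)

# Rebuild the example object so that this block runs on its own
sample1 <- GRanges(
    c(GENEA = "chr1:1-10:-", GENEB = "chr2:15-18:+"),
    score = 3:4)
sample2 <- GRanges(
    c(GENEC = "chr1:1-10:-", GENED = "chr2:11-18:+"),
    score = 1:2)
ragexp <- RaggedExperiment(sample1 = sample1, sample2 = sample2)

# Same layout as sparseAssay(): one row per range
sparse_se <- sparseSummarizedExperiment(ragexp)

# Same layout as disjoinAssay(): one row per disjoint range
disjoin_se <- disjoinSummarizedExperiment(ragexp, simplify = mean)

dim(sparse_se)
dim(disjoin_se)
```

The resulting objects can then be passed to code that expects a SummarizedExperiment.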

Session Information

sessionInfo()


Bioconductor-mirror/RaggedExperiment documentation built on Aug. 10, 2017, 10:44 a.m.