RectangularData-class: RectangularData objects

RectangularData-class    R Documentation

RectangularData objects

Description

RectangularData is a virtual class with no slots, meant to be extended by classes that represent objects with a 2D rectangular shape.

Some examples of RectangularData extensions are:

  • The DataFrame class defined in this package (S4Vectors).

  • The DelayedMatrix class defined in the DelayedArray package.

  • The SummarizedExperiment and Assays classes defined in the SummarizedExperiment package.

Details

Any object that belongs to a class that extends RectangularData is called a RectangularData derivative.

Users should be able to access and manipulate RectangularData derivatives via the standard 2D API defined in base R, that is, with dim(), nrow(), ncol(), dimnames(), the 2D form of [ (x[i, j]), rbind(), cbind(), and so on.

Not all RectangularData derivatives necessarily support the full 2D API, but they must support at least dim(x), nrow(x), ncol(x), NROW(x), and NCOL(x). In addition, dim(x) must return an integer vector of length 2 on any of these objects.

Developers who implement RectangularData extensions should also make sure that their classes support the low-level operations bindROWS() and bindCOLS().
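
As a minimal sketch of the minimum contract described above, the following checks it on a DataFrame, chosen only because it is one of the extensions listed in the Description. The object name df and its columns are made up for illustration, and the S4Vectors package is assumed to be installed.

```r
library(S4Vectors)

# Illustrative object; the column names are arbitrary.
df <- DataFrame(id = 1:3, score = c(0.5, 1.2, 3.4))

# A DataFrame is a RectangularData derivative.
is_derivative <- is(df, "RectangularData")

# The minimum 2D API: dim() gives an integer vector of length 2,
# and nrow()/ncol()/NROW()/NCOL() agree with it.
d <- dim(df)
dim_ok <- is.integer(d) && length(d) == 2L
counts_ok <- all(d == c(nrow(df), ncol(df))) &&
             NROW(df) == nrow(df) &&
             NCOL(df) == ncol(df)

print(c(is_derivative = is_derivative, dim_ok = dim_ok, counts_ok = counts_ok))
```

The same checks could be run against any other derivative to see how much of the contract it honors.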

Accessors

In the following code snippets, x is a RectangularData derivative. Not all RectangularData derivatives will support all these accessors.

dim(x):

Length two integer vector defined as c(nrow(x), ncol(x)). Must work on any RectangularData derivative.

nrow(x), ncol(x):

Get the number of rows and columns, respectively. Must work on any RectangularData derivative.

NROW(x), NCOL(x):

Same as nrow(x) and ncol(x), respectively. Must work on any RectangularData derivative.

dimnames(x):

Length two list of character vectors defined as list(rownames(x), colnames(x)).

rownames(x), colnames(x):

Get the names of the rows and columns, respectively.
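
The name accessors can be tried on a small DataFrame. This is only a sketch: df and its row and column names are invented for the example, and S4Vectors is assumed to be installed.

```r
library(S4Vectors)

# Illustrative object with explicit row names.
df <- DataFrame(id = 1:3, score = c(0.5, 1.2, 3.4),
                row.names = c("a", "b", "c"))

rn <- rownames(df)
cn <- colnames(df)

# dimnames() bundles the two into a list of length two.
dn <- dimnames(df)
print(dn)
```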

Subsetting

In the code snippets below, x is a RectangularData derivative.

x[i, j, drop=TRUE]:

Return a new RectangularData derivative of the same class as x made of the selected rows and columns.

For single row and/or column selection, the drop argument specifies whether or not to "drop the dimensions" of the result. More precisely, when drop=TRUE (the default), a single row or column is returned as a vector-like object (of length/NROW equal to ncol(x) if a single row, or equal to nrow(x) if a single column).

Not all RectangularData derivatives support the drop argument. For example, DataFrame and DelayedMatrix objects support it (only for a single column selection in the case of DataFrame objects), but SummarizedExperiment objects do not (drop is ignored for these objects and subsetting always returns a SummarizedExperiment derivative of the same class as x).
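
The sketch below illustrates the drop behavior on a DataFrame only; other derivatives may behave differently, as noted above. The object df is hypothetical and S4Vectors is assumed to be installed.

```r
library(S4Vectors)

df <- DataFrame(id = 1:3, score = c(0.5, 1.2, 3.4),
                row.names = c("a", "b", "c"))

# Selecting several rows returns an object of the same class as df.
two_rows <- df[c("a", "c"), ]

# Single column with the default drop: a vector-like object of length nrow(df).
score_vec <- df[ , "score"]

# Single column with drop=FALSE: the result stays rectangular.
score_df <- df[ , "score", drop = FALSE]

# Single row with the default drop: DataFrame keeps the 2D shape.
first_row <- df[1, ]

print(class(two_rows))
print(score_vec)
print(dim(score_df))
print(dim(first_row))
```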

head(x, n=6L):

If n is non-negative, returns the first n rows of the RectangularData derivative. If n is negative, returns all but the last abs(n) rows of the RectangularData derivative.

tail(x, n=6L):

If n is non-negative, returns the last n rows of the RectangularData derivative. If n is negative, returns all but the first abs(n) rows of the RectangularData derivative.
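
A short sketch of how the sign of n changes the result, using a hypothetical five-row DataFrame (S4Vectors assumed to be installed):

```r
library(S4Vectors)

df <- DataFrame(id = 1:5, label = letters[1:5])

first_two    <- head(df, n = 2L)    # first 2 rows
all_but_last <- head(df, n = -2L)   # everything except the last 2 rows
last_two     <- tail(df, n = 2L)    # last 2 rows
all_but_1st  <- tail(df, n = -2L)   # everything except the first 2 rows

print(first_two$id)
print(all_but_last$id)
print(last_two$id)
print(all_but_1st$id)
```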

subset(x, subset, select, drop=FALSE):

Return a new RectangularData derivative using:

  • subset: logical expression indicating rows to keep, where missing values are taken as FALSE.

  • select: expression indicating columns to keep.

  • drop: passed on to the [ indexing operator.
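
As an illustrative sketch of the three arguments, the example below filters a hypothetical DataFrame whose score column contains a missing value (S4Vectors assumed to be installed). Per the description above, the row with the NA is not kept.

```r
library(S4Vectors)

df <- DataFrame(id = 1:4,
                score = c(0.5, NA, 3.4, 2.0),
                group = c("x", "x", "y", "y"))

# Row filter only: the NA in 'score' is treated as FALSE.
kept <- subset(df, score > 1)

# Row filter plus column selection by (unquoted) column name.
picked <- subset(df, score > 1, select = c(id, group))

print(kept$id)
print(colnames(picked))
```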

Combining

In the code snippets below, all the input objects are expected to be RectangularData derivatives.

rbind(...):

Creates a new RectangularData derivative by aggregating the rows of the input objects.

cbind(...):

Creates a new RectangularData derivative by aggregating the columns of the input objects.
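
A minimal sketch of both operations on DataFrame objects; the objects top, bottom, and extra are hypothetical and S4Vectors is assumed to be installed.

```r
library(S4Vectors)

top    <- DataFrame(id = 1:2, score = c(0.5, 1.2))
bottom <- DataFrame(id = 3:4, score = c(3.4, 2.0))
extra  <- DataFrame(group = c("x", "y", "x", "y"))

# rbind() stacks rows; the inputs here share the same column names.
stacked <- rbind(top, bottom)

# cbind() appends columns; the inputs here have the same number of rows.
wide <- cbind(stacked, extra)

print(dim(stacked))
print(dim(wide))
```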

combineRows(x, ...):

Creates a new RectangularData derivative (of the same class as x) by aggregating the rows of the input objects. Unlike rbind(), combineRows() will handle cases involving differences in the column names of the input objects by adding the missing columns to them, and filling these columns with NAs. The column names of the returned object are the union of the column names of the input objects.

Behaves like an endomorphism with respect to its first argument, i.e., it returns an object of the same class as x.

Finally note that this is a generic function with methods defined for DataFrame objects and other RectangularData derivatives.
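
The following sketch shows the NA filling on two hypothetical DataFrame objects whose column names only partly overlap. It assumes an S4Vectors version recent enough to provide combineRows().

```r
library(S4Vectors)

x <- DataFrame(a = 1:2, b = c("u", "v"))
y <- DataFrame(a = 3L, c = TRUE)

# rbind(x, y) is not appropriate here because the column names differ;
# combineRows() adds the missing columns and fills them with NAs.
z <- combineRows(x, y)

print(dim(z))
print(colnames(z))
print(is.na(z$b))
print(is.na(z$c))
```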

combineCols(x, ..., use.names=TRUE):

Creates a new RectangularData derivative (of the same class as x) by aggregating the columns of the input objects. Unlike cbind(), combineCols() will handle cases involving differences in the number of rows of the input objects.

If use.names=TRUE, all objects are expected to have non-NULL, non-duplicated row names. These row names do not have to be the same, or even shared, across the input objects. Missing rows in any individual input object are filled with NAs, such that the row names of the returned object are the union of the row names of the input objects.

If use.names=FALSE, all objects are expected to have the same number of rows, and this function behaves the same as cbind(). The row names of the returned object are set to rownames(x). Differences in the row names between input objects are ignored.

Behaves like an endomorphism with respect to its first argument, i.e., it returns an object of the same class as x.

Finally note that this is a generic function with methods defined for DataFrame objects and other RectangularData derivatives.
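
A sketch of both use.names modes on small hypothetical DataFrame objects (X, Y, Z, P, and Q are invented names; a recent S4Vectors is assumed):

```r
library(S4Vectors)

X <- DataFrame(x = 1, row.names = "foo")
Y <- DataFrame(y = c("A", "B"), row.names = c("foo", "bar"))
Z <- DataFrame(z = TRUE, row.names = "bar")

# use.names=TRUE (the default): rows are matched by name and
# rows missing from an input are filled with NAs.
by_name <- combineCols(X, Y, Z)

# use.names=FALSE: inputs must have the same number of rows;
# the result is put together positionally, like cbind().
P <- DataFrame(x = 1:2)
Q <- DataFrame(y = c("A", "B"))
by_position <- combineCols(P, Q, use.names = FALSE)

print(rownames(by_name))
print(dim(by_name))
print(dim(by_position))
```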

combineUniqueCols(x, ..., use.names=TRUE):

Same as combineCols(), but this function will attempt to collapse multiple columns with the same name across the input objects into a single column in the output. This guarantees that the column names in the output object are always unique. The only exception is for unnamed columns, which are not collapsed.

When use.names=TRUE, collapsing is only performed if the duplicated column has identical values for the shared rows in the input objects involved. Otherwise, the contents of the later input object are simply ignored with a warning. Similarly, if use.names=FALSE, the duplicated columns must be identical for all rows in the affected input objects.

Behaves like an endomorphism with respect to its first argument, i.e., it returns an object of the same class as x.

Finally note that this function is implemented on top of combineCols() and is expected to work on any RectangularData derivatives for which combineCols() works.
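
The sketch below shows the collapsing on hypothetical inputs where the column named dup appears in two objects and agrees on the one row they share ("bar"). A recent S4Vectors is assumed.

```r
library(S4Vectors)

X <- DataFrame(x = 1, row.names = "foo")
Y <- DataFrame(y = c("A", "B"), dup = 1:2, row.names = c("foo", "bar"))
Z <- DataFrame(z = TRUE, dup = 2L, row.names = "bar")

# 'dup' has the same value (2) for the shared row "bar" in Y and Z,
# so the two columns are expected to collapse into one.
merged <- combineUniqueCols(X, Y, Z)

print(colnames(merged))
print(dim(merged))
```

With combineCols() instead, the result would carry two columns named dup.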

Author(s)

Hervé Pagès and Aaron Lun

See Also

  • DataFrame for a RectangularData extension that mimics data.frame objects from base R.

  • DataFrame-combine for combineRows(), combineCols(), and combineUniqueCols() examples involving DataFrame objects.

  • data.frame objects in base R.

Examples

showClass("RectangularData")  # shows (some of) the known subclasses

Bioconductor/S4Vectors documentation built on Jan. 9, 2025, 7:24 a.m.