DataTable objects

Description

DataTable is an API-only class (i.e., a virtual class with no slots) for accessing objects with a rectangular shape, such as DataFrame or RangedData objects. It mimics the API of standard data.frame objects.

Accessors

In the following code snippets, x is a DataTable.

nrow(x), ncol(x): Get the number of rows and columns, respectively.

NROW(x), NCOL(x): Same as nrow(x) and ncol(x), respectively.

dim(x): Length two integer vector defined as c(nrow(x), ncol(x)).

rownames(x), colnames(x): Get the names of the rows and columns, respectively.

dimnames(x): Length two list of character vectors defined as list(rownames(x), colnames(x)).
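The accessors above can be tried on any concrete subclass. The sketch below assumes a DataFrame (the implementation named under See Also) as the DataTable; the column names and values are made up for illustration.

```r
library(IRanges)  # as in the Examples section; makes DataFrame available

# Hypothetical example data: 3 rows, 2 columns, explicit row names
x <- DataFrame(score = c(1.5, 2.5, 3.5),
               label = c("a", "b", "c"),
               row.names = c("r1", "r2", "r3"))

nrow(x)        # number of rows
ncol(x)        # number of columns
dim(x)         # c(nrow(x), ncol(x))
rownames(x)    # row names
colnames(x)    # column names
dimnames(x)    # list(rownames(x), colnames(x))
```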

Subsetting

In the code snippets below, x is a DataTable object.

x[i, j, drop=TRUE]: Return a new DataTable object made of the selected rows and columns. When a single column is selected, the drop argument specifies whether to return the column itself (e.g., a standard vector) instead of a one-column DataTable.

head(x, n=6L): If n is non-negative, returns the first n rows of the DataTable object. If n is negative, returns all but the last abs(n) rows of the DataTable object.

tail(x, n=6L): If n is non-negative, returns the last n rows of the DataTable object. If n is negative, returns all but the first abs(n) rows of the DataTable object.

subset(x, subset, select, drop=FALSE): Return a new DataTable object using:

  • subset: logical expression indicating rows to keep, where missing values are taken as FALSE.

  • select: expression indicating columns to keep.

  • drop: passed on to the [ indexing operator.

na.omit(object): Returns a subset with incomplete cases removed.

na.exclude(object): Returns a subset with incomplete cases removed, while recording them so that statistical results (e.g., residuals or predictions) are padded with NAs for the removed cases.

is.na(x): Returns a logical matrix indicating which cells are missing.

complete.cases(x): Returns a logical vector identifying which cases have no missing values.
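As a minimal sketch of the subsetting operations above, again assuming a DataFrame as the concrete DataTable. The data and the variable names (first3, vec, one_col, kept, ok, clean) are hypothetical.

```r
library(IRanges)

# Hypothetical data with two missing values in the 'value' column
x <- DataFrame(id = 1:8,
               value = c(2.1, NA, 3.3, 4.8, NA, 6.0, 7.2, 8.9))

first3  <- x[1:3, ]                     # rows 1 to 3, all columns
vec     <- x[ , "value"]                # drop=TRUE (default): the column itself
one_col <- x[ , "value", drop = FALSE]  # stays a one-column DataFrame

head(x, 3)     # first 3 rows
head(x, -2)    # all but the last 2 rows
tail(x, 2)     # last 2 rows

# Rows where the condition is NA are treated as FALSE and dropped
kept <- subset(x, value > 4, select = id)

ok    <- complete.cases(x)   # TRUE for rows without missing values
clean <- na.omit(x)          # only the complete rows
```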

Combining

In the code snippets below, x is a DataTable object.

cbind(...): Creates a new DataTable by combining the columns of the DataTable objects in ....

rbind(...): Creates a new DataTable by combining the rows of the DataTable objects in ....

merge(x, y, ...): Merges two DataTable objects x and y, with arguments in ... being the same as those allowed by the base merge. It is allowed for either x or y to be a data.frame.
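The combining functions can be sketched as follows, assuming DataFrame objects as the concrete DataTables. All object names and values here are illustrative only.

```r
library(IRanges)

a <- DataFrame(id = 1:3, x = c(10, 20, 30))
b <- DataFrame(y = c("u", "v", "w"))

# Column-wise combination: both inputs have 3 rows
ab <- cbind(a, b)

# Row-wise combination: both inputs have the same columns
more    <- DataFrame(id = 4:5, x = c(40, 50))
stacked <- rbind(a, more)

# merge() accepts a plain data.frame for either x or y
lookup <- data.frame(id = c(1L, 3L), grp = c("g1", "g2"))
m <- merge(a, lookup, by = "id")
```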

Looping

In the code snippets below, data is a DataTable object.

by(data, INDICES, FUN, ..., simplify = TRUE): Apply FUN to each group of data, a DataTable, formed by the factor (or list of factors) INDICES. The contract is the same as that of by for a data.frame.
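A small sketch of group-wise application, assuming a DataFrame as the concrete DataTable. The grouping column and the summing function are hypothetical choices for illustration.

```r
library(IRanges)

df <- DataFrame(group = factor(c("a", "a", "b", "b")),
                value = c(1, 2, 3, 5))

# FUN receives the rows of df belonging to each level of df$group
res <- by(df, df$group, function(d) sum(d$value))
res
```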

Utilities

duplicated(x): Returns a logical vector indicating the rows that are identical to a previous row.

unique(x): Returns a new DataTable after removing the duplicated rows from x.

show(x): By default the show method displays 5 head and 5 tail lines. The number of lines can be altered by setting the global options showHeadLines and showTailLines. If the object length is less than the sum of the options, the full object is displayed. These options affect GRanges, GAlignments, Ranges, DataTable and XString objects.
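The utilities above can be illustrated with a short sketch, assuming a DataFrame as the concrete DataTable. The data are made up, and the option values 2 and 1 are arbitrary.

```r
library(IRanges)

x <- DataFrame(a = c(1, 1, 2, 2), b = c("p", "p", "q", "r"))

dup  <- duplicated(x)   # TRUE for rows identical to an earlier row
uniq <- unique(x)       # x without the duplicated rows

# Temporarily shorten the display, then restore the previous settings
old <- options(showHeadLines = 2, showTailLines = 1)
big <- DataFrame(i = 1:20)
big
options(old)
```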

Coercion

as.env(x, enclos = parent.frame()): Creates an environment from x with a symbol for each colnames(x). The values are not actually copied into the environment. Rather, they are dynamically bound using makeActiveBinding. This prevents unnecessary copying of the data from the external vectors into R vectors. The values are cached, so that the data is not copied every time the symbol is accessed.
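A minimal sketch of the environment coercion, assuming a DataFrame as the concrete DataTable and that as.env is available once the package is attached. The column names a and b are hypothetical.

```r
library(IRanges)

x <- DataFrame(a = 1:3, b = c(2.5, 3.5, 4.5))

env <- as.env(x)   # one symbol per column of x

ls(env)                          # includes the column names
total <- eval(quote(a + b), env) # columns are looked up by name
total
```

Evaluating an expression in env is what allows column names to be used as plain symbols, for example in formula-based or subset-style interfaces.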

Statistical modeling with DataTable

A number of wrappers are implemented for performing statistical procedures, such as model fitting, with DataTable objects.

Tabulation

xtabs(formula = ~., data, subset, na.action, exclude = c(NA, NaN), drop.unused.levels = FALSE): Like the original xtabs, except data is a DataTable.

See Also

  • DataFrame for an implementation that mimics data.frame.

  • data.frame

Examples

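library(IRanges)  # load first so that the DataTable class is defined
showClass("DataTable")  # shows (some of) the known subclasses

# Cross-tabulate admission counts stored in a DataFrame
df <- DataFrame(as.data.frame(UCBAdmissions))
xtabs(Freq ~ Gender + Admit, df)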