matter_df-class: Out-of-Memory Data Frames


Description

The virtual_df class implements lightweight data frames that may be a mixture of atomic vectors and matter vectors, simulating the behavior of data.frame.

The matter_df class extends virtual_df to implement fully out-of-memory data frames where all columns are matter objects.

Calling as.matter() on an ordinary R data.frame will coerce all columns to matter objects to create a matter_df data frame.

Usage

## Instance creation
virtual_df(..., row.names = NULL, stringsAsFactors = default.stringsAsFactors())

matter_df(..., row.names = NULL, stringsAsFactors = default.stringsAsFactors())

## Additional methods documented below

Arguments

...

These arguments become the data columns or data frame variables. They should be named.

row.names

A character vector giving the row names.

stringsAsFactors

Should character vectors be converted to factors? This is recommended for matter_df, as accessing the underlying out-of-memory integer vectors (for a factor) is typically much faster than accessing a vector of out-of-memory strings.
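
The reason factors are cheaper to access can be seen with base R alone: a factor is an integer vector of codes plus a short character vector of levels, so only fixed-width integers need to be read from storage. A minimal sketch using an ordinary in-memory factor (the variable names are illustrative, and nothing here touches matter itself):

```r
# Illustrative only: base R, no matter objects involved.
s <- c("ctrl", "treated", "ctrl", "treated", "ctrl")

f <- factor(s)

# A factor is stored as integer codes...
codes <- as.integer(f)
# ...plus a short character vector of levels.
lv <- levels(f)

typeof(f)   # "integer"
codes       # 1 2 1 2 1
lv          # "ctrl" "treated"

# The original strings are recoverable from the codes and levels.
identical(lv[codes], s)
```

With stringsAsFactors = TRUE, a character column is represented this way, which is why the integer codes rather than the strings are what must be fetched on access.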

Value

An object of class virtual_df or matter_df.

Slots

data:

This slot stores the information about locations of the data in virtual memory and within the files.

datamode:

The storage mode of the accessed data when read into R. This is a 'character' vector with possible values 'raw', 'logical', 'integer', 'numeric', or 'virtual'.

paths:

A 'character' vector of the paths to the files where the data are stored.

filemode:

The read/write mode of the files where the data are stored. This should be 'r' for read-only access, or 'rw' for read/write access.

chunksize:

The maximum number of elements which should be loaded into memory at once. Used by methods implementing summary statistics and linear algebra. Ignored when explicitly subsetting the dataset.

length:

The length of the data.

dim:

Either 'NULL' for vectors, or an integer vector of length one or more giving the maximal indices in each dimension for matrices and arrays.

names:

The names of the data elements for vectors.

dimnames:

Either 'NULL' or the names for the dimensions. If not 'NULL', then this should be a list of character vectors of the length given by 'dim' for each dimension. This is always 'NULL' for vectors.

ops:

Delayed operations to be applied on atoms.
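
The role of the chunksize slot described above can be pictured with a small base R sketch. It is only an analogy: chunked_sum and its chunksize argument are hypothetical names, not part of the matter API, and the package's own summary methods may be implemented differently. The point is that a summary statistic can be accumulated while holding at most chunksize elements at a time.

```r
# Illustrative only: 'chunked_sum' is a hypothetical helper, not a matter function.
chunked_sum <- function(x, chunksize) {
  if (length(x) == 0) return(0)
  total <- 0
  # Start index of each chunk
  starts <- seq(1, length(x), by = chunksize)
  for (s in starts) {
    e <- min(s + chunksize - 1, length(x))
    # Only x[s:e] needs to be held in memory at this step.
    total <- total + sum(x[s:e])
  }
  total
}

x <- as.numeric(1:1000)
chunked_sum(x, chunksize = 64) == sum(x)
```

A smaller chunk size lowers peak memory use at the cost of more read operations; a larger one does the reverse.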

Extends

matter

Creating Objects

virtual_df instances can be created through virtual_df().

matter_df instances can be created through matter_df().

Methods

Standard generic methods:

x$name, x$name <- value:

Get or set a single column.

x[[i]], x[[i]] <- value:

Get or set a single column.

x[i], x[i] <- value:

Get or set multiple columns.

x[i, j, ..., drop], x[i, j] <- value:

Get or set the elements of the data frame.
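
Because these classes simulate the behavior of data.frame, the indexing forms listed above can be tried out on an ordinary in-memory data.frame first. The sketch below uses base R only; it assumes, without checking against the package, that the assignment forms behave the same way on a virtual_df or matter_df.

```r
# Illustrative only: an ordinary in-memory data.frame, not a matter_df.
df <- data.frame(a = 1:10, b = 11:20, c = letters[1:10],
                 stringsAsFactors = FALSE)

df$b <- df$b * 2L          # x$name <- value: set a single column by name
df[["a"]] <- rev(df$a)     # x[[i]] <- value: set a single column
sub <- df[c("a", "c")]     # x[i]: get multiple columns
df[1:2, "b"] <- c(0L, 0L)  # x[i, j] <- value: set individual elements

df[1:3, ]                  # x[i, j]: get the first three rows
```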

Author(s)

Kylie A. Bemis

See Also

matter

Examples

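library(matter)

# Build a data frame from two matter vectors and one ordinary integer vector
x <- matter_df(a=as.matter(1:10), b=11:20, c=as.matter(letters[1:10]))
x
x[1:2]              # first two columns
x[[2]]              # second column ("b")
x[["c"]]            # column "c" by name
x[,"c"]             # column "c" using matrix-style indexing
x[1:5,c("a","c")]   # rows 1 to 5 of columns "a" and "c"
x$c                 # column "c" by name
x$c[1:5]            # first five elements of column "c"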

matter documentation built on Nov. 8, 2020, 6:15 p.m.