sparse_mat-class: Sparse Matrices


Description

The sparse_mat class implements sparse matrices, potentially stored out-of-memory. Both compressed-sparse-column (CSC) and compressed-sparse-row (CSR) formats are supported. Non-zero elements are internally represented as key-value pairs.
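To make the key-value representation concrete, here is a minimal base-R sketch of how CSC-style pairs map onto an ordinary dense matrix. The helper kv_to_dense() is hypothetical and not part of matter; it only illustrates the layout described above, with one element of 'keys' and 'values' per column.

```r
# Hypothetical helper (not part of matter): expand CSC-style key-value
# pairs into an ordinary dense matrix.
kv_to_dense <- function(keys, values, nrow) {
    stopifnot(length(keys) == length(values))
    out <- matrix(0, nrow = nrow, ncol = length(keys))
    for (j in seq_along(keys)) {
        # keys[[j]] holds the row indices of the non-zero entries in column j
        out[keys[[j]], j] <- values[[j]]
    }
    out
}

keys <- list(c(1, 4), c(2, 3), c(1, 3, 4))
values <- list(c(1.5, 2.5), c(3.5, 4.5), c(5.5, 6.5, 7.5))
dense <- kv_to_dense(keys, values, nrow = 4)
```

For a CSR layout the roles swap: each list element would describe one row, and the keys would be column indices.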

Usage

## Instance creation
sparse_mat(data, datamode = "double", nrow = 0, ncol = 0,
            rowMaj = FALSE, dimnames = NULL, keys = NULL,
            tolerance = c(abs=0), combiner = "identity",
            chunksize = getOption("matter.default.chunksize"), ...)

# Check if an object is a sparse matrix
is.sparse(x)

# Coerce an object to a sparse matrix
as.sparse(x, ...)

## Additional methods documented below

Arguments

data

Either a length-2 'list' with elements 'keys' and 'values', which provide the two halves of the key-value pairs of the non-zero elements, or a data matrix that will be used to initialize the sparse matrix. If a list is given, every element of 'keys' must be sorted in increasing order.

datamode

A 'character' vector giving the storage mode of the data in virtual memory. Allowable values are R numeric and logical types ('logical', 'integer', 'numeric') and their C equivalents.

nrow

An optional number giving the total number of rows.

ncol

An optional number giving the total number of columns.

keys

Either NULL or a vector with length equal to the number of rows (for CSC matrices) or the number of columns (for CSR matrices). If NULL, then the 'key' portion of each key-value pair that makes up a non-zero element is assumed to be a row or column index. If a vector, then it defines how the non-zero elements are matched to rows or columns. The 'key' portion of each non-zero element is matched against this canonical set of keys using binary search. Allowed types for keys are 'integer', 'numeric', and 'character'.

rowMaj

Whether the data should be stored using compressed-sparse-row (CSR) representation (as opposed to compressed-sparse-column (CSC) representation). Defaults to 'FALSE', for efficient access to columns. Set to 'TRUE' for more efficient access to rows instead.

dimnames

The names of the sparse matrix dimensions.

tolerance

For 'numeric' keys, the tolerance used for floating-point equality when determining key matches. The vector should be named. Use 'absolute' to use absolute differences, and 'relative' to use relative differences.

combiner

In the case of collisions when matching keys, how the row- or column-vectors should be combined. Acceptable values are "identity", "min", "max", "sum", and "mean". A user-specified function may also be provided. Using "identity" means collisions result in an error. Using "sum" or "mean" results in binning all matches.

chunksize

The (suggested) maximum number of elements which should be accessed at once by summary functions and linear algebra. Ignored when explicitly subsetting the dataset.

x

An object to test for being a sparse matrix, or to coerce to a sparse matrix.

...

Additional arguments to be passed to constructor.
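The interaction of the keys, tolerance, and combiner arguments above can be sketched in base R. The helper match_keys() is hypothetical and not part of matter. It uses a plain linear scan rather than the binary search mentioned above, and its definition of a relative difference (difference divided by the canonical key) is an assumption made for illustration only.

```r
# Hypothetical helper (not part of matter): pull the values of one column
# onto a canonical set of numeric keys.
match_keys <- function(canon, k, v, tol = 0, relative = FALSE, combine = sum) {
    out <- numeric(length(canon))
    for (i in seq_along(canon)) {
        d <- abs(k - canon[i])
        # Assumed definition of a relative difference (illustration only)
        if (relative) d <- d / abs(canon[i])
        hits <- which(d <= tol)
        # No hit leaves a zero; several hits are a collision and get combined
        if (length(hits) > 0) out[i] <- combine(v[hits])
    }
    out
}

canon <- c(100, 200, 300)          # canonical keys
k <- c(99.9, 100.1, 200.4, 300)    # keys of the stored non-zero elements
v <- c(1, 2, 3, 4)                 # their values

binned   <- match_keys(canon, k, v, tol = 0.2, combine = sum)
averaged <- match_keys(canon, k, v, tol = 0.2, combine = mean)
exact    <- match_keys(canon, k, v, tol = 0)
```

With an absolute tolerance of 0.2, the first two stored keys both fall on the canonical key 100, so the combining function decides the result. With zero tolerance only the exact key 300 matches.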

Value

An object of class sparse_mat.

Slots

data:

A length-2 'list' with elements 'keys' and 'values' which provide the halves of the key-value pairs of the non-zero elements.

datamode:

The storage mode of the accessed data when read into R. This should be a 'character' vector of length one with value 'integer' or 'numeric'.

paths:

A 'character' vector of the paths to the files where the data are stored.

filemode:

The read/write mode of the files where the data are stored. This should be 'r' for read-only access, or 'rw' for read/write access.

chunksize:

The maximum number of elements which should be loaded into memory at once. Used by methods implementing summary statistics and linear algebra. Ignored when explicitly subsetting the dataset.

length:

The length of the data.

dim:

Either 'NULL' for vectors, or an integer vector of length one or more giving the maximal indices in each dimension for matrices and arrays.

names:

The names of the data elements for vectors.

dimnames:

Either 'NULL' or the names for the dimensions. If not 'NULL', then this should be a list of character vectors of the length given by 'dim' for each dimension. This is always 'NULL' for vectors.

ops:

Delayed operations to be applied on atoms.

keys:

Either NULL or a vector with length equal to the number of rows (for CSC matrices) or the number of columns (for CSR matrices). If NULL, then the 'key' portion of each key-value pair that makes up a non-zero element is assumed to be a row or column index. If a vector, then it defines how the non-zero elements are matched to rows or columns. The 'key' portion of each non-zero element is matched against this canonical set of keys using binary search. Allowed types for keys are 'integer', 'numeric', and 'character'.

tolerance:

For 'numeric' keys, the tolerance used for floating-point equality when determining key matches. An attribute 'type' gives whether 'absolute' or 'relative' differences should be used for the comparison.

combiner:

This is a function determining how the row- or column-vectors should be combined (or not) when key matching collisions occur.

Warning

If 'data' is given as a length-2 list of key-value pairs, the validity of the pairs is not checked, as this may be a costly operation if the list is stored in virtual memory. Every element of 'keys' must already be sorted in increasing order, or behavior may be unexpected.

Assigning a new data element to the sparse matrix will always sort the key-value pairs of the row or column into which it was assigned.
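As a rough illustration of the sorting behavior described above, the following base-R sketch sets one element in a single column's key-value pairs and then restores increasing key order. The helper assign_kv() is hypothetical; it is not matter's implementation, only a model of the invariant that keys stay sorted.

```r
# Hypothetical helper (not part of matter): set one element of a column's
# key-value pairs and restore increasing key order.
assign_kv <- function(k, v, key, value) {
    pos <- match(key, k)
    if (is.na(pos)) {
        # New non-zero element: append it
        k <- c(k, key)
        v <- c(v, value)
    } else {
        # Existing element: overwrite its value
        v[pos] <- value
    }
    ord <- order(k)
    list(keys = k[ord], values = v[ord])
}

col <- assign_kv(c(2, 5, 9), c(0.1, 0.2, 0.3), key = 4, value = 9.9)
```

Since no validity checks are run on user-supplied lists, calling is.unsorted() on each element of 'keys' before constructing the matrix is a cheap way to catch ordering problems for in-memory data.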

Extends

matter

Creating Objects

sparse_mat instances can be created through sparse_mat().

Methods

Standard generic methods:

x[i, j, ..., drop], x[i, j] <- value:

Get or set the elements of the sparse matrix. Use drop = NULL to return a subset of the same class as the object.

cbind(x, ...), rbind(x, ...):

Combine sparse matrices by row or column.

t(x):

Transpose a matrix. This is a quick operation which only changes metadata and does not touch the data representation.
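One way to see why transposition can be metadata-only: the CSC representation of a matrix is also a valid CSR representation of its transpose. The toy structure below is hypothetical and does not reflect matter's actual class layout; it only shows that swapping the dimensions and flipping a row-major flag transposes the matrix without touching the stored pairs.

```r
# Hypothetical toy structure (not matter's actual class layout)
toy <- list(
    keys = list(c(1, 3), c(2)),
    values = list(c(10, 30), c(20)),
    dim = c(3L, 2L),
    rowMaj = FALSE
)

# Transpose by editing metadata only
toy_t <- function(x) {
    x$dim <- rev(x$dim)
    x$rowMaj <- !x$rowMaj
    x
}

# Expand to a dense matrix, honoring the row-major flag
toy_dense <- function(x) {
    n_major <- length(x$keys)
    n_minor <- if (x$rowMaj) x$dim[2] else x$dim[1]
    out <- matrix(0, nrow = n_minor, ncol = n_major)
    for (j in seq_len(n_major)) out[x$keys[[j]], j] <- x$values[[j]]
    if (x$rowMaj) t(out) else out
}

flipped <- toy_t(toy)
```

Here flipped shares its 'keys' and 'values' with toy unchanged, yet expands to the transposed dense matrix.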

Author(s)

Kylie A. Bemis

See Also

matter

Examples

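# Key-value pairs for three columns: 'keys' holds the (sorted) row
# indices of the non-zero elements, 'values' the elements themselves
keys <- list(
    c(1,4,8,10),
    c(2,3,5),
    c(1,2,7,9))

values <- list(
    rnorm(4),
    rnorm(3),
    rnorm(4))

init1 <- list(keys=keys, values=values)

# Create a sparse matrix with 10 rows from the key-value pairs
x <- sparse_mat(init1, nrow=10)
x[]

# Create a sparse matrix from a dense matrix, using character keys
init2 <- matrix(rbinom(100, 1, 0.2), nrow=10, ncol=10)

y <- sparse_mat(init2, keys=letters[1:10])
y[]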

matter documentation built on Nov. 8, 2020, 6:15 p.m.