SparseArray-class: SparseArray objects

SparseArrayR Documentation

SparseArray objects

Description

The SparseArray package defines the SparseArray virtual class whose purpose is to be extended by other S4 classes that aim at representing in-memory multidimensional sparse arrays.

It has currently two concrete subclasses, COO_SparseArray and SVT_SparseArray, both also defined in this package. Each subclass uses its own internal representation for the nonzero multidimensional data, the COO layout for COO_SparseArray, and the SVT layout for SVT_SparseArray. The two layouts are described in the COO_SparseArray and SVT_SparseArray man pages, respectively.

Finally, the package also defines the SparseMatrix virtual class, as a subclass of the SparseArray class, for the specific 2D case.

Usage

## Constructor function:
SparseArray(x, type=NA)

Arguments

x

An ordinary matrix or array, or a dg[C|R]Matrix object, or an lg[C|R]Matrix object, or any matrix-like or array-like object that supports coercion to SVT_SparseArray.

type

A single string specifying the requested type of the object.

By default, the SparseArray object returned by the constructor function has the same type() as x. However the user can use the type argument to request a different type. Note that doing:

    sa <- SparseArray(x, type=type)

is equivalent to doing:

    sa <- SparseArray(x)
    type(sa) <- type

but the former is more convenient and will generally be more efficient.

Supported types are all R atomic types plus "list".

Details

The SparseArray class extends the Array virtual class defined in the S4Arrays package. Here is the full SparseArray sub-hierarchy as defined in the SparseArray package (virtual classes are marked with an asterisk):

: Array class :                 Array*
: hierarchy   :                   ^
                                  |
- - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - -
: SparseArray   :            SparseArray*
: sub-hierarchy :            ^    ^    ^
                             |    |    |
                 COO_SparseArray  |  SVT_SparseArray
                        ^         |         ^
- - - - - - - - - - - - | - - - - | - - - - | - - - - - - - - - -
: SparseMatrix      :   |    SparseMatrix*  |
: sub-sub-hierarchy :   |    ^         ^    |
                        |    |         |    |
                 COO_SparseMatrix    SVT_SparseMatrix

Any object that belongs to a class that extends SparseArray e.g. (a SVT_SparseArray or SVT_SparseMatrix object) is called a SparseArray derivative.

Most of the standard matrix and array API defined in base R should work on SparseArray derivatives, including dim(), length(), dimnames(), `dimnames<-`(), [, drop(), `[<-` (subassignment), t(), rbind(), cbind(), etc...

SparseArray derivatives also support type(), `type<-`(), is_sparse(), nzcount(), nzwhich(), nzvals(), `nzvals<-`(), sparsity(), arbind(), and acbind().

Value

A SparseArray derivative, that is a SVT_SparseArray, COO_SparseArray, SVT_SparseMatrix, or COO_SparseMatrix object.

The type() of the input object is preserved, except if a different one was requested via the type argument.

What is considered a zero depends on the type():

  • "logical" zero is FALSE;

  • "integer" zero is 0L;

  • "double" zero is 0;

  • "complex" zero is 0+0i;

  • "raw" zero is raw(1);

  • "character" zero is "" (empty string);

  • "list" zero is NULL.

See Also

  • The COO_SparseArray and SVT_SparseArray classes.

  • is_nonzero for is_nonzero() and nz*() functions nzcount(), nzwhich(), etc...

  • SparseArray_aperm for permuting the dimensions of a SparseArray object (e.g. transposition).

  • SparseArray_subsetting for subsetting a SparseArray object.

  • SparseArray_subassignment for SparseArray subassignment.

  • SparseArray_abind for combining 2D or multidimensional SparseArray objects.

  • SparseArray_summarization for SparseArray summarization methods.

  • SparseArray_Arith, SparseArray_Compare, and SparseArray_Logic, for operations from the Arith, Compare, and Logic groups on SparseArray objects.

  • SparseArray_Math for operations from the Math and Math2 groups on SparseArray objects.

  • SparseArray_Complex for operations from the Complex group on SparseArray objects.

  • SparseArray_misc for miscellaneous operations on a SparseArray object.

  • SparseArray_matrixStats for col/row summarization methods for SparseArray objects.

  • rowsum_methods for rowsum() methods for sparse matrices.

  • SparseMatrix_mult for SparseMatrix multiplication and cross-product.

  • randomSparseArray to generate a random SparseArray object.

  • readSparseCSV to read/write a sparse matrix from/to a CSV (comma-separated values) file.

  • S4 classes dgCMatrix, dgRMatrix, and lgCMatrix defined in the Matrix package, for the de facto standard for sparse matrix representations in the R ecosystem.

  • is_sparse in the S4Arrays package.

  • The Array class defined in the S4Arrays package.

  • Ordinary array objects in base R.

  • base::which in base R.

Examples

## ---------------------------------------------------------------------
## Display details of class definition & known subclasses
## ---------------------------------------------------------------------

showClass("SparseArray")

## ---------------------------------------------------------------------
## The SparseArray() constructor
## ---------------------------------------------------------------------

a <- array(rpois(9e6, lambda=0.3), dim=c(500, 3000, 6))
SparseArray(a)    # an SVT_SparseArray object

m <- matrix(rpois(9e6, lambda=0.3), ncol=500)
SparseArray(m)    # an SVT_SparseMatrix object

dgc <- sparseMatrix(i=c(4:1, 2:4, 9:12, 11:9), j=c(1:7, 1:7),
                    x=runif(14), dims=c(12, 7))
class(dgc)
SparseArray(dgc)  # an SVT_SparseMatrix object

dgr <- as(dgc, "RsparseMatrix")
class(dgr)
SparseArray(dgr)  # a COO_SparseMatrix object

## ---------------------------------------------------------------------
## nzcount(), nzwhich(), nzvals(), `nzvals<-`()
## ---------------------------------------------------------------------
x <- SparseArray(a)

## Get the number of nonzero array elements in 'x':
nzcount(x)

## nzwhich() returns the indices of the nonzero array elements in 'x'.
## Either as an integer (or numeric) vector of length 'nzcount(x)'
## containing "linear indices":
nzidx <- nzwhich(x)
length(nzidx)
head(nzidx)

## Or as an integer matrix with 'nzcount(x)' rows and one column per
## dimension where the rows represent "array indices" (a.k.a. "array
## coordinates"):
Mnzidx <- nzwhich(x, arr.ind=TRUE)
dim(Mnzidx)

## Each row in the matrix is an n-tuple representing the "array
## coordinates" of a nonzero element in 'x':
head(Mnzidx)
tail(Mnzidx)

## Extract the values of the nonzero array elements in 'x' and return
## them in a vector "parallel" to 'nzwhich(x)':
x_nzvals <- nzvals(x)  # equivalent to 'x[nzwhich(x)]'
length(x_nzvals)
head(x_nzvals)

nzvals(x) <- log1p(nzvals(x))
x

## Sanity checks:
stopifnot(identical(nzidx, which(a != 0)))
stopifnot(identical(Mnzidx, which(a != 0, arr.ind=TRUE, useNames=FALSE)))
stopifnot(identical(x_nzvals, a[nzidx]))
stopifnot(identical(x_nzvals, a[Mnzidx]))
stopifnot(identical(`nzvals<-`(x, nzvals(x)), x))

Bioconductor/SparseArray documentation built on Oct. 2, 2024, 6:52 p.m.