SparseArray-class: SparseArray objects

SparseArray    R Documentation

SparseArray objects

Description

The SparseArray package defines the SparseArray virtual class whose purpose is to be extended by other S4 classes that aim at representing in-memory multidimensional sparse arrays.

It currently has two concrete subclasses, COO_SparseArray and SVT_SparseArray, both also defined in this package. Each subclass uses its own internal representation for the nonzero multidimensional data: the COO layout for COO_SparseArray and the SVT layout for SVT_SparseArray. The two layouts are described in the COO_SparseArray and SVT_SparseArray man pages, respectively.

Finally, the package also defines the SparseMatrix virtual class, as a subclass of the SparseArray class, for the specific 2D case.

Usage

## Constructor function:
SparseArray(x, type=NA)

Arguments

x

An ordinary matrix or array, or a dg[C|R]Matrix object, or an lg[C|R]Matrix object, or any matrix-like or array-like object that supports coercion to SVT_SparseArray.

type

A single string specifying the requested type of the object.

Normally, the SparseArray object returned by the constructor function has the same type() as x, but the user can use the type argument to request a different type. Note that doing:

    sa <- SparseArray(x, type=type)

is equivalent to doing:

    sa <- SparseArray(x)
    type(sa) <- type

but the former is more convenient and will generally be more efficient.

Supported types are all R atomic types plus "list".
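
As a small illustration of the equivalence described above, the following sketch builds a toy integer matrix and requests type "double" through both routes. The object names (m0, sa1, sa2) and the values are made up for this example only.

```r
library(SparseArray)

## Toy integer matrix; the name 'm0' and its values are illustrative only.
m0 <- matrix(c(0L, 3L, 0L, 0L, 5L, 0L), nrow=2)

## Route 1: request the type at construction time.
sa1 <- SparseArray(m0, type="double")

## Route 2: construct first, then change the type.
sa2 <- SparseArray(m0)
type(sa2) <- "double"

type(sa1)
type(sa2)
```

Both objects should report the same type(); the first route avoids building an intermediate integer object.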

Details

The SparseArray class extends the Array virtual class defined in the S4Arrays package. Here is the full SparseArray sub-hierarchy as defined in the SparseArray package (virtual classes are marked with an asterisk):

: Array class :                 Array*
: hierarchy   :                   ^
                                  |
- - - - - - - - - - - - - - - - - | - - - - - - - - - - - - - - -
: SparseArray   :            SparseArray*
: sub-hierarchy :            ^    ^    ^
                             |    |    |
                 COO_SparseArray  |  SVT_SparseArray
                        ^         |         ^
- - - - - - - - - - - - | - - - - | - - - - | - - - - - - - - - -
: SparseMatrix      :   |    SparseMatrix*  |
: sub-sub-hierarchy :   |    ^         ^    |
                        |    |         |    |
                 COO_SparseMatrix    SVT_SparseMatrix

Any object that belongs to a class that extends SparseArray (e.g. an SVT_SparseArray or SVT_SparseMatrix object) is called a SparseArray derivative.
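
The class relationships in the diagram can be checked with is(). The sketch below uses two toy inputs (the names a3, m2, svt3, svt2 and their values are illustrative only) and assumes the constructor returns the SVT representation for ordinary arrays, as in the Examples section.

```r
library(SparseArray)

## Toy inputs; names and values are illustrative only.
a3 <- array(c(0L, 1L, 0L, 0L, 0L, 2L, 0L, 0L), dim=c(2, 2, 2))
m2 <- matrix(c(0, 1.5, 0, 0), nrow=2)

svt3 <- SparseArray(a3)   # built from a 3D array
svt2 <- SparseArray(m2)   # built from a 2D matrix

## Every derivative is a SparseArray; only the 2D one is a SparseMatrix.
is(svt3, "SparseArray")
is(svt3, "SparseMatrix")
is(svt2, "SparseArray")
is(svt2, "SparseMatrix")

## An SVT_SparseMatrix is also an SVT_SparseArray.
is(svt2, "SVT_SparseArray")
```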

Most of the standard matrix and array API defined in base R should work on SparseArray derivatives, including dim(), length(), dimnames(), `dimnames<-`(), `[` (subsetting), drop(), `[<-` (subassignment), t(), rbind(), cbind(), etc.
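
A minimal sketch of a few of these operations on a small SparseArray derivative follows. The matrix m0 and its values are made up for illustration.

```r
library(SparseArray)

## Toy 3 x 4 integer matrix with two nonzero elements (illustrative only).
m0 <- matrix(0L, nrow=3, ncol=4)
m0[2, 1] <- 7L
m0[3, 4] <- 9L
svt <- SparseArray(m0)

dim(svt)
length(svt)

## Set dimnames, then transpose, subset, and combine as with a base matrix.
dimnames(svt) <- list(letters[1:3], LETTERS[1:4])
t(svt)
svt[2:3, ]
rbind(svt, svt)
```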

SparseArray derivatives also support type(), `type<-`(), is_sparse(), nzcount(), nzwhich(), nzvals(), `nzvals<-`(), sparsity(), arbind(), and acbind().

sparsity(x) returns the ratio between the number of zero-valued elements in array-like object x and its total number of elements (length(x) or prod(dim(x))). More precisely, sparsity(x) is 1 - nzcount(x)/length(x).
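
The relationship between sparsity(), nzcount(), and length() can be checked on a toy example. The matrix below, with 3 nonzero elements out of 20, is made up for illustration.

```r
library(SparseArray)

## Toy 4 x 5 matrix with 3 nonzero elements out of 20 (illustrative only).
m0 <- matrix(0, nrow=4, ncol=5)
m0[c(1, 7, 20)] <- c(2.5, -1, 4)
svt <- SparseArray(m0)

nzcount(svt)
sparsity(svt)

## Same quantity computed from its definition:
1 - nzcount(svt) / length(svt)
```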

Value

A SparseArray derivative, that is, an SVT_SparseArray, COO_SparseArray, SVT_SparseMatrix, or COO_SparseMatrix object.

The type() of the input object is preserved, unless a different one was requested via the type argument.

What is considered a zero depends on the type():

  • "logical" zero is FALSE;

  • "integer" zero is 0L;

  • "double" zero is 0;

  • "complex" zero is 0+0i;

  • "raw" zero is raw(1);

  • "character" zero is "" (empty string);

  • "list" zero is NULL.
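
To illustrate how the notion of zero follows the type(), the sketch below counts the nonzero elements of a logical and a character matrix. The inputs lgl and chr are made up for this example, which assumes that ordinary character matrices are accepted by the constructor, in line with the supported types listed under Arguments.

```r
library(SparseArray)

## Toy inputs; names and values are illustrative only.
lgl <- matrix(c(FALSE, TRUE, FALSE, FALSE), nrow=2)
chr <- matrix(c("", "a", "", "bc"), nrow=2)

svt_lgl <- SparseArray(lgl)
svt_chr <- SparseArray(chr)

## FALSE is the zero for type "logical", "" for type "character".
type(svt_lgl)
nzcount(svt_lgl)
type(svt_chr)
nzcount(svt_chr)
```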

See Also

  • The COO_SparseArray and SVT_SparseArray classes.

  • SparseArray_aperm for permuting the dimensions of a SparseArray object (e.g. transposition).

  • SparseArray_subsetting for subsetting a SparseArray object.

  • SparseArray_subassignment for SparseArray subassignment.

  • SparseArray_abind for combining 2D or multidimensional SparseArray objects.

  • SparseArray_summarization for SparseArray summarization methods.

  • SparseArray_Ops for operations from the Ops group on SparseArray objects.

  • SparseArray_Math for operations from the Math and Math2 groups on SparseArray objects.

  • SparseArray_Complex for operations from the Complex group on SparseArray objects.

  • SparseArray_misc for miscellaneous operations on a SparseArray object.

  • SparseArray_matrixStats for col/row summarization methods for SparseArray objects.

  • rowsum_methods for rowsum() methods for sparse matrices.

  • SparseMatrix_mult for SparseMatrix multiplication and cross-product.

  • randomSparseArray to generate a random SparseArray object.

  • readSparseCSV to read/write a sparse matrix from/to a CSV (comma-separated values) file.

  • S4 classes dgCMatrix, dgRMatrix, and lgCMatrix defined in the Matrix package, which are the de facto standard sparse matrix representations in the R ecosystem.

  • is_sparse in the S4Arrays package.

  • The Array class defined in the S4Arrays package.

  • Ordinary array objects in base R.

  • base::which in base R.

Examples

## ---------------------------------------------------------------------
## Display details of class definition & known subclasses
## ---------------------------------------------------------------------

library(SparseArray)

showClass("SparseArray")

## ---------------------------------------------------------------------
## The SparseArray() constructor
## ---------------------------------------------------------------------

a <- array(rpois(9e6, lambda=0.3), dim=c(500, 3000, 6))
SparseArray(a)    # an SVT_SparseArray object

m <- matrix(rpois(9e6, lambda=0.3), ncol=500)
SparseArray(m)    # an SVT_SparseMatrix object

dgc <- sparseMatrix(i=c(4:1, 2:4, 9:12, 11:9), j=c(1:7, 1:7),
                    x=runif(14), dims=c(12, 7))
class(dgc)
SparseArray(dgc)  # an SVT_SparseMatrix object

dgr <- as(dgc, "RsparseMatrix")
class(dgr)
SparseArray(dgr)  # a COO_SparseMatrix object

## ---------------------------------------------------------------------
## nzcount(), nzwhich(), nzvals(), `nzvals<-`()
## ---------------------------------------------------------------------
x <- SparseArray(a)

## Get the number of nonzero array elements in 'x':
nzcount(x)

## nzwhich() returns the indices of the nonzero array elements in 'x'.
## Either as an integer (or numeric) vector of length 'nzcount(x)'
## containing "linear indices":
nzidx <- nzwhich(x)
length(nzidx)
head(nzidx)

## Or as an integer matrix with 'nzcount(x)' rows and one column per
## dimension where the rows represent "array indices" (a.k.a. "array
## coordinates"):
Mnzidx <- nzwhich(x, arr.ind=TRUE)
dim(Mnzidx)

## Each row in the matrix is an n-tuple representing the "array
## coordinates" of a nonzero element in 'x':
head(Mnzidx)
tail(Mnzidx)

## Extract the values of the nonzero array elements in 'x' and return
## them in a vector "parallel" to 'nzwhich(x)':
x_nzvals <- nzvals(x)  # equivalent to 'x[nzwhich(x)]'
length(x_nzvals)
head(x_nzvals)

nzvals(x) <- log1p(nzvals(x))
x

## Sanity checks:
stopifnot(identical(nzidx, which(a != 0)))
stopifnot(identical(Mnzidx, which(a != 0, arr.ind=TRUE, useNames=FALSE)))
stopifnot(identical(x_nzvals, a[nzidx]))
stopifnot(identical(x_nzvals, a[Mnzidx]))
stopifnot(identical(`nzvals<-`(x, nzvals(x)), x))

Bioconductor/SparseArray documentation built on Aug. 9, 2024, 6:38 p.m.