The SparseAssays virtual class and its methods provide a formal abstraction of the sparseAssays slot of SparseSummarizedExperiment and RangedSparseSummarizedExperiment objects.
SimpleListSparseAssays and SimpleListJointSparseAssays (not yet implemented) are concrete subclasses of SparseAssays with the former being currently the default implementation of SparseAssays objects. Other implementations (e.g. disk-based, environment-based) could easily be added.
Note that these classes are not meant to be used directly by the end-user and the material in this man page is aimed at package developers.
## Constructor
SparseAssays(sparse_assays = SimpleList(), subclass)
## Accessors
## S4 method for signature 'SparseAssays'
length(x)
## S4 method for signature 'SparseAssays'
NROW(x)
## S4 method for signature 'SparseAssays'
names(x)
## S4 replacement method for signature 'SparseAssays'
names(x) <- value
## S4 method for signature 'SparseAssays,ANY,ANY'
x[[i, j, ...]]
## S4 replacement method for signature 'SparseAssays,ANY,ANY'
x[[i, j, ...]] <- value
## Densify a SparseAssays object
densify(x, i, j, ..., withRownames = TRUE)
## Apply a function to a SparseAssays object
SAapply(X, FUN, densify = TRUE, sparsify = !densify,
withRownames = TRUE, ...,
BPREDO = list(), BPPARAM = bpparam())
sparse_assays |
A SimpleList or list that can be used to construct a SparseAssays instance; see ‘Examples’. |
subclass |
The concrete subclass to be instantiated. The default is SimpleListSparseAssays. |
x |
A SparseAssays object. |
i, j |
For For |
value |
An object of a class specified in the S4 method signature or as outlined in ‘Details’. |
withRownames |
A For |
X |
A SparseAssays object. |
FUN |
The function to be applied to each element of |
densify |
A |
sparsify |
A |
... |
Optional arguments to |
BPREDO, BPPARAM |
See |
SparseAssays objects have list-like semantics, with each element containing key and value elements.
The SparseAssays API consists of:
(a) The SparseAssays() constructor function.
(b) Lossless back and forth coercion from/to SimpleList. The coercion method from SimpleList doesn't need to (and should not) validate the returned object.
(c) length, NROW, names, names<-, [[, [[<-.
(d) dim, dimnames, [, [<-, rbind, cbind, combine, densify, SAapply.
A SparseAssays concrete subclass needs to implement (b) (required) plus the methods in (d) (required). The methods in (c) are inherited from the SimpleList class. Each element of a SparseAssays object is referred to as a "sparse assay" (lowercase).
IMPORTANT: Methods that return a modified SparseAssays object (a.k.a. endomorphisms), that is, [ as well as replacement methods names<-, [[<-, and [<-, must respect the copy-on-change contract.
With objects that don't make use of references internally, the developer doesn't need to take any special action because R itself takes care of it automatically. However, for objects that do make use of references internally (e.g. environments, external pointers, pointers to a file on disk, etc.), the developer needs to be careful to implement endomorphisms with copy-on-change semantics. This can easily be achieved by performing a full (deep) copy of the object before modifying it instead of trying to modify it in-place. Note that the full (deep) copy is not always necessary in order to achieve copy-on-change semantics: it's enough (and often preferable for performance reasons) to copy only the parts of the object that need to be modified.
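The copy-on-change contract described above can be illustrated with a small base R sketch. Everything here is hypothetical and for illustration only: new_store, get_values, set_inplace and set_copy are not part of the SparseAssays API, and the environment-backed container merely stands in for any implementation that uses references internally.

```r
# Hypothetical reference-backed container (illustration only; not part of
# the SparseAssays API). The data live in an environment, so copies of the
# object share the same underlying storage.
new_store <- function(values) {
  env <- new.env(parent = emptyenv())
  assign("values", values, envir = env)
  structure(list(env = env), class = "env_store")
}

get_values <- function(x) get("values", envir = x$env)

# Breaks the contract: writes into the shared environment, so every object
# pointing at that environment sees the change.
set_inplace <- function(x, i, value) {
  vals <- get_values(x)
  vals[i] <- value
  assign("values", vals, envir = x$env)
  x
}

# Respects the contract: copy the part that will change, modify the copy,
# and point the returned object at the copy.
set_copy <- function(x, i, value) {
  vals <- get_values(x)
  vals[i] <- value
  new_env <- new.env(parent = emptyenv())
  assign("values", vals, envir = new_env)
  x$env <- new_env
  x
}

a <- new_store(c(1, 2, 3))
b <- set_copy(a, 1, 100)
get_values(a)  # a is untouched by the modification that produced b
get_values(b)

d <- set_inplace(a, 1, -1)
get_values(a)  # a has silently changed along with d
```

In this sketch only the environment holding the modified values is copied, which matches the remark that copying the parts that change is sufficient.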
SparseAssays currently has one implementation, formalized by the concrete subclass SimpleListSparseAssays. Written specs exist for a second implementation, SimpleListJointSparseAssays, but it is not yet implemented.
The sparseAssays slot of a SparseSummarizedExperiment object contains an instance of SimpleListSparseAssays.
NOTE: SparseAssays only pay off compared to SummarizedExperiment::Assays when you have more than one measurement per feature, per sample. The payoff is greater when there are lots of features with the same measurement (normally within a sample, although SimpleListJointSparseAssays should allow this constraint to be removed) and/or lots of NAs per sample.
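To make the payoff concrete, here is a minimal base R sketch of a key/value encoding for one sample. It assumes (this is an assumption for illustration, not a statement of the package internals) that the key is an integer vector with one entry per feature pointing at a row of the value matrix, with NA marking features whose measurements are all NA. The helper name sparsify_sketch is hypothetical.

```r
# One sample: 6 features, 2 measurements per feature. Several features share
# the same measurements and two features are entirely NA.
dense <- rbind(c(1, 10), c(NA, NA), c(1, 10), c(2, 20), c(NA, NA), c(1, 10))

# Hypothetical helper: store each distinct non-NA row once in `value` and
# record, per feature, which row of `value` it maps to in `key`.
sparsify_sketch <- function(dense) {
  all_na <- rowSums(!is.na(dense)) == 0
  row_id <- apply(dense, 1, paste, collapse = "\r")
  keep <- !all_na & !duplicated(row_id)
  value <- dense[keep, , drop = FALSE]
  key <- match(row_id, row_id[keep])  # all-NA rows have no match, so key is NA
  list(key = key, value = value)
}

sp <- sparsify_sketch(dense)
sp$key         # one integer per feature
nrow(sp$value) # only the distinct non-NA rows are stored

# Expanding the key recovers the dense data; NA keys give all-NA rows.
expanded <- sp$value[sp$key, , drop = FALSE]
identical(expanded, dense)
```

With a single measurement per feature the key costs about as much as the data it replaces, which is why the benefit only appears with several measurements per feature, many repeated rows, or many NAs.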
The dimensions of a SparseAssays object are defined by nrow = length of features (usually the length of the key), and ncol = number of samples.
Subsetting with [ uses i to subset rows/features in each sparse assay and j to subset samples in each sparse assay.
NOTE: Use [[ with i to select the i-th sparse assay.
SparseAssays objects can be combined in three different ways.
rbind: Suitable when each object has the same samples.
cbind: Suitable when each object has unique samples.
combine: Suitable in either case; however, it requires that dimnames are set on each object and that all objects have an identical number of sparse assays with identical names.
SparseAssays objects can be densified (expanded) using the densify() method. For each sample, the densified data for a single sparse assay is returned as a matrix. Therefore, the densify generic returns a SimpleList of length = length(i), each element of which is a SimpleList of length = length(j), each element of which is a matrix of the densified data for that sample in that sparse assay.
WARNING: It is generally advisable not to simultaneously densify all sparse assays in all samples, since the entire point of using SparseAssays is to use a more memory-efficient storage of the data. Therefore, users must provide at least one of i (to select sparse assays) and j (to select samples). If you really wish to simultaneously densify all sparse assays and samples, then use densify(x, seq_along(x), seq_len(ncol(x))). If i (resp. j) is missing then effectively i = seq_along(x) (resp. j = seq_len(ncol(x))).
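The nested shape of the densified result (sparse assays at the first level, samples at the second, one matrix per sample) can be sketched in base R. This sketch uses plain lists in place of SimpleList objects and assumes the same key/value layout as an illustration (an integer key per feature indexing rows of a value matrix, NA for all-NA features). densify_sketch is a hypothetical stand-in, not the package's densify method, and it does not enforce the requirement to supply i or j.

```r
# Two sparse assays (sa1, sa2), each with two samples (s1, s2) and 3 features.
x <- list(
  sa1 = list(
    s1 = list(key = c(1L, NA, 2L), value = matrix(c(1, 2, 10, 20), ncol = 2)),
    s2 = list(key = c(NA, 1L, 1L), value = matrix(c(5, 50), ncol = 2))
  ),
  sa2 = list(
    s1 = list(key = c(1L, 1L, 1L), value = matrix(7, ncol = 1)),
    s2 = list(key = c(NA, NA, 1L), value = matrix(9, ncol = 1))
  )
)

# Hypothetical stand-in for densify(): i selects sparse assays, j selects
# samples; each selected sample is expanded to a features-by-measurements
# matrix by indexing the value matrix with the key.
densify_sketch <- function(x, i = seq_along(x), j = seq_along(x[[1]])) {
  lapply(x[i], function(sparse_assay) {
    lapply(sparse_assay[j], function(sample) {
      sample$value[sample$key, , drop = FALSE]
    })
  })
}

# Densify only the first sparse assay in the second sample.
dense <- densify_sketch(x, i = 1, j = 2)
names(dense)      # the selected sparse assay
names(dense$sa1)  # the selected sample
dense$sa1$s2      # a 3 x 2 matrix; the first feature is all NA
```

Selecting a single sparse assay and sample keeps only one expanded matrix in memory at a time, which is the usage pattern the warning above encourages.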
SparseAssays objects can be coerced into a ShallowSimpleListAssays object (from the SummarizedExperiment package); this will also densify the object. This can be done using as(x, "ShallowSimpleListAssays"), where x is a SparseAssays object. WARNING: The resulting ShallowSimpleListAssays object will typically require much more memory than the equivalent SparseAssays object.
A common use case is to apply a function to a SparseAssays object. For example, we might wish to compute the column-wise mean(s) for each sample in a sparse assay. SAapply is designed to do this in an efficient manner with an interface that is modelled on the lapply functional in base R.
SAapply takes a SparseAssays object (X) and applies a single function (FUN) to each sample in each sparse assay. It is worth emphasising that this means that the same function is applied to all samples and sparse assays in X (use sparseAssay() with the i argument to extract specific sparse assays).
While it is desirable to apply FUN to the data in its sparse form, this is not always possible and the data may need to be densified prior to FUN being applied. The SAapply method simplifies this process in two ways:
SAapply allows the user to pass a function, FUN, that works on sparse or dense data. The densify argument specifies whether the data need to be densified prior to FUN being applied.
If the data need to be densified, then SAapply does this in a memory-efficient manner. For example, it will serially densify each sample in each sparse assay and apply FUN before moving on to the next sample's data (this is appropriately generalised if the user specifies a non-serial BiocParallelParam backend via the BPPARAM argument).
Parallelisation is implemented via the BiocParallel package. Please consult its documentation for further details on parallelisation options, in particular the ?BiocParallelParam help page.
Finally, the sparsify argument determines the class of the return value of SAapply(). If sparsify = FALSE, the return value is a nested list where the first level is the sparse assay and the second level is the sample-level data as a dense matrix. If sparsify = TRUE, the return value is a SparseAssays object with the same concrete subclass as X. By default, sparsify = !densify, that is, sparse data will remain sparse and densified data will remain densified. The use of sparsify = TRUE allows the output of SAapply() to be used as the value in a call to sparseAssays(x) <- value; see ‘Examples’.
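The serial "densify one sample, apply FUN, move on" strategy described above can be sketched in base R. This is only an illustration under assumptions: plain lists stand in for a SparseAssays object, each sample is assumed to hold an integer key indexing rows of a value matrix, and saapply_sketch is a hypothetical function that covers only the densify switch (it has no sparsify, BPREDO or BPPARAM handling).

```r
# One sparse assay with two samples and three features (plain-list stand-in).
x <- list(
  sa1 = list(
    s1 = list(key = c(1L, NA, 2L), value = matrix(c(1, 2, 10, 20), ncol = 2)),
    s2 = list(key = c(NA, 1L, 1L), value = matrix(c(5, 50), ncol = 2))
  )
)

# Hypothetical sketch: apply FUN to every sample in every sparse assay.
# With densify = TRUE each sample is expanded just before FUN is called, so
# at most one densified sample exists at a time; with densify = FALSE, FUN
# receives the sparse (key, value) representation directly.
saapply_sketch <- function(X, FUN, densify = TRUE, ...) {
  lapply(X, function(sparse_assay) {
    lapply(sparse_assay, function(sample) {
      data <- if (densify) {
        sample$value[sample$key, , drop = FALSE]
      } else {
        sample
      }
      FUN(data, ...)
    })
  })
}

# Column-wise means per sample, computed on the densified data.
col_means <- saapply_sketch(x, colMeans, na.rm = TRUE)
col_means$sa1$s1

# A function that works on the sparse form: count the distinct stored rows.
n_unique <- saapply_sketch(x, function(s) nrow(s$value), densify = FALSE)
n_unique$sa1$s2
```

The result is a nested list (sparse assay, then sample), mirroring the sparsify = FALSE return shape described above.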
NOTE: The generic is called SAapply rather than saapply to reduce the confusion/typo-rate with sapply.
Peter Hickey, peter.hickey@gmail.com
SimpleListSparseAssays objects, the current default concrete subclass of the SparseAssays virtual class.
SparseSummarizedExperiment objects, which use a SparseAssays object in the sparseAssays slot.
# See ?SimpleListSparseAssays