containsOutOfMemoryData: Does an object contain out-of-memory data?

containsOutOfMemoryData    R Documentation

Does an object contain out-of-memory data?

Description

Some objects in Bioconductor can use on-disk or other out-of-memory representations for their data, typically (but not necessarily) when the data is too big to fit in memory. For example, the data in a TxDb object is stored in an SQLite database, and the data in an HDF5Array object is stored in an HDF5 file.

The containsOutOfMemoryData() function determines whether an object contains out-of-memory data or not.

Note that objects with out-of-memory data are usually not compatible with a serialization/unserialization roundtrip. More concretely, base::saveRDS()/base::readRDS() tend to silently break them!

See ?saveHDF5SummarizedExperiment in the HDF5Array package for a more extensive discussion about this.
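
As a minimal sketch of how the check can protect against the serialization problem described above, the helper below refuses to call saveRDS() on an object that reports out-of-memory data. The name saveRDSIfInMemory() is hypothetical and not part of BiocGenerics; the sketch assumes an installed BiocGenerics version that provides containsOutOfMemoryData().

```r
library(BiocGenerics)

## Hypothetical helper (NOT part of BiocGenerics): serialize 'object'
## with saveRDS() only if all of its data resides in memory.
saveRDSIfInMemory <- function(object, file) {
    if (containsOutOfMemoryData(object))
        stop("object contains out-of-memory data; ",
             "use a dedicated saving function instead of saveRDS()")
    saveRDS(object, file)
    invisible(file)
}

## An ordinary matrix lives entirely in memory, so it is saved as usual.
m <- matrix(runif(6), nrow=2)
path <- tempfile(fileext=".rds")
saveRDSIfInMemory(m, path)
identical(readRDS(path), m)
```

For objects that fail the check, a format-specific saving function such as the one documented in ?saveHDF5SummarizedExperiment is the safer route.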

Usage

containsOutOfMemoryData(object)

Arguments

object

The object to be tested.

Details

An object can store some of its data on disk and some of it in memory. This is the case for example when a SummarizedExperiment object (or derivative) has some of its assays on disk (e.g. in HDF5Matrix objects) and others in memory (e.g. in ordinary matrices and/or SparseMatrix objects).

In this case, containsOutOfMemoryData() still returns TRUE. In other words, containsOutOfMemoryData(object) only returns FALSE when all the data in object resides in memory, that is, when the object can safely be serialized.

Value

TRUE or FALSE.

Note

TO DEVELOPERS:

The BiocGenerics package also defines the following:

  • A default containsOutOfMemoryData() method that returns TRUE if object is an S4 object with at least one slot for which containsOutOfMemoryData() is TRUE (recursive definition), and FALSE otherwise.

  • A containsOutOfMemoryData() method for list objects that returns TRUE if object has at least one list element for which containsOutOfMemoryData() is TRUE (recursive definition), and FALSE otherwise.

  • A containsOutOfMemoryData() method for environment objects that returns TRUE if object contains at least one object for which containsOutOfMemoryData() is TRUE (recursive definition), and FALSE otherwise.

  • The OutOfMemoryObject class. This is a virtual S4 class with no slots, which any Bioconductor class that represents out-of-memory objects should extend.

  • A containsOutOfMemoryData() method for OutOfMemoryObject derivatives that returns TRUE.

Therefore, if you implement a class that uses an out-of-memory representation, make sure that it contains the OutOfMemoryObject class. This will make containsOutOfMemoryData() return TRUE on your objects, so you don't need to define a containsOutOfMemoryData() method for them.
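
The advice above can be sketched as follows. The classes MyOnDiskVector and MyContainer, the filepath and payload slots, and the file path are all hypothetical and exist only for illustration; no file is actually read. The expected results in the comments follow from the method descriptions in this note.

```r
library(BiocGenerics)

## Hypothetical class: pretends that its data lives in the file
## identified by the 'filepath' slot.
setClass("MyOnDiskVector",
    contains="OutOfMemoryObject",
    slots=c(filepath="character")
)

## No containsOutOfMemoryData() method is defined for MyOnDiskVector;
## the method for OutOfMemoryObject derivatives is inherited.
x <- new("MyOnDiskVector", filepath="path/to/data.bin")
containsOutOfMemoryData(x)  # TRUE

## Hypothetical S4 container: the default method inspects its slots.
setClass("MyContainer", slots=c(payload="ANY"))
containsOutOfMemoryData(new("MyContainer", payload=x))    # TRUE
containsOutOfMemoryData(new("MyContainer", payload=1:3))  # FALSE

## The list and environment methods recurse in the same way.
containsOutOfMemoryData(list(1:3, x))  # TRUE
e <- new.env()
e$x <- x
containsOutOfMemoryData(e)  # TRUE
```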

See Also

  • showMethods for displaying a summary of the methods defined for a given generic function.

  • selectMethod for getting the definition of a specific method.

  • BiocGenerics for a summary of all the generics defined in the BiocGenerics package.

Examples

containsOutOfMemoryData
showMethods("containsOutOfMemoryData")

## The default method:
selectMethod("containsOutOfMemoryData", "ANY")

## The method for list objects:
selectMethod("containsOutOfMemoryData", "list")

## The method for OutOfMemoryObject derivatives:
selectMethod("containsOutOfMemoryData", "OutOfMemoryObject")

m <- matrix(0, nrow=7, ncol=10)
m[sample(length(m), 20)] <- runif(20)
containsOutOfMemoryData(m)  # FALSE

library(SparseArray)
svt <- as(m, "SparseArray")
svt
containsOutOfMemoryData(svt)  # FALSE
containsOutOfMemoryData(list(m, svt))  # FALSE

library(HDF5Array)
M <- as(m, "HDF5Array")
M
containsOutOfMemoryData(M)  # TRUE
containsOutOfMemoryData(list(m, svt, M))  # TRUE

Bioconductor/BiocGenerics documentation built on Nov. 17, 2024, 6:52 p.m.