DelayedArray-utils: Common operations on DelayedArray objects

DelayedArray-utilsR Documentation

Common operations on DelayedArray objects

Description

Common operations on DelayedArray objects.

Details

The operations currently supported on DelayedArray objects are:

Delayed operations:

  • rbind and cbind

  • all the members of the Ops, Math, and Math2 groups

  • !

  • is.na, is.finite, is.infinite, is.nan

  • type<-

  • lengths

  • nchar, tolower, toupper, grepl, sub, gsub

  • pmax2 and pmin2

  • sweep

  • scale (when the supplied center and scale are not TRUE)

  • paste2, add_prefix, add_suffix

  • statistical functions like dnorm, dbinom, dpois, and dlogis (for the Normal, Binomial, Poisson, and Logistic distribution, respectively) and related functions (documented in DelayedArray-stats)

Block-processed operations:

  • anyNA, which, nzwhich

  • unique, table

  • all the members of the Summary group

  • mean

  • apply

Mix delayed and block-processed operations:

  • scale (when the supplied center and/or scale are TRUE)

See Also

  • cbind in the base package for rbind/cbind'ing ordinary arrays.

  • arbind and acbind in this package (DelayedArray) for binding ordinary arrays of arbitrary dimensions along their rows or columns.

  • is.na, !, table, mean, apply, and %*% in the base package for the corresponding operations on ordinary arrays or matrices.

  • DelayedArray-stats for statistical functions on DelayedArray objects.

  • matrixStats-methods for DelayedMatrix row/col summarization.

  • DelayedArray objects.

  • HDF5Array objects in the HDF5Array package.

  • S4groupGeneric in the methods package for the members of the Ops, Math, and Math2 groups.

  • sweep and scale in the base package.

  • paste2 in the BiocGenerics package.

Examples

## ---------------------------------------------------------------------
## BIND DelayedArray OBJECTS
## ---------------------------------------------------------------------
## DelayedArray objects can be bound along their 1st (rows) or 2nd
## (columns) dimension with rbind() or cbind(). These operations are
## equivalent to arbind() and acbind(), respectively, and are all
## delayed.

## On 2D objects:
library(HDF5Array)
toy_h5 <- system.file("extdata", "toy.h5", package="HDF5Array")
h5ls(toy_h5)

M1 <- HDF5Array(toy_h5, "M1")
M2 <- HDF5Array(toy_h5, "M2")

M12 <- rbind(M1, t(M2))            # delayed
M12
colMeans(M12)                      # block-processed

## On objects with more than 2 dimensions:
example(arbind)  # to create arrays a1, a2, a3

A1 <- DelayedArray(a1)
A2 <- DelayedArray(a2)
A3 <- DelayedArray(a3)
A123 <- rbind(A1, A2, A3)          # delayed
A123

## On 1D objects:
v1 <- array(11:15, 5, dimnames=list(LETTERS[1:5]))
v2 <- array(letters[1:3])
V1 <- DelayedArray(v1)
V2 <- DelayedArray(v2)
V12 <- rbind(V1, V2)
V12

## Not run: cbind(V1, V2)  # Error! (the objects to cbind() must have at least 2
               # dimensions)

## End(Not run)

## Note that base::rbind() and base::cbind() do something completely
## different on ordinary arrays that are not matrices. They treat them
## as if they were vectors:
rbind(a1, a2, a3)
cbind(a1, a2, a3)
rbind(v1, v2)
cbind(v1, v2)

## Also note that DelayedArray objects of arbitrary dimensions can be
## stored inside a DataFrame object as long as they all have the same
## first dimension (nrow()):
DF <- DataFrame(M=I(tail(M1, n=5)), A=I(A3), V=I(V1))
DF[-3, ]
DF2 <- rbind(DF, DF)
DF2$V

## Sanity checks:
m1 <- as.matrix(M1)
m2 <- as.matrix(M2)
stopifnot(identical(rbind(m1, t(m2)), as.matrix(M12)))
stopifnot(identical(arbind(a1, a2, a3), as.array(A123)))
stopifnot(identical(arbind(v1, v2), as.array(V12)))
stopifnot(identical(rbind(DF$M, DF$M), DF2$M))
stopifnot(identical(rbind(DF$A, DF$A), DF2$A))
stopifnot(identical(rbind(DF$V, DF$V), DF2$V))

## ---------------------------------------------------------------------
## MORE OPERATIONS
## ---------------------------------------------------------------------

M1 >= 0.5 & M1 < 0.75              # delayed
log(M1)                            # delayed
pmax2(M2, 0)                       # delayed

type(M2) <- "integer"              # delayed
M2

## table() is block-processed:
a4 <- array(sample(50L, 2000000L, replace=TRUE), c(200, 4, 2500))
A4 <- as(a4, "HDF5Array")
table(A4)
a5 <- array(sample(20L, 2000000L, replace=TRUE), c(200, 4, 2500))
A5 <- as(a5, "HDF5Array")
table(A5)

A4 - 2 * A5                        # delayed
table(A4 - 2 * A5)                 # block-processed

## range() is block-processed:
range(A4 - 2 * A5)
range(M1)

cmeans <- colMeans(M2)             # block-processed
sweep(M2, 2, cmeans)               # delayed
scale(M2)                          # delayed & block-processed
scale(M2, center=FALSE, scale=10)  # delayed

paste2(A3, letters[1:5])           # delayed
A6 <- add_prefix("ID", A3)         # delayed
add_suffix(A6, ".fasta")           # delayed

Bioconductor/DelayedArray documentation built on Dec. 7, 2024, 10:27 p.m.