scale: Scale / normalize an HDF5Matrix

View source: R/S3_aggregations.R

scale    R Documentation

Scale / normalize an HDF5Matrix

Description

Block-wise centering and scaling equivalent to base R scale(). The data are read from and written to the HDF5 file one block at a time, so the full matrix is never loaded into RAM.
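To make the block-wise idea concrete, the following is a minimal base R sketch, not the package's implementation. It assumes a two-pass scheme (accumulate column sums and sums of squares, then transform each block), uses an in-memory matrix as a stand-in for the on-disk dataset, and introduces a hypothetical block size `block_rows` that is unrelated to how wsize is chosen internally.

```r
# Illustrative only: an in-memory matrix stands in for the on-disk dataset.
set.seed(1)
m <- matrix(rnorm(500), nrow = 50, ncol = 10)

# Hypothetical block size; split the row indices into consecutive blocks.
block_rows <- 8
blocks <- split(seq_len(nrow(m)), ceiling(seq_len(nrow(m)) / block_rows))

# Pass 1: accumulate column sums and sums of squares one block at a time.
col_sum   <- numeric(ncol(m))
col_sumsq <- numeric(ncol(m))
for (idx in blocks) {
  blk <- m[idx, , drop = FALSE]
  col_sum   <- col_sum + colSums(blk)
  col_sumsq <- col_sumsq + colSums(blk^2)
}
n <- nrow(m)
col_mean <- col_sum / n
# Sample standard deviation (divisor n - 1), as used by base::scale().
col_sd <- sqrt((col_sumsq - n * col_mean^2) / (n - 1))

# Pass 2: center and scale each block, writing it into the output.
out <- matrix(NA_real_, nrow(m), ncol(m))
for (idx in blocks) {
  blk <- m[idx, , drop = FALSE]
  out[idx, ] <- sweep(sweep(blk, 2, col_mean, "-"), 2, col_sd, "/")
}

# The block-wise result agrees with the in-memory reference.
ref <- scale(m)
all.equal(out, ref, check.attributes = FALSE)
```

Only one block plus two length-ncol accumulators are held at any time, which is why this style of computation suits matrices that do not fit in RAM.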

Usage

## S3 method for class 'HDF5Matrix'
scale(
  x,
  center = TRUE,
  scale = TRUE,
  byrows = FALSE,
  wsize = NULL,
  result_path = NULL,
  compression = NULL,
  ...
)

Arguments

x

An HDF5Matrix object.

center

Logical (or numeric vector, see Details). If TRUE (default), subtract column means (row means when byrows = TRUE) before scaling.

scale

Logical (or numeric vector, see Details). If TRUE (default), divide by column standard deviations (row standard deviations when byrows = TRUE).

byrows

Logical. If TRUE normalize row-wise instead of column-wise. Default FALSE.

wsize

Integer or NULL. Block size for HDF5 reads (NULL = auto).

result_path

Output location. NULL (default) writes to "NORMALIZED/<group>/<dataset>" in the same file. A character string writes to that path in the same file. A named list list(file=, path=) writes to a different file.

compression

Integer (0-9) or NULL. gzip compression level for the result datasets. NULL uses the global option set by hdf5matrix_options (default 6). Use 0 to disable compression (faster for benchmarks).

...

Ignored (for S3 compatibility).
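As an illustration of the byrows argument, the base R sketch below contrasts column-wise and row-wise normalization on a small in-memory matrix. It assumes that byrows = TRUE corresponds to the usual transpose idiom t(scale(t(m))); this page does not state how the method implements it on disk.

```r
set.seed(2)
m <- matrix(rnorm(12), nrow = 3, ncol = 4)

# Column-wise (the default, byrows = FALSE): each column gets mean 0, SD 1.
by_cols <- scale(m)

# Row-wise (assumed analogue of byrows = TRUE): each row gets mean 0, SD 1.
by_rows <- t(scale(t(m)))

colMeans(by_cols)       # numerically zero
rowMeans(by_rows)       # numerically zero
apply(by_rows, 1, sd)   # all ones
```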

Details

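Passing a pre-computed numeric vector as center or scale is not currently supported. If a vector is supplied, it is coerced to a logical (TRUE if the vector has non-zero length) and a warning is issued, so the means or standard deviations are computed from the data rather than taken from the vector.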
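The practical consequence of this coercion can be seen with base R on an in-memory matrix. The sketch below is only an analogy: it does not call the HDF5Matrix method or reproduce its warning, and the vector `ctr` is an arbitrary example.

```r
set.seed(3)
m <- matrix(rnorm(20, mean = 5), nrow = 5, ncol = 4)
ctr <- c(1, 2, 3, 4)   # arbitrary pre-computed centers

# base::scale() subtracts the supplied vector from the columns.
base_result <- scale(m, center = ctr, scale = FALSE)

# The documented coercion: a non-empty vector becomes TRUE ...
coerced <- length(ctr) > 0

# ... so the outcome is the same as centering on the actual column means.
as_if_true <- scale(m, center = coerced, scale = FALSE)

attr(base_result, "scaled:center")   # the supplied vector
attr(as_if_true, "scaled:center")    # the column means of m
```

Code that relies on base R's handling of numeric center or scale vectors will therefore get different values from this method.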

The returned HDF5Matrix carries scaled:center and scaled:scale attributes (numeric vectors), mirroring the behavior of base::scale().
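For reference, this is how the two attributes behave in base::scale() on an ordinary matrix, including how they can be used to undo the transformation. The sketch uses base R only; whether the same back-transformation can be applied directly to an HDF5Matrix is not covered on this page.

```r
set.seed(4)
m <- matrix(rnorm(500), nrow = 50, ncol = 10)
s <- scale(m)

ctr <- attr(s, "scaled:center")   # column means used for centering
scl <- attr(s, "scaled:scale")    # column standard deviations used for scaling

all.equal(ctr, colMeans(m))
all.equal(scl, apply(m, 2, sd))

# Undo the normalization: multiply by the scales, then add the centers back.
restored <- sweep(sweep(s, 2, scl, "*"), 2, ctr, "+")
all.equal(restored, m, check.attributes = FALSE)
```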

Value

An HDF5Matrix pointing to the normalized dataset on disk.

Examples


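# Create a temporary HDF5 file holding a 50 x 10 matrix
tmp <- tempfile(fileext = ".h5")
X   <- hdf5_create_matrix(tmp, "data/M",
                           data = matrix(rnorm(500), 50, 10))
# Center and scale column-wise (center = TRUE, scale = TRUE, byrows = FALSE)
Xs  <- scale(X)
# Print the first stored column mean
cat("scaled:center[1]:", attr(Xs, "scaled:center")[1], "\n")
# Close both handles and remove the temporary file
X$close(); Xs$close(); unlink(tmp)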



BigDataStatMeth documentation built on May 15, 2026, 1:07 a.m.