normalize: Normalize raw counts data

View source: R/preprocess.R

normalize    R Documentation

Normalize raw counts data

Description

Perform library size normalization on raw counts input. Because this is a preprocessing step for iNMF integration, by default the normalized values are neither multiplied by a scale factor nor log-transformed. The applicable S3 methods are listed in the Usage section.

normalizePeak is designed for datasets of the "atac" modality, i.e. those stored as ligerATACDataset. S3 methods for other container objects are not supported yet because of differences in architecture design.
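As a rough illustration of what library size normalization means, the sketch below divides each cell's counts by that cell's total in base R. It is not the package's implementation: the toy matrix, the variable names, and the features-by-cells orientation (cells in columns) are assumptions made for this example.

```r
# Toy raw counts. Assumption: rows are features (genes), columns are cells.
counts <- matrix(
  c(1, 0, 3,
    4, 2, 0,
    5, 8, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Library size: total counts observed in each cell (column sums).
libSize <- colSums(counts)

# Divide every column by its library size, so each cell sums to 1.
normed <- sweep(counts, 2, libSize, "/")
```

With no scale factor and no log transform, every column of `normed` sums to 1, which matches the default behavior described above.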

Usage

normalize(object, ...)

## S3 method for class 'matrix'
normalize(object, log = FALSE, scaleFactor = NULL, ...)

## S3 method for class 'dgCMatrix'
normalize(object, log = FALSE, scaleFactor = NULL, ...)

## S3 method for class 'ligerDataset'
normalize(object, chunk = 1000, verbose = getOption("ligerVerbose", TRUE), ...)

## S3 method for class 'liger'
normalize(
  object,
  useDatasets = NULL,
  verbose = getOption("ligerVerbose", TRUE),
  format.type = NULL,
  remove.missing = NULL,
  ...
)

## S3 method for class 'Seurat'
normalize(object, assay = NULL, layer = "counts", save = "ligerNormData", ...)

normalizePeak(
  object,
  useDatasets = NULL,
  verbose = getOption("ligerVerbose", TRUE),
  ...
)

Arguments

object

liger object, or an object of one of the other supported classes listed in Usage.

...

Arguments to be passed to S3 methods. The "liger" method calls the "ligerDataset" method, which in turn calls the "dgCMatrix" method. normalizePeak directly calls normalize.dgCMatrix.

log

Logical. Whether to apply a log(x + 1) transform to the normalized data. Default FALSE.

scaleFactor

Numeric. Scale the normalized expression value by this factor before transformation. NULL for no scaling. Default NULL.

chunk

Integer. Maximum number of cells in each chunk when working on an HDF5 file based ligerDataset. Default 1000.

verbose

Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if the user has not set it.

useDatasets

A character vector of the names, or a numeric or logical vector of the indices, of the datasets to be normalized. Should specify ATACseq datasets when using normalizePeak. Default NULL normalizes all valid datasets.

format.type, remove.missing

Deprecated. The functionality of these arguments is covered by other parts of the workflow and is no longer needed. Will be ignored if specified.

assay

Name of assay to use. Default NULL uses current active assay.

layer

Where the input raw counts are taken from. Default "counts". For older Seurat versions, counts are always retrieved from the counts slot.

save

For Seurat>=4.9.9, the name of the layer in which to store the normalized data. Default "ligerNormData". For older Seurat versions, the result is stored in the data slot.
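The log and scaleFactor arguments are described as applying in a fixed order: normalize by library size, optionally multiply by the scale factor, then optionally take log(x + 1). A minimal base R sketch of that order follows. The helper name sketchNormalize is hypothetical and is not part of rliger, and the features-by-cells orientation is an assumption.

```r
# Hypothetical helper, NOT the rliger implementation: it only mirrors the
# order of operations described for the `log` and `scaleFactor` arguments.
sketchNormalize <- function(counts, log = FALSE, scaleFactor = NULL) {
  # Step 1: library size normalization (assumes cells are columns).
  normed <- sweep(counts, 2, colSums(counts), "/")
  # Step 2: optional scaling, skipped when scaleFactor is NULL.
  if (!is.null(scaleFactor)) {
    normed <- normed * scaleFactor
  }
  # Step 3: optional log(x + 1) transform, applied after scaling.
  if (isTRUE(log)) {
    normed <- log1p(normed)
  }
  normed
}

counts <- matrix(c(1, 3, 2, 2), nrow = 2)  # 2 features x 2 cells

# Defaults: proportions only, no scaling and no log.
plain <- sketchNormalize(counts)

# Scale to 1e4 per cell, then log-transform.
scaledLog <- sketchNormalize(counts, log = TRUE, scaleFactor = 1e4)
```

Here `plain` holds per-cell proportions, while `scaledLog` equals `log1p(plain * 1e4)`.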

Value

Updated object.

  • dgCMatrix method - Returns processed dgCMatrix object

  • ligerDataset method - Updates the normData slot of the object

  • liger method - Updates the normData slot of chosen datasets

  • Seurat method - Adds a named layer in the chosen assay (V5), or updates the data slot of the chosen assay (<=V4)

  • normalizePeak - Updates the normPeak slot of chosen datasets
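For the ligerDataset method, the chunk argument caps how many cells are handled at once when the data are backed by an HDF5 file. The base R sketch below illustrates the general idea on an ordinary in-memory matrix. It does not read HDF5, the function name normalizeByChunk is hypothetical, and the actual package code may differ.

```r
# Hypothetical illustration of chunked processing, NOT the rliger code.
# Assumption: cells are columns, so each chunk is a block of columns.
normalizeByChunk <- function(counts, chunk = 1000) {
  nCells <- ncol(counts)
  out <- matrix(0, nrow = nrow(counts), ncol = nCells,
                dimnames = dimnames(counts))
  for (start in seq(1, nCells, by = chunk)) {
    end <- min(start + chunk - 1, nCells)
    idx <- start:end
    # Only this block of cells is processed in the current iteration.
    block <- counts[, idx, drop = FALSE]
    out[, idx] <- sweep(block, 2, colSums(block), "/")
  }
  out
}

set.seed(1)
# 5 features x 10 cells; adding 1 keeps every library size above zero.
counts <- matrix(rpois(50, lambda = 5) + 1, nrow = 5)

chunked <- normalizeByChunk(counts, chunk = 3)
whole <- sweep(counts, 2, colSums(counts), "/")
```

Because library size normalization treats each cell independently, processing in chunks gives the same result as processing all cells at once, so `chunked` and `whole` agree.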

Examples

pbmc <- normalize(pbmc)

rliger documentation built on Oct. 30, 2024, 1:07 a.m.