optimizeNewData: Perform factorization for new data

View source: R/rliger.R

optimizeNewData    R Documentation

Perform factorization for new data

Description

Uses an efficient updating strategy that takes advantage of the information in the existing factorization. Assumes that the selected genes (var.genes) are present in each of the new datasets.
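The assumption about var.genes can be checked before calling the function. The following is a minimal base-R sketch, not part of rliger: the helper name missingVarGenes and the toy matrices are hypothetical, and it assumes each new matrix stores genes as row names.

```r
# Hypothetical helper (not part of rliger): for each new dataset, return the
# selected genes that are absent from its row names.
# Assumes genes are stored as the row names of each matrix.
missingVarGenes <- function(var.genes, new.data) {
  lapply(new.data, function(mat) setdiff(var.genes, rownames(mat)))
}

# Toy data, invented purely for illustration
var.genes <- c("geneA", "geneB", "geneC")
new_data <- list(
  Y_set = matrix(0, nrow = 3, ncol = 2,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("cell1", "cell2"))),
  Z_set = matrix(0, nrow = 2, ncol = 2,
                 dimnames = list(c("geneA", "geneC"),
                                 c("cell3", "cell4")))
)

absent <- missingVarGenes(var.genes, new_data)
# Names of datasets that lack at least one selected gene
incomplete <- names(absent)[lengths(absent) > 0]
```

Any dataset listed in incomplete would violate the stated assumption and should be fixed before it is passed as new.data.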

Usage

optimizeNewData(
  object,
  new.data,
  which.datasets,
  add.to.existing = TRUE,
  lambda = NULL,
  thresh = 1e-04,
  max.iters = 100,
  verbose = TRUE
)

Arguments

object

liger object. optimizeALS should be run on the object before calling this function.

new.data

List of raw data matrices (one or more). Each list entry should be named.

which.datasets

List of existing datasets. If add.to.existing is TRUE, these are the datasets to which the entries in new.data are appended. Otherwise, they are the existing datasets most similar to each entry in new.data.

add.to.existing

Whether to add the new data to existing datasets (TRUE) or to treat it as entirely new datasets, for which new V matrices are calculated (FALSE). Default TRUE.

lambda

Regularization parameter. By default, this will use the lambda last used with optimizeALS.

thresh

Convergence threshold. Convergence occurs when |obj0-obj|/(mean(obj0,obj)) < thresh (default 1e-4).

max.iters

Maximum number of block coordinate descent iterations to perform (default 100).

verbose

Print progress bar/messages (TRUE by default)
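The stopping rule described for thresh can be written out as follows. This is a base-R sketch of the criterion as stated, not the package's internal code: the function name hasConverged is hypothetical, and the mean is assumed to be the arithmetic mean of the two objective values.

```r
# Hypothetical helper (not part of rliger): relative-change stopping rule.
# obj0 is the objective value from the previous iteration, obj the current one.
hasConverged <- function(obj0, obj, thresh = 1e-4) {
  # mean(c(obj0, obj)) averages both values. Writing mean(obj0, obj) would
  # not, because R treats the second argument of mean() as `trim`.
  delta <- abs(obj0 - obj) / mean(c(obj0, obj))
  delta < thresh
}

hasConverged(100, 99)      # relative change of about 1e-2, above thresh
hasConverged(100, 99.999)  # relative change of about 1e-5, below thresh
```

A smaller thresh demands a smaller relative change between iterations, so the optimization is more likely to run until max.iters is reached.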

Value

liger object with the H, W, and V slots reset. The raw.data, norm.data, and scale.data slots are also updated to include the new data.

Examples

ligerex <- createLiger(list(ctrl = ctrl, stim = stim))
ligerex <- normalize(ligerex)
ligerex <- selectGenes(ligerex)
ligerex <- scaleNotCenter(ligerex)

# Run the initial factorization.
# A single iteration keeps the example fast; it is not enough to converge.
ligerex <- optimizeALS(ligerex, k = 5, max.iters = 1)
# Suppose we have new data, namely Y_set and Z_set, from the same cell type.
# Add it to the existing datasets.
new_data <- list(Y_set = ctrl, Z_set = stim)
# One iteration does not reach convergence; it only keeps the test time minimal.
ligerex2 <- optimizeNewData(ligerex, new.data = new_data,
                            which.datasets = list('ctrl', 'stim'),
                            max.iters = 1)
# Suppose we acquire new data from a different cell type (X); we add it as another dataset.
# It is probably most similar to ctrl.
X <- ctrl
# One iteration does not reach convergence; it only keeps the test time minimal.
ligerex3 <- optimizeNewData(ligerex, new.data = list(x_set = X),
                            which.datasets = list('ctrl'),
                            add.to.existing = FALSE,
                            max.iters = 1)


rliger documentation built on Nov. 9, 2023, 1:07 a.m.