online_iNMF-deprecated | R Documentation |
Please turn to runOnlineINMF or runIntegration.
Perform online integrative non-negative matrix factorization to represent multiple single-cell datasets in terms of H, W, and V matrices. It optimizes the iNMF objective function using online learning (non-negative least squares for H matrix, hierarchical alternating least squares for W and V matrices), where the number of factors is set by k. The function allows online learning in 3 scenarios: (1) fully observed datasets; (2) iterative refinement using continually arriving datasets; and (3) projection of new datasets without updating the existing factorization. All three scenarios require fixed memory independent of the number of cells.
For each dataset, this factorization produces an H matrix (cells by k), a V matrix (k by genes), and a shared W matrix (k by genes). The H matrices represent the cell factor loadings. W is identical among all datasets, as it represents the shared components of the metagenes across datasets. The V matrices represent the dataset-specific components of the metagenes.
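For reference, the iNMF objective that the description refers to can be written as below. This is a sketch in the notation of this page (H cells by k, W and V k by genes). The symbols E_i (the scaled expression matrix of dataset i, cells by genes), n_i (cells in dataset i), m (genes), and N (number of datasets) are labels introduced here for illustration and are not argument names of the function.

```latex
\min_{H_i \ge 0,\; W \ge 0,\; V_i \ge 0}\;
\sum_{i=1}^{N} \left\lVert E_i - H_i \left( W + V_i \right) \right\rVert_F^2
\;+\; \lambda \sum_{i=1}^{N} \left\lVert H_i V_i \right\rVert_F^2,
\qquad
E_i \in \mathbb{R}^{n_i \times m},\;
H_i \in \mathbb{R}^{n_i \times k},\;
W, V_i \in \mathbb{R}^{k \times m}
```

The first term measures how well each dataset is reconstructed from its shared (W) and dataset-specific (V_i) metagenes. The second term is the penalty scaled by lambda, which is why larger values of lambda shrink the dataset-specific contribution.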
object |
|
X_new |
List of new datasets for scenario 2 or scenario 3. Each list element should be the name of an HDF5 file. |
projection |
Perform data integration by shared metagene (W) projection (scenario 3). (default FALSE) |
W.init |
Optional initialization for W. (default NULL) |
V.init |
Optional initialization for V. (default NULL) |
H.init |
Optional initialization for H. (default NULL) |
A.init |
Optional initialization for A. (default NULL) |
B.init |
Optional initialization for B. (default NULL) |
k |
Inner dimension of the factorization, i.e. the number of metagenes (default 20). A value in the range 20-50 works well for most analyses. |
lambda |
Regularization parameter. Larger values penalize dataset-specific effects more strongly (i.e., alignment should increase as lambda increases). We recommend always using the default value except possibly for analyses with relatively small differences (biological replicates, male/female comparisons, etc.), in which case a lower value such as 1.0 may improve reconstruction quality. (default 5.0) |
max.epochs |
Maximum number of epochs (complete passes through the data). (default 5) |
miniBatch_max_iters |
Maximum number of block coordinate descent (HALS algorithm) iterations to perform for each update of W and V (default 1). Changing this parameter is not recommended. |
miniBatch_size |
Total number of cells in each minibatch (default 5000). This is a reasonable default, but a smaller value such as 1000 may be necessary for analyzing very small datasets. In general, minibatch size should be no larger than the number of cells in the smallest dataset. |
h5_chunk_size |
Chunk size of input HDF5 files (default 1000). The chunk size should be no larger than the minibatch size. |
seed |
Random seed to allow reproducible results (default 123). |
verbose |
Print progress bar/messages (TRUE by default) |
liger object with H, W, V, A and B slots set.