View source: R/data_integration_sketched.R
Solve for the model parameters through Iterative Nonnegative Matrix Factorization (iNMF) by minimizing the sketched objective function
\frac{1}{\tilde{N}} \sum_j \| S X_j - (S H_j W^T \Lambda_j + S 1_{n_j} b_j^T) \|_F^2 + \gamma \sum_{l=1}^p \left( \sum_{j=1}^m \frac{\tilde{n}_j}{\tilde{N}} \lambda_{jl} - 1 \right)^2
, with additional penalty for SPP.
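The objective above can be evaluated directly for a given set of parameters. The following is a minimal sketch, not the package's implementation: `sketched_objective` and `idx.list` are hypothetical names, the sketching matrix S is assumed to be a plain row subsample of each batch, gamma is passed explicitly because it does not appear in the function signature, and the additional SPP penalty is omitted.

```r
# Hypothetical helper (not part of the package): evaluates the sketched
# objective, assuming S selects the rows listed in idx.list[[j]].
sketched_objective <- function(X.list, H.list, W, lambda.list, b.list,
                               idx.list, gamma = 1) {
  n.tilde <- vapply(idx.list, length, integer(1))  # sketched cells per batch
  N.tilde <- sum(n.tilde)                          # total sketched cells
  fit <- 0
  for (j in seq_along(X.list)) {
    idx <- idx.list[[j]]
    SX <- X.list[[j]][idx, , drop = FALSE]         # S X_j
    SH <- H.list[[j]][idx, , drop = FALSE]         # S H_j
    # S H_j W^T Lambda_j: scale gene l by lambda_jl, then add the shift b_jl
    pred <- sweep(SH %*% t(W), 2, lambda.list[[j]], `*`)
    pred <- sweep(pred, 2, b.list[[j]], `+`)
    fit <- fit + sum((SX - pred)^2)                # squared Frobenius norm
  }
  lambda.mat <- do.call(rbind, lambda.list)        # m-by-p matrix of scalings
  # Row j of lambda.mat is weighted by n.tilde[j] / N.tilde
  penalty <- sum((colSums(lambda.mat * (n.tilde / N.tilde)) - 1)^2)
  fit / N.tilde + gamma * penalty
}
```

When every scaling vector equals 1 the penalty term vanishes, so the value reduces to the average squared reconstruction error over the sketched cells.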
CFITIntegrate_sketched(
  X.list,
  r = 15,
  max.niter = 100,
  nrep = 1,
  init = NULL,
  subsample.prop = NULL,
  weight.list = NULL,
  tol = 1e-06,
  early.stopping = 50,
  time.out = 60 * 2,
  future.plan = c("sequential", "transparent", "multicore", "multisession", "cluster"),
  workers = parallel::detectCores() - 1,
  verbose = T,
  seed = 0
)
X.list | 
 a list of m ncells-by-ngenes gene expression matrices, one per data set  | 
r | 
 scalar, dimension of the common factor matrix, which can be chosen as the rough number of identifiable cell types in the joint population (default 15).  | 
max.niter | 
 integer, max number of iterations (default 100).  | 
nrep | 
 integer, number of repeated runs (to reduce the effect of local optima, default 1)  | 
init | 
 a list of parameters used for initialization. The list either contains all parameter sets (W, lambda.list, b.list, H.list), or, if it does not, only W is used when provided (default NULL).  | 
subsample.prop | 
 a scalar between 0 and 1. A smaller proportion results in faster computation but less
accurate results. By default the value is set to    | 
weight.list | 
 weights for weighted subsampling sketching. Note that weight.list is a list of weights per batch. The weights for each batch are a vector of nonnegative values with one entry per cell in the batch.  | 
tol | 
 numeric scalar, tolerance used in the stopping criteria (default 1e-6).  | 
early.stopping | 
 integer, stop early if the objective function has not improved for this number of iterations (default 50).  | 
time.out | 
 numeric, stop after running for this many minutes (default 60 * 2, i.e. 120).  | 
future.plan | 
 plan for future parallel computation, chosen from 'sequential', 'transparent', 'multicore', 'multisession' and 'cluster'. Default is 'sequential'. Note that RStudio does not support 'multicore'.  | 
workers | 
 additional parameter for   | 
verbose | 
 boolean scalar, whether to show extensive program logs (default TRUE)  | 
seed | 
 random seed used (default 0)  | 
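To make the subsample.prop and weight.list arguments concrete, the sketch below draws a weighted row subsample of a single batch in base R. It illustrates the idea only: how the function draws its sketch internally is not stated here, and `sketch_rows` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper (not part of the package): keep roughly a fraction
# `prop` of the cells in one batch, sampling without replacement with
# probability proportional to `weights` (uniform when weights is NULL).
sketch_rows <- function(X, prop, weights = NULL) {
  n <- nrow(X)
  n.keep <- max(1L, as.integer(ceiling(prop * n)))  # at least one cell
  idx <- sort(sample.int(n, size = n.keep, replace = FALSE, prob = weights))
  list(idx = idx, SX = X[idx, , drop = FALSE])      # SX plays the role of S X_j
}

set.seed(0)
X <- matrix(rnorm(20 * 10), nrow = 20, ncol = 10)   # 20 cells, 10 genes
w <- c(rep(0, 10), rep(1, 10))                      # never pick the first 10 cells
s <- sketch_rows(X, prop = 0.25, weights = w)
dim(s$SX)
```

Cells with zero weight are never selected, so the weights can be used to steer the sketch toward cells of interest. Note that `sample.int` raises an error if fewer cells have positive weight than the number requested.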
a list containing
ngenes-by-r numeric matrix, estimated common factor matrix
a list of m estimated factor loading matrices, each of size ncells-by-r
a list of estimated shift vectors, each of size p (ngenes)
a list of estimated scaling vectors, each of size p (ngenes)
boolean, whether the algorithm converged
numeric scalar, value of the objective function at convergence or when the maximum number of iterations is reached
a numeric vector, value of the objective function per iteration
numeric, the relative change in W (common factor matrix) measured by the Frobenius norm
a vector of numeric values, the relative change in W (common factor matrix) per iteration.
integer, the iteration at convergence (or the maximum number of iterations if not converged)
list of parameters used for the algorithm: max.iter, tol, nrep, subsample.prop, weight.list
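A call might look like the following sketch on toy data. The simulated Poisson counts, the uniform weights and the chosen argument values are assumptions for illustration; whether the function expects normalized input or named rows and columns is not stated on this page. The call is guarded so the snippet also runs in a session where the package is not loaded.

```r
set.seed(0)
ngenes <- 50

# Two toy batches with different numbers of cells (ncells-by-ngenes).
X.list <- list(
  matrix(rpois(200 * ngenes, lambda = 2), nrow = 200, ncol = ngenes),
  matrix(rpois(150 * ngenes, lambda = 2), nrow = 150, ncol = ngenes)
)

# Uniform weights: one nonnegative weight per cell in each batch.
weight.list <- lapply(X.list, function(X) rep(1, nrow(X)))

# Only call the function if it is available in the current session.
if (exists("CFITIntegrate_sketched", mode = "function")) {
  fit <- CFITIntegrate_sketched(
    X.list,
    r = 5,
    max.niter = 20,
    subsample.prop = 0.5,
    weight.list = weight.list,
    future.plan = "sequential",
    verbose = FALSE,
    seed = 0
  )
  str(fit, max.level = 1)  # inspect the components of the returned list
}
```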