Performs clustering, feature selection, and parameter estimation using a finite mixture model of negative binomials.
EM_run(
ncores,
X = NA,
y,
k,
lambda = 0,
alpha = 0,
size_factors = rep(1, times = ncol(y)),
norm_y = y,
true_clusters = NA,
true_disc = NA,
init_parms = FALSE,
init_coefs = matrix(0, nrow = nrow(y), ncol = k),
init_phi = matrix(0, nrow = nrow(y), ncol = k),
init_cls = NULL,
init_wts = NULL,
CEM = T,
init_Tau = nrow(y),
maxit_EM = 100,
maxit_IRLS = 50,
maxit_CDA = 50,
EM_tol = 1e-06,
IRLS_tol = 1e-04,
CDA_tol = 1e-04,
disp,
trace = F,
mb_size = NULL,
PP_filt
)
ncores | integer, number of cores to utilize in parallel computing (default 1)
X | design matrix of dimension n by p
y | count matrix of dimension g by n
k | integer, number of clusters
lambda | numeric penalty parameter, lambda >= 0
alpha | numeric penalty parameter, 0 <= alpha < 1
size_factors | numeric vector of length n, factors to correct for subject-specific variation in sequencing depth
norm_y | count matrix of dimension g by n, normalized for differences in sequencing depth
true_clusters | (optional) integer vector of true groups, if available, for diagnostic tracking
true_disc | (optional) logical vector of true discriminatory genes, if available, for diagnostic tracking
init_parms | logical, TRUE: custom parameter initializations, FALSE (default): start from scratch
init_coefs | matrix of dimension g by k, used only if init_parms = TRUE
init_phi | vector of length g (gene-specific dispersions) or matrix of dimension g by k (cluster-specific dispersions), used only if init_parms = TRUE
init_cls | vector of length n, initial clustering
init_wts | matrix of dimension k by n of initial cluster memberships; partial memberships are allowed. Either init_wts or init_cls must be supplied
CEM | logical, TRUE for CEM (default), FALSE for EM
init_Tau | numeric, initial temperature for CEM. Default is g for CEM (set to 1 for EM)
maxit_EM | integer, maximum number of iterations for the full CEM/EM run (default 100)
maxit_IRLS | integer, maximum number of iterations for the IRLS loop in the M step (default 50)
maxit_CDA | integer, maximum number of iterations for the CDA loop (default 50)
EM_tol | numeric, convergence tolerance for EM/CEM (default 1e-6)
IRLS_tol | numeric, convergence tolerance for IRLS (default 1e-4)
CDA_tol | numeric, convergence tolerance for CDA (default 1e-4)
disp | string, either "gene" (default) or "cluster"
trace | logical, TRUE: output diagnostic messages, FALSE (default): don't output
mb_size | minibatch size: number of genes to include per M-step iteration
PP_filt | numeric in (0,1), threshold on the posterior probability (PP) for a sample/cluster pair to be included in M-step estimation (default 1e-3)
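The arguments above imply some preparation before calling EM_run: a size factor per sample, a depth-normalized copy of the counts, and an initial clustering. The following is a minimal sketch of that preparation on simulated data, not the package's own workflow. The toy data, the median-of-ratios size factor estimate, and the k-means initialization are all assumptions made for illustration; the final call is left commented out because it requires FSCseq to be installed and EM_run to be accessible in the session.

```r
# Hypothetical toy data: g genes by n samples of negative binomial counts.
set.seed(1)
g <- 50; n <- 12; k <- 2
y <- matrix(rnbinom(g * n, mu = 100, size = 5), nrow = g, ncol = n)

# Size factors via a median-of-ratios style estimate (an assumption;
# any method giving one positive factor per sample would fit here).
log_geo_means <- rowMeans(log(y + 1))
size_factors <- apply(y, 2, function(col) {
  exp(median(log(col + 1) - log_geo_means))
})

# Depth-normalized counts: divide each column (sample) by its size factor.
norm_y <- sweep(y, 2, size_factors, "/")

# Initial clustering of samples (k-means on log counts is an assumption).
init_cls <- kmeans(t(log(norm_y + 1)), centers = k)$cluster

# Equivalent hard-membership weights: a k by n one-hot matrix.
init_wts <- matrix(0, nrow = k, ncol = n)
init_wts[cbind(init_cls, seq_len(n))] <- 1

# Arguments assembled from the usage signature; only init_cls is passed,
# since one of init_cls or init_wts is enough.
args <- list(ncores = 1, X = NA, y = y, k = k, lambda = 0, alpha = 0,
             size_factors = size_factors, norm_y = norm_y,
             init_cls = init_cls, CEM = TRUE, init_Tau = nrow(y),
             disp = "gene", PP_filt = 1e-3)

# Not run: requires the FSCseq package.
# fit <- do.call(EM_run, args)
```

Passing init_wts instead of init_cls would allow partial memberships, for example when starting from the posterior probabilities of an earlier run.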
FSCseq object with clustering results, posterior probabilities of cluster membership, and cluster-discriminatory status of each gene
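To make the roles of the posterior probabilities, PP_filt, and the CEM temperature more concrete, here is a small sketch on a made-up k by n probability matrix. It does not read any field of the returned FSCseq object, whose internal names are not documented here. The `temper` helper is hypothetical and shows one common annealing form, raising probabilities to the power 1/Tau and renormalizing; the package's actual schedule may differ.

```r
# Hypothetical posterior probabilities: k = 2 clusters (rows),
# n = 3 samples (columns); each column sums to 1.
PP <- matrix(c(0.9995, 0.0005,
               0.60,   0.40,
               0.02,   0.98),
             nrow = 2)

# Hard cluster labels: the most probable cluster for each sample.
clusters <- apply(PP, 2, which.max)

# Sample/cluster pairs whose PP exceeds the threshold, i.e. the pairs
# that would contribute to M-step estimation under this reading of PP_filt.
PP_filt <- 1e-3
keep <- PP > PP_filt

# Hypothetical tempering helper: Tau = 1 leaves PP unchanged,
# large Tau flattens each column toward uniform membership.
temper <- function(PP, Tau) {
  w <- PP^(1 / Tau)
  sweep(w, 2, colSums(w), "/")
}

PP_hot <- temper(PP, Tau = 1e6)  # close to 0.5 in every entry
PP_one <- temper(PP, Tau = 1)    # same as PP
```

Under this reading, starting CEM at a high temperature keeps early assignments soft, while Tau = 1 corresponds to the ordinary EM posterior.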
David K. Lim, deelim@live.unc.edu
https://github.com/DavidKLim/FSCseq