FSCseq: Wrapper for main FSCseq function


View source: R/FSCseq.R

Description

Run main CEM/EM clustering and feature selection algorithm.

Usage

FSCseq(
  ncores = 1,
  X = NULL,
  y,
  k,
  lambda = 0,
  alpha = 0,
  size_factors,
  norm_y,
  true_clusters = NULL,
  true_disc = NULL,
  init_parms = FALSE,
  init_coefs = NULL,
  init_phi = NULL,
  init_cls = NULL,
  init_wts = NULL,
  n_rinits = if (method == "EM") 20 else if (method == "CEM") 1,
  maxit_inits = if (method == "EM") 15 else if (method == "CEM") 100,
  maxit_EM = 100,
  maxit_IRLS = 50,
  maxit_CDA = 50,
  EM_tol = 1e-06,
  IRLS_tol = 1e-04,
  CDA_tol = 1e-04,
  disp = "gene",
  method = "CEM",
  init_temp = nrow(y),
  trace = F,
  trace.file = NULL,
  mb_size = NULL,
  PP_filt = 0.001
)
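The defaults for n_rinits and maxit_inits in the usage above depend on method. A minimal sketch of how such defaults resolve under R's lazy evaluation of default arguments follows. The function resolve_defaults() is a hypothetical toy written for illustration and is not part of FSCseq; it only mirrors the default expressions shown in the signature.

```r
# Hypothetical toy function (NOT part of FSCseq) that copies the
# method-dependent default expressions from the usage above.
resolve_defaults <- function(method = "CEM",
                             n_rinits = if (method == "EM") 20 else if (method == "CEM") 1,
                             maxit_inits = if (method == "EM") 15 else if (method == "CEM") 100) {
  # Default arguments are evaluated lazily inside the function,
  # so they see whatever value of `method` the caller supplied.
  list(method = method, n_rinits = n_rinits, maxit_inits = maxit_inits)
}

cem_defaults <- resolve_defaults()                          # method = "CEM"
em_defaults  <- resolve_defaults(method = "EM")             # EM-specific defaults
custom       <- resolve_defaults(method = "EM", n_rinits = 5)  # explicit override
```

Under this reading, switching method to "EM" without setting n_rinits or maxit_inits changes both values, while an explicitly supplied value always takes precedence.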

Arguments

ncores

integer, number of cores to utilize in parallel computing (default 1)

X

(optional) design matrix of dimension n by p

y

count matrix of dimension g by n

k

integer, number of clusters

lambda

numeric penalty parameter, lambda >= 0

alpha

numeric penalty parameter, 0 <= alpha < 1

size_factors

numeric vector of length n, factors to correct for subject-specific variation of sequencing depth

norm_y

count matrix of dimension g by n, normalized for differences in sequencing depth
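As an illustration of the expected shapes of y, size_factors, and norm_y, the sketch below builds them in base R from simulated counts. The total-count scaling used here is an assumption chosen for simplicity, not necessarily the normalization the package authors intend; any depth normalization that yields one factor per sample could be substituted.

```r
set.seed(1)
g <- 50  # number of genes (rows)
n <- 8   # number of samples (columns)

# Simulated g by n count matrix (illustrative data only)
y <- matrix(rnbinom(g * n, mu = 100, size = 5), nrow = g, ncol = n)

# Assumed normalization: scale each sample by its library size
# relative to the mean library size
lib_sizes <- colSums(y)
size_factors <- lib_sizes / mean(lib_sizes)  # numeric vector of length n

# Divide each column (sample) of y by its size factor
norm_y <- sweep(y, 2, size_factors, "/")     # g by n, depth-normalized
```

With these objects, size_factors has one entry per column of y, and norm_y has the same dimensions as y.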

true_clusters

(optional) integer vector of true groups, if available, for diagnostic tracking

true_disc

(optional) logical vector of true discriminatory genes, if available, for diagnostic tracking

init_parms

logical, TRUE: custom parameter initializations, FALSE (default): start from scratch

init_coefs

matrix of dimension g by k, only if init_parms = TRUE

init_phi

vector of dimension g (gene-specific dispersions), only if init_parms = TRUE

init_cls

(optional) vector of length n, initial clustering.

init_wts

(optional) matrix of dimension k by n denoting the initial clustering (allowing partial membership). If both init_cls and init_wts are specified, init_wts is ignored and init_cls is used as the initial clustering
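To make the relationship between init_cls and init_wts concrete, the sketch below converts a hard cluster assignment into a k by n indicator matrix. The one-hot encoding is an assumption about how a hard clustering would be expressed as weights; the values of k and init_cls are made up for illustration.

```r
k <- 3
init_cls <- c(1, 3, 2, 2, 1, 3)  # hypothetical initial labels for n = 6 samples
n <- length(init_cls)

# k by n matrix: entry (c, i) is 1 if sample i starts in cluster c, else 0
init_wts <- matrix(0, nrow = k, ncol = n)
init_wts[cbind(init_cls, seq_len(n))] <- 1
```

A partial-membership initialization would use the same k by n layout, with each column holding weights that sum to 1 instead of a single 1.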

n_rinits

integer, number of additional random initializations to be searched (default 20 for method = "EM", 1 for method = "CEM")

maxit_inits

integer, maximum number of iterations for each initialization search (default 15 for method = "EM", 100 for method = "CEM")

maxit_EM

integer, maximum number of iterations for full CEM/EM run (default 100)

maxit_IRLS

integer, maximum number of iterations for IRLS algorithm, in M step (default 50)

maxit_CDA

integer, maximum number of iterations for CDA loop (default 50)

EM_tol

numeric, tolerance of convergence for EM/CEM, default is 1E-6

IRLS_tol

numeric, tolerance of convergence for IRLS, default is 1E-4

CDA_tol

numeric, tolerance of convergence for CDA, default is 1E-4

method

string, either "EM" or "CEM" (default)

init_temp

numeric, initial temperature. Default for CEM: init_temp = nrow(y), i.e. the number of genes. For EM, the temperature is 1

trace

logical, TRUE: output diagnostic messages, FALSE (default): don't output

trace.file

(optional) string, file into which interim diagnostics will be printed

mb_size

minibatch size: number of genes to include per M step iteration (default NULL)

PP_filt

numeric in (0, 1), threshold on the posterior probability (PP) for a sample/cluster pair to be included in M step estimation (default 1e-3)
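The effect of a threshold such as PP_filt can be sketched on a small matrix of posterior probabilities. Both the k by n layout of pp and the strict ">" comparison are assumptions made for illustration; the package's internal rule may differ.

```r
PP_filt <- 1e-3

# Hypothetical posterior probabilities: 3 clusters (rows) by 2 samples (columns).
# matrix() fills column by column, so each group of three values is one sample.
pp <- matrix(c(0.9995, 0.0004, 0.0001,
               0.6,    0.3,    0.1),
             nrow = 3)

# Assumed rule: a sample/cluster pair contributes to the M step
# only if its posterior probability exceeds the threshold
keep <- pp > PP_filt
```

Here the first sample would contribute only to cluster 1, while the second sample would contribute to all three clusters.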

Value

list containing outputs from EM_run() function

Author(s)

David K. Lim, deelim@live.unc.edu

References

https://github.com/DavidKLim/FSCseq


DavidKLim/FSCseq documentation built on Dec. 12, 2021, 3:46 a.m.