sc_bpr_cluster_wrap is a wrapper function that clusters single cells
based on their DNA methylation profiles using the EM algorithm, where the
observation model is the Binomial/Bernoulli distributed Probit Regression
likelihood. It first checks the input parameters, then runs a 'mini' EM to
find good starting parameter values, then applies the full EM algorithm,
and finally calculates model selection metrics such as BIC and AIC.
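To make the observation model more concrete, the following is a minimal base R sketch of a ridge-penalised Bernoulli probit regression log-likelihood for a single region, together with an E-step that turns per-cluster log-likelihoods into posterior probabilities. It is an illustration only, not the package's implementation: the helper names (`poly_design`, `bpr_loglik`, `e_step`), the polynomial basis, and the way the penalty enters the objective are all assumptions.

```r
# Hypothetical helper: polynomial design matrix with M columns (1, t, t^2, ...).
# The basis actually used by the package is set through the 'basis' argument.
poly_design <- function(loc, M = 3) {
  outer(loc, 0:(M - 1), `^`)
}

# Penalised Bernoulli probit log-likelihood for one region of one cell.
# obs: L x 2 matrix (column 1 = CpG location, column 2 = 0/1 methylation state).
bpr_loglik <- function(w, obs, lambda = 1/8) {
  H <- poly_design(obs[, 1], M = length(w))
  eta <- as.vector(H %*% w)
  y <- obs[, 2]
  # pnorm(..., log.p = TRUE) evaluates log(Phi(eta)) stably in the tails
  ll <- sum(y * pnorm(eta, log.p = TRUE) +
            (1 - y) * pnorm(eta, lower.tail = FALSE, log.p = TRUE))
  ll - lambda * sum(w^2)  # ridge penalty on the coefficients (assumed form)
}

# E-step: posterior cluster probabilities from an I x K matrix of
# per-cell, per-cluster log-likelihoods and the mixing proportions pi_k.
e_step <- function(loglik, pi_k) {
  weighted <- sweep(loglik, 2, log(pi_k), `+`)
  row_max <- apply(weighted, 1, max)  # log-sum-exp trick for stability
  post <- exp(weighted - row_max)
  post / rowSums(post)
}

# Toy region: methylated on the left, unmethylated on the right
obs <- cbind(c(-0.8, -0.2, 0.3, 0.9), c(1, 1, 0, 0))
ll_good <- bpr_loglik(w = c(0, -1, 0), obs = obs)  # decreasing profile
ll_bad  <- bpr_loglik(w = c(0,  1, 0), obs = obs)  # increasing profile

# Toy log-likelihoods for 2 cells and 2 clusters
loglik <- matrix(c(-10, -12, -30, -8), nrow = 2, byrow = TRUE)
post_prob <- e_step(loglik, pi_k = c(0.5, 0.5))
```

In this sketch a coefficient vector that matches the shape of the data scores a higher penalised log-likelihood, and each row of `post_prob` sums to one.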
sc_bpr_cluster_wrap(x, K = 2, pi_k = NULL, w = NULL, basis = NULL,
  lambda = 1/8, em_max_iter = 100, epsilon_conv = 1e-05,
  use_kmeans = TRUE, em_init_nstart = 10, em_init_max_iter = 10,
  opt_method = "CG", opt_itnmax = 50, init_opt_itnmax = 100,
  is_parallel = TRUE, no_cores = NULL, is_verbose = FALSE)
x |
A list of length I, where I is the total number of cells. Each element of the list contains another list of length N, where N is the total number of genomic regions. Each element of the inner list is an L x 2 matrix of observations, where the 1st column contains the locations and the 2nd column contains the methylation level of the corresponding CpGs. |
K |
Integer denoting the number of clusters K. |
pi_k |
Vector of length K, denoting the mixing proportions. |
w |
An N x M x K array, where each column contains the basis function coefficients for the corresponding cluster. |
basis |
A 'basis' object. E.g. see |
lambda |
The complexity penalty coefficient for ridge regression. |
em_max_iter |
Integer denoting the maximum number of EM iterations. |
epsilon_conv |
Numeric denoting the convergence parameter for EM. |
use_kmeans |
Logical, use k-means for initializing the cluster centres; otherwise a point is randomly picked as each cluster centre. |
em_init_nstart |
Number of EM random starts for finding optimal likelihood. |
em_init_max_iter |
Maximum number of EM iterations for the 'small' init EM. |
opt_method |
The optimization method to be used. See
|
opt_itnmax |
Optional argument giving the maximum number of iterations
for the corresponding method. See |
init_opt_itnmax |
Optimization iterations for obtaining the initial EM parameter values. |
is_parallel |
Logical, indicating if code should be run in parallel. |
no_cores |
Number of cores to be used, default is max_no_cores - 2. |
is_verbose |
Logical, print results during EM iterations. |
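As an illustration of the nested structure expected for `x`, the following base R sketch simulates a small input. The simulator `sim_region`, the scaling of locations to [-1, 1], and the use of binary (0/1) methylation states are assumptions made for this example, not requirements stated above.

```r
set.seed(1)
I <- 4   # number of cells
N <- 3   # number of genomic regions per cell

# Hypothetical simulator for one region: L CpGs with sorted locations
# (scaled to [-1, 1] here, an assumption) and binary methylation states.
sim_region <- function(L) {
  cbind(sort(runif(L, min = -1, max = 1)),
        rbinom(L, size = 1, prob = 0.5))
}

# Outer list over cells, inner list over regions, each entry an L x 2 matrix
x <- lapply(seq_len(I), function(i) {
  lapply(seq_len(N), function(n) sim_region(L = sample(5:15, 1)))
})

length(x)          # I cells
length(x[[1]])     # N regions in the first cell
dim(x[[1]][[1]])   # L x 2 observations for the first region

# With the package providing this function loaded, a call might look like:
# fit <- sc_bpr_cluster_wrap(x, K = 2, is_parallel = FALSE)
```

The number of CpGs L is allowed to differ between regions and cells, which is why a list of matrices is used rather than a single array.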
A 'sc_bpr_cluster' object which, in addition to the input parameters, consists of the following variables:
pi_k: Fitted mixing proportions.
w: An N x M x K array with the fitted coefficients of the basis functions for each cluster k and region n.
NLL: The Negative Log Likelihood after the EM algorithm has finished.
post_prob: Posterior probabilities of each cell belonging to each cluster.
labels: Hard clustering assignments of each cell.
BIC: Bayesian Information Criterion metric.
AIC: Akaike Information Criterion metric.
ICL: Integrated Complete Likelihood criterion metric.
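The relationship between some of these quantities can be sketched in base R. The example below derives hard labels from posterior probabilities and computes AIC and BIC from a final negative log-likelihood using their textbook definitions. All numbers are made up, and both the parameter count `(K - 1) + K * N * M` and the use of the number of cells as the sample size are assumptions; the package may count parameters or define the metrics differently.

```r
# Toy values standing in for a fitted object (made up for illustration)
post_prob <- matrix(c(0.9, 0.1,
                      0.2, 0.8,
                      0.6, 0.4), ncol = 2, byrow = TRUE)
NLL <- 120.5              # final negative log-likelihood
I <- nrow(post_prob)      # number of cells
K <- ncol(post_prob)      # number of clusters
N <- 3                    # genomic regions
M <- 4                    # basis coefficients per region

# Hard assignment: the cluster with the highest posterior probability
labels <- max.col(post_prob, ties.method = "first")

# Assumed free parameters: K - 1 mixing proportions plus K * N * M coefficients
n_params <- (K - 1) + K * N * M

# Textbook definitions on the 2 * NLL scale (lower is better)
aic <- 2 * NLL + 2 * n_params
bic <- 2 * NLL + n_params * log(I)
```

When comparing fits with different values of K, the model with the lowest criterion value would typically be preferred.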
C.A.Kapourani C.A.Kapourani@ed.ac.uk