sc_bpr_cluster_wrap: Cluster single cells based on methylation profiles


Description

sc_bpr_cluster_wrap is a wrapper function that clusters single cells based on their DNA methylation profiles using the EM algorithm, where the observation model is the Binomial/Bernoulli distributed Probit Regression likelihood. It first checks the input parameters, then runs a 'mini' EM to find the optimal starting parameter values, then applies the full EM algorithm, and finally calculates model selection metrics such as BIC and AIC.
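The model selection metrics mentioned above have standard definitions. The following is a minimal sketch of them, not the package's code. The helper name `model_selection_metrics`, the parameter count `(K - 1) + K * N * M`, and the log-likelihood value are illustrative assumptions; how the package counts parameters and observations is not stated here.

```r
# Hypothetical helper: standard AIC and BIC from a maximised log-likelihood.
model_selection_metrics <- function(loglik, n_params, n_obs) {
  c(AIC = 2 * n_params - 2 * loglik,
    BIC = log(n_obs) * n_params - 2 * loglik)
}

# Illustrative sizes: clusters, regions, coefficients per region (assumed).
K <- 2; N <- 100; M <- 4
# Assumed count: K - 1 free mixing proportions plus K * N * M coefficients.
n_params <- (K - 1) + K * N * M

# The log-likelihood below is a made-up number used only for illustration.
metrics <- model_selection_metrics(loglik = -5000, n_params = n_params, n_obs = 50)
print(metrics)
```

Lower values of either metric indicate a better trade-off between fit and complexity, so comparing them across runs with different K is one way to choose the number of clusters.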

Usage

sc_bpr_cluster_wrap(x, K = 2, pi_k = NULL, w = NULL, basis = NULL,
  lambda = 1/8, em_max_iter = 100, epsilon_conv = 1e-05,
  use_kmeans = TRUE, em_init_nstart = 10, em_init_max_iter = 10,
  opt_method = "CG", opt_itnmax = 50, init_opt_itnmax = 100,
  is_parallel = TRUE, no_cores = NULL, is_verbose = FALSE)

Arguments

x

A list of length I, where I is the total number of cells. Each element of the list contains another list of length N, where N is the total number of genomic regions. Each element of the inner list is an L x 2 matrix of observations, where the 1st column contains the locations and the 2nd column contains the methylation levels of the corresponding CpGs.
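The nested structure described above can be sketched with simulated data in base R. The helper `simulate_region`, the scaling of locations to the interval (-1, 1), and the use of binary methylation states are assumptions made for illustration only.

```r
set.seed(42)
I <- 5   # number of cells
N <- 3   # number of genomic regions

# Hypothetical helper: one region as an L x 2 matrix (location, methylation).
simulate_region <- function(L) {
  loc <- sort(runif(L, min = -1, max = 1))  # assumed location scaling
  met <- rbinom(L, size = 1, prob = 0.5)    # assumed binary methylation state
  cbind(loc, met)
}

# Outer list over cells, inner list over regions; L may differ per region.
x <- lapply(seq_len(I), function(i) {
  lapply(seq_len(N), function(n) simulate_region(L = sample(5:15, 1)))
})

str(x[[1]][[1]])
```

A list built this way has the shape expected for the `x` argument: `x[[i]][[n]]` holds the CpG observations of region n in cell i.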

K

Integer denoting the number of clusters K.

pi_k

Vector of length K, denoting the mixing proportions.

w

An N x M x K array, where each column contains the basis function coefficients for the corresponding cluster.
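The shapes of `pi_k` and `w` can be illustrated as follows. This is a sketch under the assumption that `w[n, , k]` holds the M coefficients of region n in cluster k; the sizes and the random initial values are arbitrary.

```r
set.seed(11)
# Illustrative sizes: clusters, regions, coefficients per region (assumed).
K <- 3; N <- 4; M <- 5

# Uniform mixing proportions, one per cluster.
pi_k <- rep(1 / K, K)

# Coefficient array with small random values (arbitrary choice).
w <- array(rnorm(N * M * K, sd = 0.1), dim = c(N, M, K))

# Assumed indexing: coefficients of region 2 in cluster 1.
w_region2_cluster1 <- w[2, , 1]
print(dim(w))
```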

basis

A 'basis' object; see, e.g., create_rbf_object.
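To give an idea of what a radial basis function (RBF) basis does, here is a base R sketch of a Gaussian RBF design matrix. It does not call create_rbf_object; the helper `rbf_design_matrix`, the leading bias column, and the values of `mus` and `gamma` are assumptions for illustration.

```r
# Hypothetical helper: Gaussian RBF design matrix with a leading bias column.
rbf_design_matrix <- function(loc, mus, gamma) {
  H <- vapply(mus, function(mu) exp(-gamma * (loc - mu)^2),
              numeric(length(loc)))
  # matrix() keeps the result two-dimensional even for a single location.
  cbind(1, matrix(H, nrow = length(loc)))
}

loc <- seq(-1, 1, length.out = 7)  # CpG locations (assumed scaling)
mus <- c(-0.5, 0, 0.5)             # RBF centres (illustrative)
H <- rbf_design_matrix(loc, mus, gamma = 2)
print(dim(H))
```

Each row of `H` corresponds to one CpG location and each column after the first to one basis function, so a coefficient vector of length `ncol(H)` defines a smooth profile over the region.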

lambda

The complexity penalty coefficient for ridge regression.

em_max_iter

Integer denoting the maximum number of EM iterations.

epsilon_conv

Numeric denoting the convergence parameter for EM.

use_kmeans

Logical, indicating whether to use k-means for initializing the cluster centres; otherwise a point is randomly picked as each cluster centre.
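The two initialization strategies can be sketched on a generic feature matrix. The matrix `feats` (one row per cell) is simulated here; which per-cell features the package actually clusters is not stated in this page, so treat this as an illustration of the idea only.

```r
set.seed(7)
K <- 2

# Illustrative per-cell features: 40 cells x 6 features in two groups.
feats <- rbind(matrix(rnorm(20 * 6, mean = -1), ncol = 6),
               matrix(rnorm(20 * 6, mean =  1), ncol = 6))

use_kmeans <- TRUE
if (use_kmeans) {
  # k-means centres as initial cluster centres.
  centres <- stats::kmeans(feats, centers = K, nstart = 5)$centers
} else {
  # Otherwise pick K cells at random and use them as centres.
  centres <- feats[sample(nrow(feats), K), , drop = FALSE]
}
print(dim(centres))
```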

em_init_nstart

Number of EM random starts for finding optimal likelihood.

em_init_max_iter

Maximum number of EM iterations for the 'small' init EM.
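The interplay of em_max_iter, epsilon_conv, em_init_nstart and em_init_max_iter can be sketched with a toy EM. The code below fits a plain Bernoulli mixture, which is a stand-in and not the BPR observation model; the helper `run_em` and the convergence test on the log-likelihood change are assumptions.

```r
set.seed(1)
# Toy data: 60 cells x 20 binary features drawn from two Bernoulli profiles.
n_cells <- 60; D <- 20; K <- 2
true_p <- rbind(rep(0.2, D), rep(0.8, D))
z <- rep(1:2, each = n_cells / 2)
X <- matrix(rbinom(n_cells * D, 1, true_p[z, ]), nrow = n_cells)

# Hypothetical helper: EM for a Bernoulli mixture (stand-in for the BPR model).
run_em <- function(X, K, init = NULL, em_max_iter = 100, epsilon_conv = 1e-05) {
  if (is.null(init)) {
    pi_k <- rep(1 / K, K)
    p <- matrix(runif(K * ncol(X), 0.3, 0.7), nrow = K)
  } else {
    pi_k <- init$pi_k
    p <- init$p
  }
  ll_prev <- -Inf
  for (iter in seq_len(em_max_iter)) {
    # E-step: responsibilities via log-sum-exp for numerical stability.
    log_joint <- X %*% t(log(p)) + (1 - X) %*% t(log(1 - p))
    log_joint <- sweep(log_joint, 2, log(pi_k), "+")
    row_max <- apply(log_joint, 1, max)
    log_norm <- row_max + log(rowSums(exp(log_joint - row_max)))
    resp <- exp(log_joint - log_norm)
    ll <- sum(log_norm)
    # Stop once the log-likelihood change falls below epsilon_conv.
    if (abs(ll - ll_prev) < epsilon_conv) break
    ll_prev <- ll
    # M-step: update mixing proportions and Bernoulli profiles.
    n_k <- colSums(resp)
    pi_k <- n_k / nrow(X)
    p <- (t(resp) %*% X) / n_k
    p <- pmin(pmax(p, 1e-6), 1 - 1e-6)
  }
  list(pi_k = pi_k, p = p, loglik = ll, iter = iter)
}

# 'Mini' EM: several short random starts, keep the one with the best likelihood.
em_init_nstart <- 10
em_init_max_iter <- 10
starts <- lapply(seq_len(em_init_nstart),
                 function(s) run_em(X, K, em_max_iter = em_init_max_iter))
best <- starts[[which.max(vapply(starts, function(f) f$loglik, numeric(1)))]]

# Full EM, started from the best short run.
fit <- run_em(X, K, init = best, em_max_iter = 100, epsilon_conv = 1e-05)
print(fit$pi_k)
```

Short restarts are cheap, so spending a few iterations on several random starts before the full run reduces the chance of ending in a poor local optimum.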

opt_method

The optimization method to be used. See optim for possible methods. Default is "CG".

opt_itnmax

Optional argument giving the maximum number of iterations for the corresponding method. See optim for details.
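As an illustration of how opt_method, opt_itnmax and lambda could fit together, the sketch below fits a ridge-penalised Bernoulli probit regression with base R's optim. The polynomial design matrix, the simulated data, the penalty applied to all coefficients, and the mapping of opt_itnmax to optim's `maxit` control are all assumptions, not the package's implementation.

```r
set.seed(3)
loc <- sort(runif(50, -1, 1))            # CpG locations (assumed scaling)
H <- cbind(1, loc, loc^2)                # simple polynomial design (assumed)
y <- rbinom(50, 1, pnorm(as.vector(H %*% c(0.2, 1.5, -1))))  # simulated states
lambda <- 1/8                            # ridge penalty coefficient

# Negative penalised log-likelihood, since optim() minimises by default.
neg_pen_loglik <- function(w) {
  eta <- as.vector(H %*% w)
  # log Phi(eta) and log(1 - Phi(eta)), computed on the log scale.
  ll <- sum(y * pnorm(eta, log.p = TRUE) +
            (1 - y) * pnorm(eta, lower.tail = FALSE, log.p = TRUE))
  -(ll - lambda * sum(w^2))
}

w0 <- rep(0, ncol(H))
fit <- optim(par = w0, fn = neg_pen_loglik, method = "CG",
             control = list(maxit = 50))
print(fit$par)
```

A larger `lambda` shrinks the coefficients towards zero, giving smoother profiles at the cost of a poorer fit to the observed CpGs.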

init_opt_itnmax

Optimization iterations for obtaining the initial EM parameter values.

is_parallel

Logical, indicating if code should be run in parallel.

no_cores

Number of cores to be used; the default is max_no_cores - 2.
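A default of this kind could be computed as in the sketch below, assuming max_no_cores is the value reported by parallel::detectCores(). The lower bound of one core is an added safeguard for small machines, not documented behaviour.

```r
# Number of cores reported by the system; detectCores() may return NA.
max_no_cores <- parallel::detectCores()
if (is.na(max_no_cores)) max_no_cores <- 1L

# Leave two cores free, but never go below one (safeguard, assumed).
no_cores <- max(1L, max_no_cores - 2L)
print(no_cores)
```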

is_verbose

Logical, indicating whether to print results during EM iterations.

Value

A 'sc_bpr_cluster' object which, in addition to the input parameters, consists of the following variables:

Author(s)

C.A.Kapourani C.A.Kapourani@ed.ac.uk


andreaskapou/BPRMeth-devel documentation built on May 12, 2019, 3:32 a.m.