celdaGridSearch: Run Celda in parallel with multiple parameters

View source: R/celdaGridSearch.R

Description

Run Celda with different combinations of parameters and multiple chains in parallel. The variable 'availableModels' lists the models that can be used. Parameters to be tested should be stored in a list and passed to the argument 'paramsTest'. Fixed parameters to be used in all models, such as 'sampleLabel', can be passed as a list to the argument 'paramsFixed'. When 'verbose = TRUE', output from each chain is written to a log file rather than displayed in stdout.

Usage

celdaGridSearch(
  counts,
  model,
  paramsTest,
  paramsFixed = NULL,
  maxIter = 200,
  nchains = 3,
  cores = 1,
  bestOnly = TRUE,
  seed = 12345,
  perplexity = TRUE,
  verbose = TRUE,
  logfilePrefix = "Celda"
)

Arguments

counts

Integer matrix. Rows represent features and columns represent cells.

model

Celda model. Options available in 'celda::availableModels'.

paramsTest

List. A list denoting the combinations of parameters to run in a celda model. For example, 'list(K = seq(5, 10), L = seq(15, 20))' will run all combinations of K from 5 to 10 and L from 15 to 20 in model 'celda_CG()'.
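
As a rough illustration of what "all combinations" means for the example list above, the grid can be enumerated with base R's 'expand.grid()'. This is only a sketch of the Cartesian product: whether 'celdaGridSearch' builds its grid this way internally is an assumption, and the names 'paramGrid', 'nchains', and 'totalRuns' are hypothetical.

```r
# Parameter list from the example above
paramsTest <- list(K = seq(5, 10), L = seq(15, 20))

# Enumerate every (K, L) pair; one row per combination
paramGrid <- expand.grid(paramsTest)
nrow(paramGrid)   # 6 values of K times 6 values of L

# With several chains per combination, the number of model fits grows
# multiplicatively (assumed here: one fit per combination per chain)
nchains <- 3
totalRuns <- nrow(paramGrid) * nchains
```

Checking the size of the grid before a run can help decide how many cores to request, since every combination is fitted once per chain.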

paramsFixed

List. A list denoting additional parameters to use in each celda model. Default NULL.

maxIter

Integer. Maximum number of iterations of sampling to perform. Default 200.

nchains

Integer. Number of random cluster initializations. Default 3.

cores

Integer. The number of cores to use for parallel estimation of chains. Default 1.

bestOnly

Logical. Whether to return only the chain with the highest log likelihood per combination of parameters or return all chains. Default TRUE.

seed

Integer. Passed to with_seed. For reproducibility, a default value of 12345 is used. Seed values seq(seed, (seed + nchains - 1)) will be supplied to the chains, one value per chain. If NULL, no calls to with_seed are made.
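
The seed sequence described above can be reproduced in base R using the default values from the Usage section. The variable name 'chainSeeds' is hypothetical and is used only for illustration.

```r
# Defaults taken from the Usage section
seed <- 12345
nchains <- 3

# One consecutive seed per chain, as given by the expression in the text
chainSeeds <- seq(seed, (seed + nchains - 1))
chainSeeds
```

Because the seeds are consecutive integers starting at 'seed', rerunning with the same 'seed' and 'nchains' should give each chain the same seed value as before.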

perplexity

Logical. Whether to calculate perplexity for each model. If FALSE, then perplexity can be calculated later with 'resamplePerplexity()'. Default TRUE.

verbose

Logical. Whether to print log messages during celda chain execution. Default TRUE.

logfilePrefix

Character. Prefix for log files from worker threads and main process. Default "Celda".

Value

Object of class 'celdaList', which contains results for all model parameter combinations and summaries of the run parameters.

See Also

'celda_G()' for feature clustering, 'celda_C()' for clustering of cells, and 'celda_CG()' for simultaneous clustering of features and cells. 'subsetCeldaList()' can subset the 'celdaList' object. 'selectBestModel()' can get the best model for each combination of parameters.

Examples

## Not run: 
data(celdaCGSim)
## Run various combinations of parameters with 'celdaGridSearch'
celdaCGGridSearchRes <- celdaGridSearch(celdaCGSim$counts,
  model = "celda_CG",
  paramsTest = list(K = seq(4, 6), L = seq(9, 11)),
  paramsFixed = list(sampleLabel = celdaCGSim$sampleLabel),
  bestOnly = TRUE,
  nchains = 1,
  cores = 1
)

## End(Not run)

celda documentation built on June 9, 2020, 2 a.m.