runCGSModels | R Documentation |
Run Latent Dirichlet Allocation with a Collapsed Gibbs Sampler in a given cisTopic object.
runCGSModels(
object,
topic = c(2, 10, 20, 30, 40, 50),
nCores = 1,
seed = 123,
iterations = 500,
burnin = 250,
alpha = 50,
alphaByTopic = TRUE,
beta = 0.1,
returnType = "allModels",
addModels = TRUE,
tmp = NULL,
...
)
object |
Initialized cisTopic object. |
topic |
Integer or vector of integers indicating the number of topics in the model/s (by default it is a vector with 2, 10, 20, 30, 40 and 50 topics). We recommend trying several values if possible and selecting the best model based on the highest likelihood. |
nCores |
Number of cores to use. The default is 1, but when several models with different numbers of topics are being fitted, we recommend increasing it to the number of models (or to the capacity of the machine). Parallelization is done with snow. |
seed |
Seed for the assignment initialization, used to make results reproducible. |
iterations |
Number of iterations over the data set. By default, 500 iterations are run. However, we advise using logLikelihoodByIter to check whether the log likelihood of the model has stabilized with these parameters. |
burnin |
Number of iterations to discard from the assignment counting. By default, 250 iterations are discarded. This number must be lower than the number of iterations. |
alpha |
Scalar value indicating the (symmetric) Dirichlet hyperparameter for topic proportions. By default, it is set to 50. |
alphaByTopic |
Logical indicating whether the scalar given in alpha is divided by the number of topics. By default, it is set to TRUE. |
beta |
Scalar value indicating the (symmetric) Dirichlet hyperparameter for topic multinomials. By default, it is set to 0.1. |
returnType |
Defines what is returned to the cisTopic object: either 'allModels' or 'selectedModel'. 'allModels' returns a list with all the fitted models (as lists) in object@models, while 'selectedModel' returns the model with the best log likelihood in object@selected.model, and a data frame with the log likelihood of the other models in object@log.lik. By default, this function returns all models so that a model can be selected afterwards; however, note that if the number of models and the size of the data are large, returning all models may be memory-intensive. |
addModels |
Whether new models should be added to a pre-existing list of models (TRUE) or should overwrite it (FALSE). If TRUE, parameters are set to match the existing models. |
tmp |
Folder to save intermediate models. |
... |
See |
The selected parameters are adapted from Griffiths & Steyvers (2004).
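As a rough illustration of how alpha and alphaByTopic interact, the sketch below computes the value used for each model under the assumption that alphaByTopic = TRUE simply divides alpha by the number of topics, as the argument description states. effectiveAlpha is a hypothetical helper written for this example; it is not part of cisTopic.

```r
# Hypothetical helper (not a cisTopic function): the alpha value that
# would apply to each model, assuming alphaByTopic = TRUE means alpha / nTopics.
effectiveAlpha <- function(alpha, nTopics, alphaByTopic = TRUE) {
  if (alphaByTopic) {
    alpha / nTopics
  } else {
    rep(alpha, length(nTopics))
  }
}

# Default topic numbers and default alpha from the usage above
topic <- c(2, 10, 20, 30, 40, 50)
alphaTable <- data.frame(topic = topic, alpha = effectiveAlpha(50, topic))
print(alphaTable)
```

With the defaults, this corresponds to the 50/T setting for alpha used by Griffiths & Steyvers (2004), together with beta = 0.1.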
Returns a cisTopic object with the models stored in object@models. If returnType = 'selectedModel', only the best model based on log likelihood is returned in object@selected.model, and the log likelihood values of the remaining models are stored in object@log.lik.
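The selection by log likelihood can be pictured with a small base R sketch. The log likelihood values here are invented placeholders, not cisTopic output, and the column names are assumptions made for this example; the actual layout of object@log.lik may differ.

```r
# Placeholder values, invented for illustration only (not real results).
logLik <- data.frame(
  topic = c(2, 10, 20),
  LL = c(-1500, -1200, -1350)
)

# The model with the highest log likelihood is the one that would be kept
bestIndex <- which.max(logLik$LL)
best <- logLik[bestIndex, ]

# The remaining rows stand in for the values kept for the other models
others <- logLik[-bestIndex, ]
```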
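library(cisTopic)
# Example inputs: BAM files and a BED file with the regions
bamfiles <- c('example_1.bam', 'example_2.bam', 'example_3.bam')
regions <- 'example.bed'
# Build the cisTopic object, then fit the models with the default settings
cisTopicObject <- createcisTopicObjectFromBAM(bamfiles, regions)
cisTopicObject <- runCGSModels(cisTopicObject)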
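Before a longer run, it may help to derive nCores from the number of models and to check that burnin is lower than iterations, as the argument descriptions recommend. This is a minimal base R sketch; capping nCores at the value reported by parallel::detectCores() is an assumption of this example, not something runCGSModels does itself.

```r
# Settings mirroring the defaults in the usage above
topic <- c(2, 10, 20, 30, 40, 50)
iterations <- 500
burnin <- 250

# burnin has to be lower than the number of iterations
stopifnot(burnin < iterations)

# One core per model, capped at what the machine reports.
# detectCores() can return NA, so fall back to a single core.
available <- parallel::detectCores()
if (is.na(available)) available <- 1L
nCores <- min(length(topic), available)
```

The resulting values can then be passed on, for example as runCGSModels(cisTopicObject, topic = topic, nCores = nCores, iterations = iterations, burnin = burnin).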