fit_TOP_M5_model: Fits TOP model with M5 bins
In HarteminkLab/TOP: Predict transcription factor occupancy using DNase- or ATAC-seq data

Description

Fits the TOP model with M5 bins. By default, it runs Gibbs sampling for all 10 partitions in parallel on 10 CPU cores and returns a list of posterior samples, one entry per partition. Alternatively, you may fit the model for each of the 10 partitions on separate machines by specifying which partition(s) to run with `partitions`.

Usage

fit_TOP_M5_model(
  all_training_data,
  all_training_data_files,
  model_file,
  logistic_model = FALSE,
  transform = c("asinh", "log2", "sqrt", "none"),
  partitions = 1:10,
  n_iter = 2000,
  n_burnin = floor(n_iter/2),
  n_chains = 3,
  n_thin = max(1, floor((n_iter - n_burnin)/1000)),
  n_cores = length(partitions),
  save = TRUE,
  outdir = "TOP_fit",
  return_type = c("samples", "jagsfit", "samplefiles"),
  quiet = FALSE
)
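
Several defaults in the signature above depend on other arguments: `n_burnin` is derived from `n_iter`, `n_thin` from both, and `n_cores` from `partitions`. As a rough illustration, the sketch below re-evaluates those default expressions in plain R, outside the package. `resolve_defaults` is a hypothetical helper written for this example and is not part of TOP.

```r
# Hypothetical helper (not part of TOP): re-evaluates the default
# expressions from the Usage signature for a given n_iter and partitions.
resolve_defaults <- function(n_iter = 2000, partitions = 1:10) {
  n_burnin <- floor(n_iter / 2)
  n_thin <- max(1, floor((n_iter - n_burnin) / 1000))
  n_cores <- length(partitions)
  list(n_burnin = n_burnin, n_thin = n_thin, n_cores = n_cores)
}

# Signature defaults: n_iter = 2000, partitions = 1:10
d_default <- resolve_defaults()

# A longer run restricted to a single partition
d_long <- resolve_defaults(n_iter = 10000, partitions = 3)

str(d_default)
str(d_long)
```

With the signature defaults this gives a burn-in of 1000, a thinning rate of 1 (no thinning), and 10 cores. With `n_iter = 10000` and a single partition, the default thinning rate becomes 5 and only one core is requested.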

Arguments

`all_training_data`	A list of the assembled training data of all partitions.
`all_training_data_files`	A vector of the assembled training data files of all partitions. If `all_training_data` is missing, it will load the training data from `all_training_data_files`.
`model_file`	TOP model file written in `JAGS`. By default, use the model file included in the TOP package.
`logistic_model`	Logical; which version of the TOP model to fit. If `TRUE`, fit the logistic (binary) model. If `FALSE` (default), fit the quantitative occupancy model.
`transform`	Type of transformation for ChIP-seq read counts. Options are: ‘asinh’ (asinh transformation), ‘log2’ (log2 transformation), ‘sqrt’ (square root transformation), and ‘none’ (no transformation). This only applies when `logistic_model = FALSE`.
`partitions`	A vector of selected partition(s) to run. Default: all 10 partitions. If you specify a few partitions, it will only fit models to data in those selected partitions.
`n_iter`	Number of total iterations per chain, including burn-in iterations.
`n_burnin`	Number of burn-in iterations, i.e. the number of samples to discard at the beginning of each chain. Default is `floor(n_iter/2)`, which discards the first half of the samples.
`n_chains`	Number of Markov chains (default: 3).
`n_thin`	Thinning rate; must be a positive integer. Default is `max(1, floor((n_iter - n_burnin)/1000))`, so thinning only takes effect when at least 2000 post-burn-in iterations remain per chain. No thinning is performed if `n_thin = 1`.
`n_cores`	Number of cores to use in parallel (default: equal to the number of partitions, i.e. `length(partitions)`).
`save`	Logical, if TRUE, saves posterior samples as ‘.rds’ files in `outdir`.
`outdir`	Directory to save TOP model posterior samples.
`return_type`	Type of result to return. Options: ‘samples’ (posterior samples), ‘jagsfit’ (`jagsfit` object), or ‘samplefiles’ (file names of posterior samples).
`quiet`	Logical, if TRUE, suppress model fitting messages. Otherwise, only show progress bars.
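
To give a feel for what the `transform` options do to count data, the sketch below applies each one to a few made-up read counts in plain R. It does not call the TOP package. `counts` and `transform_counts` are hypothetical names, and the pseudocount of 1 used for ‘log2’ is an assumption made here to keep zero counts finite; the package may treat zeros differently.

```r
# Hypothetical ChIP-seq read counts, for illustration only.
counts <- c(0, 1, 10, 100, 1000)

# Hypothetical helper (not part of TOP) mirroring the four 'transform' options.
# Assumption: log2 uses a pseudocount of 1 so that zero counts stay finite.
transform_counts <- function(x, transform = c("asinh", "log2", "sqrt", "none")) {
  transform <- match.arg(transform)
  switch(transform,
         asinh = asinh(x),
         log2 = log2(x + 1),
         sqrt = sqrt(x),
         none = x)
}

# One column per transformation, one row per count.
transformed <- sapply(c("asinh", "log2", "sqrt", "none"),
                      function(tr) transform_counts(counts, tr))
print(round(transformed, 2))
```

All three non-trivial transformations compress large counts, with `asinh` and `log2` doing so more strongly than `sqrt`, and `asinh` is defined at zero without needing a pseudocount.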

Value

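A list with one element per fitted partition. Depending on `return_type`, each element holds the posterior samples, a `jagsfit` object, or the file name of the saved posterior samples.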

Examples

## Not run: 
# Example to train TOP quantitative occupancy model:

# The example below first applies the 'asinh' transform to the ChIP-seq counts
# in 'assembled_training_data', then runs Gibbs sampling
# for each of the 10 partitions in parallel.
# It runs 5000 iterations of Gibbs sampling per chain,
# including 1000 burn-in iterations, with 3 Markov chains and a thinning rate of 2,
# and saves the posterior samples to the 'TOP_fit' directory.
all_TOP_samples <- fit_TOP_M5_model(assembled_training_data,
                                    logistic_model = FALSE,
                                    transform = 'asinh',
                                    n_iter = 5000,
                                    n_burnin = 1000,
                                    n_chains = 3,
                                    n_thin = 2,
                                    outdir = 'TOP_fit')

# We can also obtain the posterior samples separately for each partition.
# For example, to obtain the posterior samples for partition #3 only:
TOP_samples_part3 <- fit_TOP_M5_model(assembled_training_data,
                                      logistic_model = FALSE,
                                      transform = 'asinh',
                                      partitions = 3,
                                      n_iter = 5000,
                                      n_burnin = 1000,
                                      n_chains = 3,
                                      n_thin = 2,
                                      outdir = 'TOP_fit')


# Example to train TOP logistic (binary) model:
all_TOP_samples <- fit_TOP_M5_model(assembled_training_data,
                                    logistic_model = TRUE,
                                    n_iter = 5000,
                                    n_burnin = 1000,
                                    n_chains = 3,
                                    n_thin = 2,
                                    outdir = 'TOP_fit')


## End(Not run)
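
As a back-of-the-envelope check on the sampling settings used in the examples above, the sketch below estimates how many posterior draws would be retained. It is plain arithmetic in R and does not call the TOP package. It assumes that every `n_thin`-th post-burn-in iteration is kept in each chain; the exact count reported by the underlying JAGS interface may differ slightly.

```r
# Settings taken from the examples above.
n_iter <- 5000
n_burnin <- 1000
n_chains <- 3
n_thin <- 2

# Assumption: every n_thin-th post-burn-in iteration is kept in each chain.
kept_per_chain <- floor((n_iter - n_burnin) / n_thin)
kept_total <- n_chains * kept_per_chain

cat("Draws kept per chain:", kept_per_chain, "\n")
cat("Draws kept in total: ", kept_total, "\n")
```

Under that assumption, each partition would yield about 2000 draws per chain and about 6000 draws across the 3 chains.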

HarteminkLab/TOP documentation built on June 11, 2025, 5:34 p.m.
