HMMRATAC: Perform HMMRATAC

Description Usage Arguments Value Examples

View source: R/HMMRATAC.R

Description

Perform HMMRATAC

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
MMRATAC <- function(bam,
                     index,
                     genome,
                     means = c(50, 200, 400, 600),
                     stddev = c(20, 20, 20, 20),
                     fragem = TRUE,
                     minmapq = 30,
                     upper = 20,
                     lower = 10,
                     zscore = 100,
                     zscoreprescan = 0,
                     output_dir = tempdir(),
                     blacklist = tempfile(fileext=".bed"),
                     peaks = TRUE,
                     kmeans = 3,
                     training,
                     bedgraph = FALSE,
                     minlen = 200,
                     score = c("max", "ave", "med", "fc", "zscore", "all"),
                     bgscore = FALSE,
                     trim = 0,
                     window = 25000000,
                     model,
                     modelonly = FALSE)

Arguments

bam

Sorted BAM file containing the ATAC-seq reads.

index

Index file for the sorted BAM File.

genome

Two column, tab delimited file containing genome size stats.

means

numeric vector of initial mean values for the fragment distribution. Default = c(50,200,400,600).

stddev

numeric vector of initial standard deviation values for fragment distribution. Default = c(20,20,20,20).

fragem

logical(1) of whether to perform EM training on the fragment distribution. Default = TRUE.

minmapq

numeric(1) of ,inimum mapping quality of reads to keep. Default = 30.

upper

numeric(1) of Upper limit on fold change range for choosing training sites. Default = 20.

lower

numeric(1) of lower limit on fold change range for choosing training sites. Default = 10.

zscore

numeric(1) of Zscored read depth to mask during Viterbi decoding. Default = 100.

zscoreprescan

numeric(1) of Minimum zscored read depth to be included in Viterbi decoding. Default = 0.

output

character(1) base name (including directory path) of output files. Default: file.path(tempfile(), basename(bam)).

blacklist

character(1) of name of bed file of blacklisted regions to exclude.

peaks

logical(1) of whether to report peaks in bed format. Default = TRUE.

kmeans

numeric(1) of number of states in model. Default = 3. If not k=3, recommend NOT calling peaks, use bedgraph.

training

character(1) of name of BED file of training regions to use for training model, instead of foldchange settings.

bedgraph

logical(1) Whether to report whole genome bedgraph of all state anntations. Default = FALSE.

minlen

numeric(1) of minimum length of open region to call peak. Note: peaks must be set. Default = 200.

score

character(1) either "max", "ave", "med", "fc", "zscore", "all". What type of score system to use for peaks. Can be used for ranking peaks. Default = max.

bgscore

logical(1) of whether to add the HMMR score to each state annotation in bedgraph. Note: this adds considerable time. Default = FALSE.

trim

numeric(1) of how many signals from the end to trim off (ie starting with tri signal then di etc). This may be useful if your data doesn't contain many large fragments. Default = 0.

window

numeric(1) of size of the bins to split the genome into for Viterbi decoding. To save memory, the genome is split into <int> long bins and viterbi decoding occurs across each bin. Default = 25000000. Note: For machines with limited memory, it is recomended to reduce the size of the bins.

model

character(1) of the name of a binary model file (generated from previous HMMR run) to use instead of creating new one.

modelonly

logical(1) of whether or not to stop the program after generating model. Default = false.

overwrite

logical(1) overwrite existing 'output' files?

verbose

logical(1) report progress?

Value

matrix A matrix(?)

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
example_data <- system.file(package="HMMRATACData", "data")
example_files <- dir(example_data, recursive = TRUE, full = TRUE)
basename(example_files)

output_directory <- tempfile()
dir.create(output_directory)
output <- file.path(
    output_directory,
    sub(".bam$", "", basename(example_files)[[1]])
)
output

outputs <- HMMRATAC(
   bam = example_files[[1]],
   index = example_files[[2]],
   genome = example_files[[3]],
   output = output,
   window = 2500000
)
outputs

Bioconductor/HMMRATAC documentation built on July 31, 2020, 3:07 p.m.