processLastAlignedBase: processLastAlignedBase


Description

Sorts read pairs into two categories: facing and fully repetitive.

Usage

  processLastAlignedBase(con, table_name, input_file,
    imd_dist = NULL, upper_limit = 0.95,
    margin = (read_length/2), read_length = 75,
    min_repeat_length = (read_length/2), output_dir = ".",
    overwrite = FALSE, append_chr = TRUE)
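
In the signature above, the defaults for margin and min_repeat_length are written in terms of read_length. R evaluates default arguments lazily, so these defaults follow whatever read_length the caller supplies. The toy function below is a hypothetical stand-in (show_defaults is not part of RECD) that copies only those three defaults to illustrate the behavior:

```r
# Hypothetical stand-in that mirrors three defaults from the usage above.
# It is not the RECD implementation; it only reports the resolved values.
show_defaults <- function(margin = (read_length/2), read_length = 75,
                          min_repeat_length = (read_length/2)) {
  c(read_length = read_length,
    margin = margin,
    min_repeat_length = min_repeat_length)
}

show_defaults()                                 # margin and min_repeat_length are 37.5
show_defaults(read_length = 100)                # both defaults become 50
show_defaults(read_length = 100, margin = 20)   # an explicit margin overrides its default
```

A caller who changes read_length therefore does not need to restate margin or min_repeat_length unless a value other than half the read length is wanted.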

Arguments

con

database connection

table_name

name of table in database

input_file

last_aligned_base.tsv file

imd_dist

.insert_len_dist file (i.e. the inner mate distance distribution for the sample)

upper_limit

values between 0 and 1 are passed as the probs argument of quantile(), and the resulting quantile is used as the maximum allowed distance from the repeat. Values greater than 1 are applied directly as the maximum. Facing reads whose ends are farther from the repeat than this maximum will not be used. [default= 0.95]

read_length

read length [default= 75]

output_dir

output directory to write results [default= "."]

overwrite

overwrite output files if they exist [default= FALSE]

append_chr

appends "chr" to chromosome names of inner_mate_ranges file [default= TRUE]

max_gap

maximum distance between the last aligned base and the start or end of the repeat [default = 500]

margin

maximum bp overlap between each read and a given repeat [default= read_length/2]

min_repeat_length

minimum repeat length in reference genome [default= read_length/2]
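
The two modes of upper_limit described above can be sketched as follows. This is a minimal illustration, not the package's code: resolve_upper_limit is a hypothetical helper, and it assumes the inner mate distances are already available as a numeric vector (the layout of the .insert_len_dist file is not described here). Treating a value of exactly 1 as a quantile is also an assumption.

```r
# Hypothetical helper, not part of RECD.
# imd: numeric vector of inner mate distances (assumed input format).
resolve_upper_limit <- function(imd, upper_limit = 0.95) {
  stopifnot(is.numeric(imd), length(imd) > 0, upper_limit > 0)
  if (upper_limit <= 1) {
    # Fractional value: interpret as a probability and take that quantile.
    unname(quantile(imd, probs = upper_limit))
  } else {
    # Value above 1: use it directly as the maximum distance in bp.
    upper_limit
  }
}

imd <- 1:100
resolve_upper_limit(imd, 0.95)  # quantile-based maximum
resolve_upper_limit(imd, 500)   # fixed maximum of 500
```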

Details

Description of output files:

facing

one read flanks or is anchored outside the repeat (minimum = margin) while its pair aligned inside the repeat (or didn't meet minimum anchor requirements)

fully_repetitive

both reads in pair aligned inside the repeat (or didn't meet minimum anchor requirements)
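
One way to read the two categories is as a rule on how many reads in a pair are anchored outside the repeat. The sketch below is an interpretation under stated assumptions, not the package's logic: classify_pair is a hypothetical helper, its inputs are assumed to be the number of bases each read aligns outside the repeat, and a read is assumed to count as anchored when that number is at least margin.

```r
# Hypothetical helper for illustration; not a RECD function.
# anchor_1, anchor_2: bases of each read aligned outside the repeat (assumed inputs).
classify_pair <- function(anchor_1, anchor_2, margin = 75 / 2) {
  anchored <- c(anchor_1, anchor_2) >= margin
  if (sum(anchored) == 1) {
    "facing"             # exactly one read meets the anchor requirement
  } else if (!any(anchored)) {
    "fully_repetitive"   # neither read meets the anchor requirement
  } else {
    NA_character_        # both reads anchored: outside the two documented categories
  }
}

classify_pair(75, 0)
classify_pair(10, 0)
classify_pair(75, 75)
```

Returning NA for pairs with both reads anchored is a choice made for this sketch; the text above only defines the facing and fully_repetitive outputs.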

Value

_facing and _fully_repetitive tsv files

Author(s)

Adam Struck - Intern


adamstruck/RECD documentation built on May 10, 2019, 5:51 a.m.