Splice junction and exon prediction from BAM files

Description

Splice junctions and exons are predicted for each sample and merged across samples. Terminal exons are filtered and trimmed, if applicable. For details, see the help pages for predictTxFeaturesPerSample, mergeTxFeatures, and processTerminalExons.

Usage

predictTxFeatures(sample_info, which = NULL, alpha = 2, psi = 0,
  beta = 0.2, gamma = 0.2, min_junction_count = NULL, min_anchor = 1,
  max_complexity = 20, min_n_sample = 1, min_overhang = NA,
  verbose = FALSE, cores = 1)

Arguments

sample_info

Data frame with sample information. Required columns are “sample_name”, “file_bam”, “paired_end”, “read_length”, “frag_length” and “lib_size”. Library information can be obtained with function getBamInfo.
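As an illustration, the required layout of sample_info can be sketched in base R. The sample names, file paths and numeric values below are placeholders, not real data; in practice the library columns would come from getBamInfo.

```r
# Hypothetical sample table: every value here is a placeholder.
sample_info <- data.frame(
  sample_name = c("sample_A", "sample_B"),
  file_bam    = c("bams/sample_A.bam", "bams/sample_B.bam"),
  paired_end  = c(TRUE, TRUE),
  read_length = c(75L, 75L),
  frag_length = c(300, 200),
  lib_size    = c(1e7, 2e7),
  stringsAsFactors = FALSE
)

# Check that all columns required by predictTxFeatures are present.
required <- c("sample_name", "file_bam", "paired_end",
              "read_length", "frag_length", "lib_size")
missing_cols <- setdiff(required, names(sample_info))
stopifnot(length(missing_cols) == 0)
```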

which

GRanges of genomic regions to be considered for feature prediction, passed to ScanBamParam

alpha

Minimum FPKM required for a splice junction to be included. Internally, FPKMs are converted to counts, requiring arguments read_length, frag_length and lib_size. alpha is ignored if argument min_junction_count is specified.
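The conversion can be sketched from the general definition of FPKM (fragments per kilobase per million fragments). The helper fpkm_to_count and the effective_length value are assumptions made for illustration; how SGSeq derives the effective length of a junction from read_length and frag_length is not described here, so this is not the package's internal code.

```r
# Generic FPKM definition rearranged for the count:
#   count = FPKM * (effective length in kb) * (library size in millions)
# fpkm_to_count is a hypothetical helper, not an SGSeq function.
fpkm_to_count <- function(fpkm, effective_length, lib_size) {
  fpkm * (effective_length / 1e3) * (lib_size / 1e6)
}

# Placeholder values: alpha = 2, an assumed effective length of 100 bp,
# and a library of 20 million fragments.
min_count <- fpkm_to_count(fpkm = 2, effective_length = 100, lib_size = 2e7)
min_count  # 2 * 0.1 * 20, i.e. about 4 fragments
```

Under this definition the same alpha corresponds to a higher count threshold in a deeper library, which is why lib_size is needed for each sample.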

psi

Minimum splice frequency required for a splice junction to be included

beta

Minimum relative coverage required for an internal exon to be included

gamma

Minimum relative coverage required for a terminal exon to be included

min_junction_count

Minimum fragment count required for a splice junction to be included. If specified, argument alpha is ignored.

min_anchor

Integer specifying minimum anchor length

max_complexity

Maximum allowed complexity. If a locus exceeds this threshold, it is skipped, resulting in a warning. Complexity is defined as the maximum number of unique predicted splice junctions overlapping a given position. High complexity regions are often due to spurious read alignments and can slow down processing. To disable this filter, set to NA.
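The definition of complexity above can be illustrated with a small base R sketch. It treats each junction as a closed interval and computes the largest number of unique junctions covering any single position. The function junction_complexity and the coordinates are hypothetical; this is one way to compute the quantity and is not taken from the SGSeq source.

```r
# Hypothetical helper: maximum number of unique junctions (closed
# intervals [start, end]) that overlap any single position.
junction_complexity <- function(start, end) {
  junctions <- unique(data.frame(start = start, end = end))
  n <- nrow(junctions)
  if (n == 0) return(0)
  # Sweep line: +1 where a junction starts, -1 one position past its end.
  pos   <- c(junctions$start, junctions$end + 1)
  delta <- c(rep(1, n), rep(-1, n))
  # At equal positions, apply the -1 events before the +1 events.
  ord <- order(pos, delta)
  max(cumsum(delta[ord]))
}

# Placeholder coordinates; the second and third junctions are duplicates.
complexity <- junction_complexity(
  start = c(100, 150, 150, 300),
  end   = c(200, 250, 250, 400)
)
complexity        # 2: two unique junctions overlap positions 150 to 200
complexity > 20   # FALSE, so a max_complexity of 20 would not skip this locus
```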

min_n_sample

Minimum number of samples a feature must be observed in to be included

min_overhang

Minimum overhang required to suppress filtering or trimming of predicted terminal exons (see the manual page for processTerminalExons). Use NULL to disable processing, which is useful when results are subsequently merged with other predictions and processing is postponed until after the merging step.

verbose

If TRUE, generate messages indicating progress

cores

Number of cores available for parallel processing

Value

TxFeatures object

Author(s)

Leonard Goldstein

Examples

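library(SGSeq)
# si (sample information) and gr (genomic region) are example data shipped with SGSeq
path <- system.file("extdata", package = "SGSeq")
si$file_bam <- file.path(path, "bams", si$file_bam)  # point file_bam at the bundled BAM files
txf <- predictTxFeatures(si, gr)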