tx_makeDT_nucFreq: Summarized Nucleotide Frequency data.table

View source: R/tx_core.R

tx_makeDT_nucFreq    R Documentation

Summarized Nucleotide Frequency data.table

Description

This function constructs a list of data.tables containing nucleotide frequency metrics for each nucleotide position, by transcript. The following symbols are tallied:

  • A = Adenine

  • C = Cytosine

  • G = Guanine

  • T = Thymine

  • N = Undetermined nucleotide

  • - = Deletion

  • . = Insert, the unread gap between read1 and read2

The function requires a GRangesList object, as output by the tx_reads function, which should contain sequence alignments in the transcriptomic space, and a gene annotation in GRanges format, as loaded by the tx_load_bed function.

Usage

tx_makeDT_nucFreq(
  x,
  geneAnnot,
  genome = NULL,
  simplify_IUPAC = "splitForceInt",
  fullDT = FALSE,
  nCores = 1
)

Arguments

x

CompressedGRangesList. Genomic Ranges list containing genomic alignment data by gene. Constructed via the tx_reads function.

geneAnnot

GRanges. Gene annotation as loaded by tx_load_bed()

genome

list. The full reference genome sequences, as loaded by tx_load_genome() or prepackaged by BSgenome, see ?BSgenome::available.genomes

simplify_IUPAC

string. Available options are:

  • "not": Outputs the complete nucleotide frequency table, including ambiguous reads, using the IUPAC ambiguity code. See: IUPAC_CODE_MAP

  • "splitForceInt" (Default): Forces an integer split. The frequency of each ambiguous code is halved and assigned to its respective nucleotides; if the frequency is an odd number, the leftover count is assigned to "N".

  • "splitHalf": Ambiguous nucleotide frequencies are split in half between their corresponding nucleotides. When the frequency is odd, this creates non-integer frequencies.
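The difference between the two splitting options can be illustrated with a minimal base R sketch. The helper split_ambiguous() is hypothetical and is not part of txtools; it only mirrors the arithmetic described above and assumes an ambiguity code that stands for exactly two nucleotides (for example "R", which is A or G). How the package treats codes covering more than two nucleotides is not stated here.

```r
# Hypothetical helper (not a txtools function): distributes the count of one
# two-base IUPAC ambiguity code over the nucleotides it stands for.
split_ambiguous <- function(count, bases,
                            mode = c("splitForceInt", "splitHalf")) {
  mode <- match.arg(mode)
  out <- c(A = 0, C = 0, G = 0, T = 0, N = 0)
  if (mode == "splitHalf") {
    # Plain halving; odd counts give non-integer frequencies
    out[bases] <- count / 2
  } else {
    # Integer halving; the leftover read of an odd count goes to "N"
    out[bases] <- count %/% 2
    out["N"] <- count %% 2
  }
  out
}

# Five reads carrying the ambiguity code "R" (A or G)
split_ambiguous(5, c("A", "G"), "splitForceInt")  # A = 2, G = 2, N = 1
split_ambiguous(5, c("A", "G"), "splitHalf")      # A = 2.5, G = 2.5
```

With "splitForceInt" the total count is preserved and every column stays an integer, at the cost of moving one read to "N" for odd counts.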

fullDT

logical. Set to TRUE to output a data.table with all genes, in the same order as in the 'geneAnnot' object.

nCores

integer. Number of cores to run the function with. Multicore capability is not available on Windows.

Details

This function can use multiple cores to reduce processing time on UNIX-like operating systems.
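As a rough illustration of why multicore processing is limited to UNIX-like systems, the sketch below uses parallel::mclapply(), which relies on forking and does not support more than one core on Windows. Whether tx_makeDT_nucFreq uses mclapply() internally is an assumption, and pick_cores() is a hypothetical helper, not a txtools function.

```r
library(parallel)

# Hypothetical helper: fall back to a single core on Windows, where
# fork-based parallelism is unavailable.
pick_cores <- function(nCores) {
  if (.Platform$OS.type == "windows") 1L else as.integer(nCores)
}

# Apply a toy function to each element, using up to 2 cores where supported
squares <- mclapply(1:4, function(i) i^2, mc.cores = pick_cores(2))
unlist(squares)
```

A guard of this kind lets the same call run unchanged on any operating system, although only UNIX-like systems get a speed-up.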

Value

data.table

Author(s)

M.A. Garcia-Campos

See Also

Other makeDT functions: tx_makeDT_covNucFreq(), tx_makeDT_coverage()


AngelCampos/txtools documentation built on Sept. 16, 2024, 10:25 p.m.