paf2nucleotideContent: Add FASTA sequence content to PAF alignments.
In daewoooo/SVbyEye: Visualization of genomic structural variants

paf2nucleotideContent

R Documentation

Add FASTA sequence content to PAF alignments.

Description

This function takes a PAF table and, for each alignment (row), reports counts and frequencies of a user-defined 'sequence.pattern' (an exact DNA pattern, e.g. 'GA') or 'nucleotide.content' (such as sequence GC content).
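To make the two kinds of summary concrete, here is a minimal base-R sketch on a toy sequence. It does not use SVbyEye internals: the helpers `count.pattern` and `count.content` and the variable `toy.seq` are hypothetical names introduced for illustration, and treating frequency as count divided by sequence length is an assumption about how the reported frequencies are defined.

```r
# Toy sequence standing in for the DNA of a single PAF alignment (made-up data)
toy.seq <- "GAGACCGGATTA"

# Count exact, non-overlapping matches of a pattern,
# analogous in spirit to sequence.pattern = "GA"
count.pattern <- function(dna, pattern) {
  hits <- gregexpr(pattern, dna, fixed = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

# Count bases that belong to a set of nucleotides,
# analogous in spirit to nucleotide.content = "GC"
count.content <- function(dna, nucleotides) {
  bases <- strsplit(dna, "")[[1]]
  sum(bases %in% strsplit(nucleotides, "")[[1]])
}

ga.count <- count.pattern(toy.seq, "GA")
gc.count <- count.content(toy.seq, "GC")
# Assumption: frequency = count / sequence length
gc.freq <- gc.count / nchar(toy.seq)
```

The distinction is that a pattern is matched as a whole string, while nucleotide content counts every base belonging to the given set regardless of order.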

Usage

paf2nucleotideContent(
  paf.table = NULL,
  asm.fasta = NULL,
  alignment.space = NULL,
  sequence.pattern = NULL,
  nucleotide.content = NULL
)

Arguments

`paf.table`	A `data.frame` or `tibble` containing one or more PAF records with the 12 mandatory columns, along with a CIGAR string defined in the 'cg' column.
`asm.fasta`	An assembly FASTA file from which to extract the DNA sequence of the defined PAF alignments.
`alignment.space`	Which alignment coordinates should be exported as FASTA, either 'query' or 'target' (Default: 'query').
`sequence.pattern`	A user-defined DNA sequence pattern whose occurrences will be counted in the submitted 'fasta.seq' (e.g. set 'sequence.pattern' to 'GA' to obtain counts of all 'GA' occurrences per FASTA sequence).
`nucleotide.content`	User-defined nucleotides whose total content will be counted in the submitted 'fasta.seq' (e.g. to obtain GC content, set 'nucleotide.content' to 'GC').

Value

A `tibble` of PAF alignments with extra sequence content columns.

Author(s)

David Porubsky

Examples

## Get PAF to process ##
paf.file <- system.file("extdata", "test_getFASTA.paf", package = "SVbyEye")
## Read in PAF
paf.table <- readPaf(paf.file = paf.file, include.paf.tags = TRUE, restrict.paf.tags = "cg")
## Split PAF alignments into user defined bins
paf.table <- pafToBins(paf.table = paf.table, binsize = 10000)
## Get FASTA to process
asm.fasta <- system.file("extdata", "test_getFASTA_query.fasta", package = "SVbyEye")
## Add sequence and nucleotide content to submitted paf.table
paf2nucleotideContent(
    paf.table = paf.table, asm.fasta = asm.fasta,
    alignment.space = "query", sequence.pattern = "GA"
)
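As a complementary sketch, the same bundled test files can be used to request nucleotide content instead of a pattern count, by setting 'nucleotide.content' to 'GC' as described under Arguments. This assumes the function accepts 'nucleotide.content' on its own with 'sequence.pattern' left at its default. The result name `gc.table` is introduced here for illustration, and the block is guarded so that it only runs when SVbyEye is installed.

```r
# Run only if the SVbyEye package is available
if (requireNamespace("SVbyEye", quietly = TRUE)) {
  library(SVbyEye)
  ## Read in the bundled test PAF, keeping the 'cg' (CIGAR) tag
  paf.file <- system.file("extdata", "test_getFASTA.paf", package = "SVbyEye")
  paf.table <- readPaf(paf.file = paf.file, include.paf.tags = TRUE, restrict.paf.tags = "cg")
  ## Split PAF alignments into 10 kbp bins
  paf.table <- pafToBins(paf.table = paf.table, binsize = 10000)
  ## Bundled query FASTA matching the test PAF
  asm.fasta <- system.file("extdata", "test_getFASTA_query.fasta", package = "SVbyEye")
  ## Request GC content per binned alignment (assumed to work without 'sequence.pattern')
  gc.table <- paf2nucleotideContent(
      paf.table = paf.table, asm.fasta = asm.fasta,
      alignment.space = "query", nucleotide.content = "GC"
  )
}
```

Inspecting `colnames(gc.table)` afterwards shows which sequence content columns were appended to the original PAF columns.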

daewoooo/SVbyEye documentation built on Feb. 28, 2025, 12:52 a.m.