In daewoooo/SVbyEye: Visualization of genomic structural variants

paf2FASTA

R Documentation

Export FASTA sequences from a set of alignments reported in a PAF-formatted file.

Description

Export FASTA sequences from a set of alignments reported in a PAF-formatted file.

Usage

paf2FASTA(
  paf.table,
  alignment.space = "query",
  order.by = "query",
  bsgenome = NULL,
  asm.fasta = NULL,
  majority.strand = NULL,
  revcomp = NULL,
  report.longest.aln = FALSE,
  report.query.name = NULL,
  concatenate.aln = TRUE,
  fasta.save = NULL,
  return = "fasta"
)

Arguments

`paf.table`	A `data.frame` or `tibble` containing one or more PAF records with the 12 mandatory columns, along with a CIGAR string defined in the 'cg' column.
`alignment.space`	Which alignment coordinates should be exported as FASTA, either 'query' or 'target' (Default: 'query').
`order.by`	Order alignments by either 'query' or 'target' coordinates (Default: 'query').
`bsgenome`	A `BSgenome-class` object of the reference genome to get the genomic sequence from.
`asm.fasta`	An assembly FASTA file from which to extract the DNA sequence of the regions defined by the PAF alignments.
`majority.strand`	A desired majority strand directionality to be reported.
`revcomp`	If set to `TRUE`, the FASTA sequence will be reverse complemented regardless of the value defined in 'majority.strand'.
`report.longest.aln`	If set to `TRUE`, only the sequence with the most aligned bases will be reported in the final FASTA file.
`report.query.name`	A single query (contig) name/id to be reported as FASTA sequence.
`concatenate.aln`	Set to `TRUE` if multiple aligned contigs should be concatenated, separated by 100 N's, into a single FASTA sequence (Default: `TRUE`).
`fasta.save`	A path to a file in which to store the final FASTA sequence.
`return`	Set to either 'fasta' or 'index' to return either the FASTA sequence as a `DNAStringSet-class` object or the region index as a `GRanges-class` object (Default: 'fasta').
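
To make the effect of `concatenate.aln` and `revcomp` concrete, the sketch below mimics both operations on plain character strings in base R. It is only an illustration under stated assumptions: the toy sequences and the helper `revCompString` are hypothetical and are not part of SVbyEye, and the package itself may implement these steps differently.

```r
## Toy sequences standing in for the sequences of two aligned contigs
## (hypothetical data, not taken from the package).
contig.seqs <- c(contig1 = "ACGTTGCA", contig2 = "GGATCC")

## Analogue of concatenate.aln = TRUE: join contigs with a spacer of 100 N's.
n.spacer <- strrep("N", 100)
concat.seq <- paste(contig.seqs, collapse = n.spacer)
nchar(concat.seq) # 8 + 100 + 6 = 114

## Analogue of revcomp = TRUE: complement each base, then reverse the string.
## 'revCompString' is a hypothetical helper written for this sketch.
revCompString <- function(x) {
    comp <- chartr("ACGTN", "TGCAN", x)
    paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}
revCompString("AACGN") # "NCGTT"
```

The N spacer keeps contig boundaries visible in the concatenated sequence, which is why the total length grows by 100 bases per junction.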

Value

A `DNAStringSet-class` object with the exported sequence or, if `return` is set to 'index', a `GRanges-class` object with the region index.
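
The two return types can be compared side by side. The following is a minimal sketch that reuses the test files shipped with the package and only the arguments listed in Usage; the temporary output path `out.file` and the variable names are assumptions made for illustration, and the body runs only when SVbyEye is installed.

```r
## Temporary file for the exported FASTA (hypothetical output location)
out.file <- tempfile(fileext = ".fasta")

if (requireNamespace("SVbyEye", quietly = TRUE)) {
    paf.file <- system.file("extdata", "test4.paf", package = "SVbyEye")
    paf.table <- SVbyEye::readPaf(
        paf.file = paf.file, include.paf.tags = TRUE, restrict.paf.tags = "cg"
    )
    asm.fasta <- system.file("extdata", "test4_query.fasta", package = "SVbyEye")

    ## Default return = "fasta": sequence object, also saved to 'out.file'
    fasta.seq <- SVbyEye::paf2FASTA(
        paf.table = paf.table, alignment.space = "query",
        asm.fasta = asm.fasta, fasta.save = out.file
    )
    ## return = "index": region index instead of the sequence itself
    region.idx <- SVbyEye::paf2FASTA(
        paf.table = paf.table, alignment.space = "query",
        asm.fasta = asm.fasta, return = "index"
    )
    print(class(fasta.seq))
    print(class(region.idx))
}
```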

Author(s)

David Porubsky

Examples

## Get PAF to process ##
paf.file <- system.file("extdata", "test4.paf", package = "SVbyEye")
## Read in PAF
paf.table <- readPaf(paf.file = paf.file, include.paf.tags = TRUE, restrict.paf.tags = "cg")
## Get FASTA using query alignment coordinates ##
## Define assembly FASTA to get the sequence from
asm.fasta <- system.file("extdata", "test4_query.fasta", package = "SVbyEye")
paf2FASTA(paf.table = paf.table, alignment.space = "query", asm.fasta = asm.fasta)

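## Get FASTA using target alignment coordinates ##
## Define BSgenome object to get the sequence from
## (runs only if the hg38 BSgenome package is installed)
if (requireNamespace("BSgenome.Hsapiens.UCSC.hg38", quietly = TRUE)) {
    paf2FASTA(
        paf.table = paf.table, alignment.space = "target",
        bsgenome = BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
    )
}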

daewoooo/SVbyEye documentation built on Feb. 28, 2025, 12:52 a.m.