paf2FASTA: Export FASTA sequences from a set of alignments reported in...

View source: R/pafToFasta.R

paf2FASTA    R Documentation

Export FASTA sequences from a set of alignments reported in a PAF-formatted file.

Description

Export FASTA sequences from a set of alignments reported in a PAF-formatted file.

Usage

paf2FASTA(
  paf.table,
  alignment.space = "query",
  order.by = "query",
  bsgenome = NULL,
  asm.fasta = NULL,
  majority.strand = NULL,
  revcomp = NULL,
  report.longest.aln = FALSE,
  report.query.name = NULL,
  concatenate.aln = TRUE,
  fasta.save = NULL,
  return = "fasta"
)

Arguments

paf.table

A data.frame or tibble containing one or more PAF records with the 12 mandatory columns, along with the CIGAR string defined in the 'cg' column.
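To make the expected input shape concrete, here is a minimal sketch of a single-record PAF table built by hand. The column names (q.name, q.len, and so on) are an assumption based on what readPaf is understood to return, and the values are invented for illustration only.

```r
## Hypothetical one-record PAF table: 12 mandatory PAF fields plus the 'cg' tag.
## Column names are assumed, not taken from this help page.
paf.toy <- data.frame(
    q.name = "contig1", q.len = 1000L, q.start = 100L, q.end = 600L,
    strand = "+",
    t.name = "chr1", t.len = 5000L, t.start = 2000L, t.end = 2500L,
    n.match = 500L, aln.len = 500L, mapq = 60L,
    cg = "500M", ## CIGAR string: 500 aligned bases
    stringsAsFactors = FALSE
)
str(paf.toy)
```

In practice the table would normally come from readPaf, as in the Examples section, rather than being typed in.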

alignment.space

Which alignment coordinates should be exported as FASTA, either 'query' or 'target' (Default : 'query').

order.by

Order alignments by either 'query' or 'target' coordinates (Default : 'query').

bsgenome

A BSgenome-class object of the reference genome to get the genomic sequence from.

asm.fasta

An assembly FASTA file to extract the DNA sequence of the defined PAF alignments from.

majority.strand

A desired majority strand directionality to be reported.

revcomp

If set to TRUE, the FASTA sequence will be reverse complemented regardless of the value defined in 'majority.strand'.
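As a reminder of what reverse complementing does to a sequence, here is a small base-R sketch. The helper revcomp.toy is hypothetical and handles a single plain character string. It is not the package's implementation, which presumably operates on Biostrings objects.

```r
## Hypothetical helper: reverse complement of one DNA string in base R.
revcomp.toy <- function(dna) {
    ## Complement each base (upper and lower case), leaving other symbols such as N as-is
    comp <- chartr("ACGTacgt", "TGCAtgca", dna)
    ## Reverse the order of the complemented bases
    paste(rev(strsplit(comp, "", fixed = TRUE)[[1]]), collapse = "")
}
revcomp.toy("AACGT") ## "ACGTT"
```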

report.longest.aln

If set to TRUE, only the sequence with the most aligned bases will be reported in the final FASTA file.
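One plausible reading of "most aligned bases" is the contig whose alignments sum to the largest aligned length. The sketch below illustrates that reading on a toy table. The column names q.name and aln.len are assumptions, and whether the function sums per contig or ranks single alignments is not stated on this page.

```r
## Toy alignments: contigA has two alignments, contigB has one.
aln.toy <- data.frame(
    q.name = c("contigA", "contigB", "contigA"),
    aln.len = c(500L, 900L, 700L),
    stringsAsFactors = FALSE
)
## Total aligned bases per contig
aligned.bases <- tapply(aln.toy$aln.len, aln.toy$q.name, sum)
## Name of the contig with the largest total
longest <- names(aligned.bases)[which.max(aligned.bases)]
## Keep only the alignments of that contig
aln.toy[aln.toy$q.name == longest, ]
```

Here contigA is kept because its two alignments add up to 1200 bases, more than the single 900-base alignment of contigB.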

report.query.name

A single query (contig) name/id to be reported as FASTA sequence.

concatenate.aln

Set to TRUE if multiple aligned contigs should be concatenated, separated by 100 N's, into a single FASTA sequence (Default : TRUE).
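The effect of joining contigs with a run of 100 N's can be sketched in base R as follows. The sequences and names are invented, and this only illustrates the idea; it is not the code the package runs.

```r
## Two toy contig sequences (hypothetical names and bases)
seqs <- c(contig1 = "ACGTAC", contig2 = "GGCCTA")
## Spacer of 100 N's placed between consecutive contigs
gap <- strrep("N", 100)
## Single concatenated sequence
concat <- paste(seqs, collapse = gap)
nchar(concat) ## 6 + 100 + 6 = 112
```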

fasta.save

A path to a file in which to store the final FASTA sequence.

return

Set to 'fasta' to return the FASTA sequence as a DNAStringSet-class object, or to 'index' to return the region index as a GRanges-class object (Default : 'fasta').

Value

A DNAStringSet-class object with the exported sequence or, if 'return' is set to 'index', a GRanges-class object with the region index.

Author(s)

David Porubsky

Examples

## Load the package ##
library(SVbyEye)
## Get PAF to process ##
paf.file <- system.file("extdata", "test4.paf", package = "SVbyEye")
## Read in PAF
paf.table <- readPaf(paf.file = paf.file, include.paf.tags = TRUE, restrict.paf.tags = "cg")
## Get FASTA using query alignment coordinates ##
## Define assembly FASTA to get the sequence from
asm.fasta <- system.file("extdata", "test4_query.fasta", package = "SVbyEye")
paf2FASTA(paf.table = paf.table, alignment.space = "query", asm.fasta = asm.fasta)

## Get FASTA using target alignment coordinates ##
## Define BSgenome object to get the sequence from
## (requires the BSgenome.Hsapiens.UCSC.hg38 package to be installed)
paf2FASTA(
    paf.table = paf.table, alignment.space = "target",
    bsgenome = BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38
)
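The examples above use the default return value. The following sketch combines two other documented arguments, 'fasta.save' and 'return', on the same test data. It assumes the two can be used together in one call; the temporary output path is chosen here for illustration, and the call is guarded so that it only runs when SVbyEye is installed.

```r
## Temporary output path for the exported FASTA (illustrative choice)
out.fasta <- tempfile(fileext = ".fasta")

if (requireNamespace("SVbyEye", quietly = TRUE)) {
    library(SVbyEye)
    paf.file <- system.file("extdata", "test4.paf", package = "SVbyEye")
    paf.table <- readPaf(paf.file = paf.file, include.paf.tags = TRUE, restrict.paf.tags = "cg")
    asm.fasta <- system.file("extdata", "test4_query.fasta", package = "SVbyEye")
    ## Request the region index instead of the sequence and
    ## ask for the FASTA to be written to 'out.fasta'
    fa.index <- paf2FASTA(
        paf.table = paf.table, alignment.space = "query",
        asm.fasta = asm.fasta, fasta.save = out.fasta, return = "index"
    )
    fa.index
}
```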


daewoooo/SVbyEye documentation built on Oct. 15, 2024, 6:12 a.m.