extractBuscos: Utility for extracting BUSCO sequences into individual fasta...
In carolinafishes/toast: Transcriptome Ortholog Alignment Sequence Tools

extractBuscos

R Documentation

Utility for extracting BUSCO sequences into individual fasta files from multiple directories.

Description

Utility for extracting BUSCO sequences into individual fasta files from multiple directories.

Usage

extractBuscos(
  tsvLocations,
  fastaLocations,
  ed,
  SampleIDs,
  complete = TRUE,
  fragmented = TRUE,
  threshold = 300,
  duplicated = TRUE
)

Arguments

`tsvLocations`	the paths to the tsv files produced by a BUSCO analysis.
`fastaLocations`	the paths to the original fasta sequence files.
`ed`	the directory where extracted sequences will be written.
`SampleIDs`	the sample names used to label the extracted sequences (written as `sampleIDs` in earlier versions of this page; the Usage section gives `SampleIDs`).
`complete`	whether or not to include complete sequences.
`fragmented`	whether or not to include fragmented sequences.
`threshold`	minimum number of base pairs required for a fragmented sequence to be extracted. Not used when fragmented = FALSE.
`duplicated`	whether or not to include duplicated sequences.
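
The way complete, fragmented, threshold, and duplicated combine can be illustrated with a small sketch. This is not the package's implementation: `select_busco_rows` is a hypothetical helper, the column names `Status` and `Length` and the status labels are assumptions modeled on a BUSCO results table, and treating the threshold as inclusive (`>=`) is also an assumption.

```r
# Hypothetical helper, not part of toast: decide which rows of a BUSCO
# results table would be kept under the four filtering arguments.
select_busco_rows <- function(tbl, complete = TRUE, fragmented = TRUE,
                              threshold = 300, duplicated = TRUE) {
  keep <- rep(FALSE, nrow(tbl))
  if (complete)   keep <- keep | tbl$Status == "Complete"
  if (duplicated) keep <- keep | tbl$Status == "Duplicated"
  if (fragmented) {
    # The length threshold only applies to fragmented sequences.
    keep <- keep | (tbl$Status == "Fragmented" & tbl$Length >= threshold)
  }
  tbl[keep, , drop = FALSE]
}

# Toy table with made-up values, for illustration only.
toy <- data.frame(
  BuscoID = c("b1", "b2", "b3", "b4", "b5"),
  Status  = c("Complete", "Fragmented", "Fragmented", "Duplicated", "Missing"),
  Length  = c(900, 450, 120, 800, NA),
  stringsAsFactors = FALSE
)

kept <- select_busco_rows(toy, threshold = 300)
```

In this toy table the short fragmented entry and the missing entry are dropped, while the complete, duplicated, and sufficiently long fragmented entries are kept.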

Value

This function uses the output of a BUSCO analysis and the specified fasta files to extract BUSCO sequences and write them into fasta files. The fasta files are written into the directory specified by the parameter ed. Each sequence is named according to the sample ID specified for it.

Author(s)

Alex Dornburg, dornburgalex@gmail.com

Phillip Souza, psouza1@uncc.edu

Examples

extractBuscos(
  tsvLocations = c("path/to/first/tsvfile", "path/to/second/tsvfile", ...),
  fastaLocations = c("path/to/first/fastafile", "path/to/second/fastafile", ...),
  ed = "path/to/extracted/",
  SampleIDs = c("Genus_species", "Othergenus_otherspecies", ...),
  complete = TRUE,
  fragmented = TRUE,
  threshold = 300,
  duplicated = TRUE
)
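
Because the example passes the tsv paths, fasta paths, and sample names in matching order, it may help to check that the three vectors line up before calling the function. The sketch below assumes they are meant to be parallel (one entry per sample), which the example suggests but the page does not state; the variable names are illustrative and the paths are the placeholders from the example.

```r
# Placeholder inputs taken from the example above; replace with real paths.
tsv_files   <- c("path/to/first/tsvfile", "path/to/second/tsvfile")
fasta_files <- c("path/to/first/fastafile", "path/to/second/fastafile")
sample_ids  <- c("Genus_species", "Othergenus_otherspecies")

# Assumption: the three vectors are parallel, one entry per sample.
stopifnot(
  length(tsv_files) == length(fasta_files),
  length(tsv_files) == length(sample_ids),
  !anyDuplicated(sample_ids)
)

# Side-by-side view, useful for spotting mismatched rows by eye.
inputs <- data.frame(
  sample = sample_ids,
  tsv    = tsv_files,
  fasta  = fasta_files,
  stringsAsFactors = FALSE
)
print(inputs)
```

If the check passes, the three vectors can be passed on to extractBuscos as in the example.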

carolinafishes/toast documentation built on April 12, 2025, 10:41 a.m.