geneFamilies2fasta: Write gene families to files
In larssnip/micropan: Microbial Pan-Genome Analysis

geneFamilies2fasta

R Documentation

Write gene families to files

Description

Writes specified gene families to separate fasta files.

Usage

geneFamilies2fasta(
  pangene.tbl,
  fasta.folder,
  out.folder,
  file.ext = "fasta$|faa$|fna$|fa$",
  verbose = TRUE
)

Arguments

`pangene.tbl`	A table listing gene families (clusters).
`fasta.folder`	The folder containing the fasta files with all sequences.
`out.folder`	The folder to write to.
`file.ext`	The file extension to recognize the fasta files in `fasta.folder`.
`verbose`	Logical to allow text output during processing.
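The default `file.ext` reads as a regular expression with four alternatives, each anchored to the end of the file name. A minimal sketch of which names it would pick up, assuming the pattern is applied to file names as an ordinary regular expression (the file names below are made up for illustration):

```r
# Default value of the file.ext argument
file.ext <- "fasta$|faa$|fna$|fa$"

# Hypothetical file names, not taken from the package
files <- c("genome1.faa", "genome2.fna.gz", "notes.txt", "genome3.fasta")

# TRUE where the name ends in one of the listed extensions
is.fasta <- grepl(file.ext, files)
print(files[is.fasta])
```

Under this assumption a compressed file such as `genome2.fna.gz` is not matched, because the pattern requires the name to end in the extension itself.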

Details

The argument `pangene.tbl` should be produced by `extractPanGenes` so that it contains the columns `cluster`, `seq_tag` and `N_genomes` required by this function. The files in `fasta.folder` must have been prepared by `panPrep` so that they carry the proper sequence tag information. They may contain either protein sequences or DNA sequences.

If you have already added the `Header` and `Sequence` columns to `pangene.tbl`, these will be used instead of reading the files in `fasta.folder`, but a warning is issued.
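The column requirements above can be checked before calling the function. The following is only a sketch in base R, not part of micropan: `toy.tbl` is a hypothetical stand-in for a table returned by `extractPanGenes`, and its values are made up.

```r
# Columns the Details section lists as required
required.cols <- c("cluster", "seq_tag", "N_genomes")
# Optional columns that, if present, are used instead of reading fasta.folder
sequence.cols <- c("Header", "Sequence")

# Hypothetical stand-in for a table from extractPanGenes (made-up values)
toy.tbl <- data.frame(
  cluster   = c(1, 1, 2),
  seq_tag   = c("GID1_seq1", "GID2_seq4", "GID1_seq2"),
  N_genomes = c(2, 2, 1)
)

# Stop early if a required column is missing
missing.cols <- setdiff(required.cols, colnames(toy.tbl))
if (length(missing.cols) > 0) {
  stop("pangene.tbl lacks column(s): ", paste(missing.cols, collapse = ", "))
}

# TRUE only if both Header and Sequence are already in the table
has.sequences <- all(sequence.cols %in% colnames(toy.tbl))
print(has.sequences)
```

Here `has.sequences` is `FALSE`, which corresponds to the case where the sequences would be read from the files in `fasta.folder`.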

Author(s)

Lars Snipen.

Examples

# Loading clustering data in this package
data(xmpl.bclst)

# Finding genes in 1,..,5 genomes (all genes)
all.tbl <- extractPanGenes(xmpl.bclst, N.genomes = 1:5)

## Not run: 
# All protein fasta files are in a folder named faa, and we write to the current folder:
geneFamilies2fasta(all.tbl, fasta.folder = "faa", out.folder = ".")

# Use the pipe and write to the folder "orfans"
extractPanGenes(xmpl.bclst, N.genomes = 1) %>% 
  geneFamilies2fasta(fasta.folder = "faa", out.folder = "orfans")

## End(Not run)

larssnip/micropan documentation built on April 16, 2022, 8:49 p.m.