geneFamilies2fasta: Write gene families to files

View source: R/extractPanGenes.R

geneFamilies2fasta R Documentation

Write gene families to files

Description

Writes specified gene families to separate fasta files.

Usage

geneFamilies2fasta(
  pangene.tbl,
  fasta.folder,
  out.folder,
  file.ext = "fasta$|faa$|fna$|fa$",
  verbose = TRUE
)

Arguments

pangene.tbl

A table listing gene families (clusters).

fasta.folder

The folder containing the fasta files with all sequences.

out.folder

The folder to write to.

file.ext

A regular expression matching the file extensions used to recognize the fasta files in fasta.folder.

verbose

Logical to allow text output during processing.
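
As an illustration of the default file.ext value, the sketch below applies it to a few made-up file names. It assumes the pattern is used as an ordinary regular expression on file names; how geneFamilies2fasta applies it internally is not stated here, and the file names are hypothetical.

```r
# Hypothetical file names, for illustration only
files <- c("GID1.faa", "GID2.fasta", "notes.txt", "GID3.fna.gz")

# The default value of file.ext: each alternative is anchored to the end of the name
file.ext <- "fasta$|faa$|fna$|fa$"

# Keep the names ending in one of the listed extensions
matched <- files[grepl(file.ext, files)]
print(matched)
```

Under this assumption a compressed file such as GID3.fna.gz is not matched, since its name does not end in one of the listed extensions.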

Details

The argument pangene.tbl should be produced by extractPanGenes in order to contain the columns cluster, seq_tag and N_genomes required by this function. The files in fasta.folder must have been prepared by panPrep in order to have the proper sequence tag information. They may contain protein sequences or DNA sequences.

If you have already added the Header and Sequence columns to pangene.tbl, these will be used instead of reading the files in fasta.folder, but a warning is issued.
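
A minimal sketch of how the column requirements above could be checked before calling the function. The toy table and its values are made up for illustration (a real table comes from extractPanGenes), and the names required, missing.cols and has.seqs are not part of the package.

```r
# Toy table with made-up values; a real one is produced by extractPanGenes()
pangene.tbl <- data.frame(
  cluster   = c(1, 1, 2),
  seq_tag   = c("GID1_seq1", "GID2_seq4", "GID1_seq2"),
  N_genomes = c(2, 2, 1)
)

# Columns named in the Details section as required
required <- c("cluster", "seq_tag", "N_genomes")
missing.cols <- setdiff(required, colnames(pangene.tbl))
if (length(missing.cols) > 0) {
  stop("pangene.tbl lacks column(s): ", paste(missing.cols, collapse = ", "))
}

# TRUE only if both optional columns are already present in the table
has.seqs <- all(c("Header", "Sequence") %in% colnames(pangene.tbl))
print(has.seqs)
```

Here has.seqs is FALSE, which corresponds to the case where the sequences would be read from the files in fasta.folder.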

Author(s)

Lars Snipen.

See Also

extractPanGenes, writeFasta.

Examples

# Loading clustering data in this package
data(xmpl.bclst)

# Finding genes in 1,..,5 genomes (all genes)
all.tbl <- extractPanGenes(xmpl.bclst, N.genomes = 1:5)

## Not run: 
# All protein fasta files are in a folder named faa, and we write to the current folder:
geneFamilies2fasta(all.tbl, fasta.folder = "faa", out.folder = ".")

# Use the pipe (%>%, e.g. from magrittr) and write to the folder "orfans"
extractPanGenes(xmpl.bclst, N.genomes = 1) %>% 
  geneFamilies2fasta(fasta.folder = "faa", out.folder = "orfans")

## End(Not run)


larssnip/micropan documentation built on April 16, 2022, 8:49 p.m.