View source: R/extractPanGenes.R
geneFamilies2fasta | R Documentation |
Writes specified gene families to separate fasta files.
geneFamilies2fasta( pangene.tbl, fasta.folder, out.folder, file.ext = "fasta$|faa$|fna$|fa$", verbose = TRUE )
pangene.tbl |
A table listing gene families (clusters). |
fasta.folder |
The folder containing the fasta files with all sequences. |
out.folder |
The folder to write to. |
file.ext |
A regular expression matching the file extensions used to recognize the fasta files in fasta.folder. |
verbose |
Logical to allow text output during processing. |
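As a rough illustration of how the default file.ext pattern behaves, the sketch below applies it as a regular expression to a few file names with base R's grepl. The file names are hypothetical, and how geneFamilies2fasta applies the pattern internally is an assumption not confirmed by this page.

```r
# Default pattern from the usage above: "|" separates alternatives,
# "$" anchors each alternative at the end of the file name
file.ext <- "fasta$|faa$|fna$|fa$"

# Hypothetical file names, for illustration only
files <- c("GID1.faa", "GID2.fasta", "GID3.fna.gz", "notes.txt", "GID4.fa")

# TRUE where the name ends with one of the recognized extensions
is.fasta <- grepl(file.ext, files)
fasta.files <- files[is.fasta]
print(fasta.files)
```

Note that the default pattern is not anchored to a dot, so any file name ending in "fa" would also match, and a compressed file such as "GID3.fna.gz" would not.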
The argument pangene.tbl should be produced by extractPanGenes in order to contain the columns cluster, seq_tag and N_genomes required by this function. The files in fasta.folder must have been prepared by panPrep in order to have the proper sequence tag information. They may contain protein sequences or DNA sequences.
If you already added the Header and Sequence information to pangene.tbl, these will be used instead of reading the files in fasta.folder, but a warning is issued.
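A minimal sketch of how one might check a table against these requirements before calling the function, using base R only. The data frame below is a made-up stand-in for the output of extractPanGenes, and the check itself is an illustration, not the package's own validation code.

```r
# Hypothetical stand-in for a table produced by extractPanGenes
# (all values are made up for illustration)
pangene.tbl <- data.frame(
  cluster   = c(1, 1, 2),
  seq_tag   = c("GID1_seq1", "GID2_seq4", "GID1_seq2"),
  N_genomes = c(2, 2, 1),
  stringsAsFactors = FALSE
)

# Columns this help page lists as required
required <- c("cluster", "seq_tag", "N_genomes")
missing.cols <- setdiff(required, colnames(pangene.tbl))
if (length(missing.cols) > 0) {
  stop("pangene.tbl lacks column(s): ", paste(missing.cols, collapse = ", "))
}

# Per the text above, Header and Sequence columns (if both are present)
# are used instead of reading the files in fasta.folder
has.seqs <- all(c("Header", "Sequence") %in% colnames(pangene.tbl))
```

Here has.seqs is FALSE, so under the behaviour described above the sequences would be read from the files in fasta.folder.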
Lars Snipen.
extractPanGenes, writeFasta.
# Loading clustering data in this package
data(xmpl.bclst)

# Finding genes in 1,..,5 genomes (all genes)
all.tbl <- extractPanGenes(xmpl.bclst, N.genomes = 1:5)

## Not run:
# All protein fasta files are in a folder named faa, and we write to the current folder:
geneFamilies2fasta(all.tbl, fasta.folder = "faa", out.folder = ".")

# use pipe, write to folder "orfans"
extractPanGenes(xmpl.bclst, N.genomes = 1) %>%
  geneFamilies2fasta(fasta.folder = "faa", out.folder = "orfans")

## End(Not run)