core_genome | R Documentation |
Finds and creates a core-genome alignment. Unlike core_plots(), this function finds the hard core-genome: genes present in 100% of the genomes and without repetitions (i.e. without paralogs). The function takes an mmseqs() output, so the definition of the orthologous genes of the core-genome (similarity, coverage and/or e-value) depends on the mmseqs() parameters.
core_genome(data, type, n_cores, method = "fast")
data |
An mmseqs object |
type |
Type of sequence 'nucl' or 'prot' |
n_cores |
Number of CPU cores to use |
method |
fast (based on blast) or accurate (based on mafft) |
The function can perform a pseudo-MSA for each ortholog using the result2msa function of mmseqs2. This approach is much faster than a classical MSA (Clustal, mafft or muscle) but is less accurate. Because most phylogenetic inference software only uses variant columns without insertions or deletions, there are not too many differences in the final phylogenetic trees.
However, core_genome also implements an accurate method that uses mafft to build an MSA of each gene cluster.
core_genome() can build a core-genome alignment of thousands of genomes in minutes.
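The remark above about variant columns without insertions or deletions can be illustrated with a minimal base-R sketch. It is not part of the package and does not reproduce what any particular phylogenetic program does internally; the toy alignment `aln` and the names `mat`, `keep` and `reduced` are hypothetical. It only shows that columns containing a gap, or a single base across all genomes, are dropped before inference, which is why alignment differences confined to gapped regions have limited effect.

```r
# Toy alignment: one aligned sequence per genome (hypothetical data).
aln <- c(g1 = "ATG-CA", g2 = "ATGACA", g3 = "TTGACG")

# Split each sequence into characters: rows = genomes, columns = positions.
mat <- do.call(rbind, strsplit(aln, ""))

# Keep a column only if it has no gap and more than one distinct base.
keep <- apply(mat, 2, function(col) {
  !any(col == "-") && length(unique(col)) > 1
})

# Rebuild the reduced alignment from the retained columns.
reduced <- apply(mat[, keep, drop = FALSE], 1, paste, collapse = "")
print(reduced)
```

In this toy case only the first and last positions are retained; the gapped fourth position is discarded regardless of how it was aligned.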
A core_genome object (a data.frame with two columns: fasta header and sequence)
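The hard core-genome criterion described above (present in every genome, exactly once) can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: the `genes` table, its `genome` and `cluster` columns, and the cluster labels are hypothetical and do not reflect the internal structure of an mmseqs object.

```r
# Hypothetical table: one row per gene, with its genome and ortholog cluster.
genes <- data.frame(
  genome  = c("g1", "g2", "g3", "g1", "g2", "g2", "g3", "g1", "g2"),
  cluster = c("c1", "c1", "c1", "c2", "c2", "c2", "c2", "c3", "c3"),
  stringsAsFactors = FALSE
)

# Count gene copies per cluster (rows) and genome (columns).
counts <- table(genes$cluster, genes$genome)

# Hard core: exactly one copy in every genome (no absences, no paralogs).
hard_core <- rownames(counts)[apply(counts == 1, 1, all)]
print(hard_core)
```

Here cluster c2 is excluded because genome g2 carries two copies (a paralog), and c3 is excluded because it is absent from g3, so only c1 qualifies.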