View source: R/clean_alignment.r
Several lineages in the MalAvi database differ by ambiguous base calls only (e.g., "N" or "Y") and thus represent repeated haplotypes. For phylogenetic analysis it might make sense to only include one representative of any repeated haplotype because there is no way to know if they represent one or two lineages. This function identifies such repeated haplotypes in an alignment and randomly selects one of their lineages to be representative of the haplotype. Using this selection, the function subsets the alignment so that all haplotypes are only represented once.
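The idea described above can be illustrated with a small base R sketch. This is not the package's implementation: the helper `differ_by_ambiguity_only`, the toy sequence names, and the grouping strategy are all assumptions made for illustration. In particular, the sketch treats every non-ACGT character as a wildcard, which is looser than checking true IUPAC compatibility (for example, "Y" strictly stands for C or T only).

```r
# Hypothetical helper (not part of the package): TRUE if two equal-length
# sequences agree at every position where both have an unambiguous base.
differ_by_ambiguity_only <- function(a, b) {
  x <- strsplit(toupper(a), "")[[1]]
  y <- strsplit(toupper(b), "")[[1]]
  bases <- c("A", "C", "G", "T")
  unambiguous <- x %in% bases & y %in% bases
  all(x[unambiguous] == y[unambiguous])
}

# Toy alignment with made-up names and sequences (not MalAvi data)
seqs <- c(LIN1 = "ACGTACGT", LIN2 = "ACGTNCGT",
          LIN3 = "ACGTACGA", LIN4 = "ACGYACGT")

# Put each sequence in the group of the first earlier sequence it matches
group <- seq_along(seqs)
for (i in seq_along(seqs)) {
  for (j in seq_len(i - 1)) {
    if (differ_by_ambiguity_only(seqs[[i]], seqs[[j]])) {
      group[i] <- group[j]
      break
    }
  }
}

# Haplotypes represented by more than one lineage
haplotypes <- split(names(seqs), group)
repeated <- haplotypes[lengths(haplotypes) > 1]

# Randomly keep one lineage per repeated haplotype and drop the others
set.seed(1)
selected <- vapply(repeated, function(nms) sample(nms, 1), character(1))
dropped <- setdiff(unlist(repeated), selected)
seqs_clean <- seqs[!names(seqs) %in% dropped]
```

In this toy case LIN1, LIN2 and LIN4 collapse into one repeated haplotype, so `seqs_clean` keeps one of them plus LIN3. Because the choice is random, which lineage is kept can change between runs unless a seed is set.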
clean_alignment(alignment, separate_by_genus = FALSE, haplotype_format_wide = TRUE)
alignment | a DNA sequence alignment of class
separate_by_genus | if the alignment is a MalAvi alignment with uncleaned sequence names (see details), you can output the cleaned alignments separately by parasite genus by setting this to TRUE
haplotype_format_wide | whether the lineage names associated with each repeated haplotype should be returned in wide format (TRUE) or long format (FALSE)
In a MalAvi alignment the default sequence (i.e., lineage) names carry extra information and typically begin with a letter that indicates the parasite genus. This information can be used to separate the alignments by parasite genus if separate_by_genus is set to TRUE.
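As a rough illustration of how a leading genus letter could be used to group sequences, the base R sketch below splits a vector of names by their first character. The names and their format are made up for this example and do not reproduce real MalAvi sequence names, and the function's own approach may differ.

```r
# Toy sequence names (hypothetical; real MalAvi names carry more information)
seq_names <- c("H_EXAMPLE1", "P_EXAMPLE2", "H_EXAMPLE3", "L_EXAMPLE4")

# Take the first character of each name as the genus indicator
genus_letter <- substr(seq_names, 1, 1)

# Group the names by that letter; each list element holds one genus
names_by_genus <- split(seq_names, genus_letter)
```

Each element of `names_by_genus` could then be used to subset an alignment to the sequences of one genus.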
Returns a list composed of the following elements:
repeated_haplotypes | A data frame (in wide or long format) of repeated haplotypes and associated sequence (lineage) names
selected_lineages | A vector of randomly selected sequence (lineage) names chosen to represent each repeated haplotype
alignment_clean | A sequence alignment of class
Vincenzo A. Ellis vincenzoaellis@gmail.com
## load the long seqs alignment from MalAvi then clean it
long.seqs <- extract_alignment("long seqs")
long.seqs.clean <- clean_alignment(long.seqs)
long.seqs.clean