alignCodingSequencesPipeline | R Documentation |
Runs the pipeline to align a set of coding sequences. It first translates them, then validates the translations for premature stop codons, then generates a multiple sequence alignment (MSA) of the amino acid (AA) sequences, and finally uses this AA MSA as a guide to align the coding sequences.
alignCodingSequencesPipeline(cds, work.dir, gene.group.name)
cds |
an instance of |
work.dir |
the working directory to use and in which to save the relevant files |
gene.group.name |
a string being used to name the output files written into work.dir. Could be something like 'fam1234'. |
The ALIGNED and validated coding sequences as an instance of base::list, as generated by seqinr::read.fasta, or nothing if validation discards the rest of 'cds'.
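The translation and validation steps described above can be illustrated with a small base R sketch. It is not the package's implementation, which presumably relies on seqinr. The names codon.table, translateCds and hasPrematureStop are hypothetical, and the sketch assumes the standard genetic code and sequences given as plain character strings in reading frame 1.

```r
# Assumption: standard genetic code, codons ordered with bases T, C, A, G
# and the third codon position varying fastest.
bases <- c("T", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codon.table <- setNames(
  strsplit("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",
           "")[[1]],
  paste0(grid$first, grid$second, grid$third)
)

# Hypothetical helper: translate one coding sequence (a single string).
# Unknown codons become "X"; trailing bases that do not fill a codon are ignored.
translateCds <- function(dna) {
  dna <- toupper(dna)
  n <- nchar(dna) %/% 3
  if (n == 0) return("")
  starts <- seq(1, by = 3, length.out = n)
  aa <- codon.table[substring(dna, starts, starts + 2)]
  aa[is.na(aa)] <- "X"
  paste(aa, collapse = "")
}

# Hypothetical helper: TRUE if a stop ("*") occurs anywhere except as the
# very last residue of the AA sequence.
hasPrematureStop <- function(aa) {
  grepl("*", sub("\\*$", "", aa), fixed = TRUE)
}

# Toy input: geneB carries a stop codon in its second position.
cds <- list(geneA = "ATGGCTTGGTAA", geneB = "ATGTAAGCTTGG", geneC = "ATGAAAGGG")
aas <- vapply(cds, translateCds, character(1))
valid.cds <- cds[!hasPrematureStop(aas)]
```

In this toy input only geneB is discarded, so valid.cds keeps geneA and geneC.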
Sanitize the gene identifiers:
Convert to AA and align the AA-sequences:
Remove invalid AA-Sequences, i.e. AA-Seqs with premature stop-codons:
Warn about removed AA-Seqs:
If only a single sequence is left, we're done:
Write out the sanitized amino acid seqs:
Generate a multiple sequence alignment:
Use the aligned AA-Seqs as guide to align the CDS Sequences:
Return the CDS MSA using the ORIGINAL gene identifiers:
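The guide step listed above, in which the aligned AA sequences drive the alignment of the coding sequences, can be sketched in base R as follows. This is an illustration of the general idea only, not the code used by alignCodingSequencesPipeline. The function name alignCdsByAaGuide is hypothetical, and the sketch assumes that gaps are written as "-" and that each unaligned CDS starts in frame with its AA sequence.

```r
# Hypothetical helper: thread one unaligned CDS through its aligned AA sequence.
# Every AA gap becomes a codon-sized gap ("---"); every residue consumes the
# next codon of the CDS.
alignCdsByAaGuide <- function(aa.aligned, cds) {
  aa.chars <- strsplit(aa.aligned, "")[[1]]
  is.residue <- aa.chars != "-"
  n.res <- sum(is.residue)
  # The CDS must provide one full codon per aligned residue.
  stopifnot(nchar(cds) >= 3 * n.res)
  out <- rep("---", length(aa.chars))
  if (n.res > 0) {
    starts <- seq(1, by = 3, length.out = n.res)
    out[is.residue] <- substring(cds, starts, starts + 2)
  }
  paste(out, collapse = "")
}

# Toy AA alignment and the matching unaligned coding sequences.
aa.msa <- c(geneA = "MA-W", geneC = "M-KG")
cds <- c(geneA = "ATGGCTTGG", geneC = "ATGAAAGGG")
cds.msa <- mapply(alignCdsByAaGuide, aa.msa, cds[names(aa.msa)])
```

Because gaps are only ever inserted as whole codons, the reading frame of every sequence is preserved and all rows of cds.msa have three times the width of the AA alignment.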