translate.msa: Get amino acid sequences from an alignment
In rphast: Interface to 'PHAST' Software for Comparative Genomics


Get amino acid sequences from an alignment

translate.msa(m, one.frame = TRUE, frame = 1)

`m`	An object of type `msa` representing the alignment. The alignment is assumed to be coding sequence, already in frame.
`one.frame`	A logical value indicating whether to use the same frame for all species in the alignment, or a separate frame for each species. If `one.frame==TRUE`, every three columns of the alignment are translated as one codon, regardless of gaps within the alignment. If `one.frame==FALSE`, gaps shift the frame in the species where they occur. In this case, the sequences returned may not all have the same length.
`frame`	An integer specifying an offset from the first column of the alignment where the coding region starts. The default, 1, means translation starts at the first column. If `one.frame==FALSE`, `frame` can be a vector of integers, one for each species. Otherwise it should be a single value.
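The effect of `one.frame` and `frame` on how columns are grouped into codons can be sketched in base R. This is only an illustration of the behavior described above, not the rphast implementation: `split_codons` and its `per.species` argument are hypothetical names, and the sketch assumes that the `frame` offset is applied in alignment columns before any gaps are dropped.

```r
# Hypothetical helper (not part of rphast): split one aligned sequence
# into complete codons, starting at column `frame`.
split_codons <- function(s, frame = 1, per.species = FALSE) {
  # Assumption: the offset counts alignment columns, so apply it first.
  s <- substring(s, frame)
  # Per-species mode (like one.frame=FALSE): drop gaps so that they
  # shift the reading frame of this sequence.
  if (per.species) s <- gsub("-", "", s, fixed = TRUE)
  n <- nchar(s) %/% 3                      # number of complete codons
  if (n == 0) return(character(0))
  starts <- seq(1, by = 3, length.out = n)
  substring(s, starts, starts + 2)         # trailing partial codon is ignored
}

split_codons("NNATGGCCACG")                      # shared frame, offset 1
split_codons("NNATGGCCACG", frame = 3)           # skip the two leading N's
split_codons("AT--GGCCACG")                      # gaps stay inside codons
split_codons("AT--GGCCACG", per.species = TRUE)  # gaps removed, frame shifts
```

With a shared frame, the gapped sequence yields codons such as `AT-` and `-GG`; in per-species mode the same sequence reads `ATG`, `GCC`, `ACG`, which is why the returned sequences can differ in length.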

A vector of character strings representing the translated alignment. The characters are amino acid codes, with '$' representing a stop codon, '*' denoting missing data or a codon with 1 or 2 gaps, and '-' denoting a codon consisting entirely of gaps.
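The symbol conventions above can be illustrated with a small base R lookup. This is a sketch under stated assumptions rather than the code rphast uses: `codon_symbol` and the table `aa` are hypothetical names, and the standard genetic code is assumed.

```r
# Build a codon -> amino acid table for the standard genetic code
# (assumption), using '$' for stop codons as in the convention above.
bases <- c("T", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)   # third position varies fastest
aa <- strsplit(paste0("FFLLSSSSYY$$CC$W", "LLLLPPPPHHQQRRRR",
                      "IIIMTTTTNNKKSSRR", "VVVVAAAADDEEGGGG"), "")[[1]]
names(aa) <- paste0(grid$first, grid$second, grid$third)

# Hypothetical helper: map a single three-character codon to one symbol.
codon_symbol <- function(codon) {
  codon <- toupper(codon)
  if (codon == "---") return("-")                        # codon of all gaps
  if (codon %in% names(aa)) return(unname(aa[codon]))    # amino acid or '$'
  "*"                                    # missing data or 1-2 gaps in codon
}

paste(sapply(c("NNA", "TGG", "CCA"), codon_symbol), collapse = "")
paste(sapply(c("AT-", "-GG", "CCA"), codon_symbol), collapse = "")
```

Any codon that is neither all gaps nor a complete A/C/G/T triplet falls through to '*', which covers both missing data such as `N` and partially gapped codons.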

Melissa J. Hubisz

library(rphast)

# here is a little portion of the SOL1 gene
seqs <- c("ATGGCGACGAAGGCCGTGTGCGTGCTGAAGGGCGACGGCCCAGTGCAGG
           GCATCATCAATTTCGAGCAGAAGGCAAGGGCTGGGACGGAGGCTTGTTT
           GCGAGGCCGCTCCCACCCGCTCGTCCCCCCGCGCACCTTTGCTAGGAGC
           GGGTCGC----CCGCCAGGC-CTCGGGGCCGCCCTGGTCCAGCGCCCGG
           TCCCGGCCCGTGCCGCCCGGTCGGTGCCTTCGCCCCCAGCGGTGCGGTG
           CCCAAGTGCTGAGTCACCGGGCGGGCCCGGGC----GCGGGGCGTGGGA
           ---------CCGAGGCCGCCGCGGG",
          "ATGGCGACGAAGGCCGTGTGCGTGCTGAAGGGCGATGGCCCAGTGCAGG
           GCATCATCAATTTCGAGCAGAAGGCAAGGGCTGGGACGGAGGCTTGTTT
           GCGAGGCCGCTCCTACCCGCTCGTCCCCCCGCGCACCTTTGCTAGGAGC
           GGGTCGC----CCGCCAGGC-CTCGGGGCTGCCCTGGTCCAGCGCCCGG
           TCCCGGCCCGTGCCGCCCGGTCGGTGCCTTCGCCCCCAGCGGTGCGGTG
           CCCAAGTGCTGAGTCACCGGGCGGGCCCGGGC----GCGGGGTGTGGGA
           ---------CCGAGGCCGCCGCGGG",
          "ATGGCGATGAAAGCGGTGTGCGTGCTGAAGGGCGACGGTCCGGTGCAGG
           GAACCATCCACTTCGAGCAGAAGGCAAGGCCCGGGGC------------
           ----------------------------------------GCGGGGCGC
           AGGCCGCGGTGACGCGGCGCACCTGTGCGGGAGCACGCCACGCCCCCG-
           CCACGGCCTGAG----------------------CCCG-----------
           -CTAAGTGCTGAGTCACC--GTGGCCTGGGGCAGGGGCTGGGCGCCGGG
           AAGCGAGGCCCGGGGC-GCCGC***")
seqs <- gsub("\\s", "", seqs)  # remove whitespace from seqs
align <- msa(seqs, names=c("hg19", "panTro2", "mm9"))

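# translate all species in one shared frame (the default)
translate.msa(align)
# the leading N's make the first codon missing data
translate.msa(msa(c("NNATGGCCACG")))
# start translating at the third column instead
translate.msa(msa(c("NNATGGCCACG")), frame=3)
# shared frame: gaps in the second sequence fall inside codons
translate.msa(msa(c("NNATGGCCACG", "AT--GGCCACG")))
# separate frames: gaps shift the frame in the second sequence
translate.msa(msa(c("NNATGGCCACG", "AT--GGCCACG")), one.frame=FALSE)
# separate frames with a different starting offset for each sequence
translate.msa(msa(c("NNATGGCCACG", "AT--GGCCACG")), one.frame=FALSE, frame=c(3,1))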

[1] "MATKAVCVLKGDGPVQGIINFEQKARAGTEACLRGRSHPLVPPRTFARSGS**RQ*LGAALVQRPVPARAARSVPSPPAVRCPSAESPGGPG**RGVG---PRPPR"
[2] "MATKAVCVLKGDGPVQGIINFEQKARAGTEACLRGRSYPLVPPRTFARSGS**RQ*LGAALVQRPVPARAARSVPSPPAVRCPSAESPGGPG**RGVG---PRPPR"
[3] "MAMKAVCVLKGDGPVQGTIHFEQKARPG*-----------------AGRRPR$RGAPVREHATPP*TA$*------*P----LSAES**GLGQGLGAGKRGPG*P*"
[1] "*WP"
[1] "MAT"
[1] "*WP" "**P"
[1] "*WP" "MAT"
[1] "MAT" "MAT"