translate.msa: Get amino acid sequences from an alignment


View source: R/msa.R

Description

Get amino acid sequences from an alignment

Usage

translate.msa(m, one.frame = TRUE, frame = 1)

Arguments

m

An object of type msa representing the alignment. The alignment is assumed to be coding sequence, already in frame.

one.frame

A logical value indicating whether to use the same frame for all species in the alignment, or a separate frame for each species. If one.frame==TRUE, every three columns of the alignment are translated as one codon, regardless of gaps within the alignment. If one.frame==FALSE, gaps shift the frame in the species where they occur; in this case, the returned sequences may not all be the same length.
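
As a rough illustration of the two modes, the base R sketch below groups the sequence "AT--GGCCACG" from the Examples into codons both ways. The helper split_codons is hypothetical and not part of rphast. Treating one.frame==FALSE as "drop the gaps, then group" is an assumption that agrees with the example output on this page; it is not a description of the package's internals.

```r
# Hypothetical helper (not an rphast function): cut a string into complete
# triplets from the first character, dropping an incomplete trailing codon.
split_codons <- function(s) {
  n_codons <- nchar(s) %/% 3
  if (n_codons < 1) return(character(0))
  starts <- 3 * seq_len(n_codons) - 2
  substring(s, starts, starts + 2)
}

aligned <- "AT--GGCCACG"

# Shared frame: alignment columns are grouped as-is, gaps included.
shared_frame <- split_codons(aligned)
print(shared_frame)   # "AT-" "-GG" "CCA"

# Per-species frame (assumed behaviour): gaps are removed before grouping.
own_frame <- split_codons(gsub("-", "", aligned, fixed = TRUE))
print(own_frame)      # "ATG" "GCC" "ACG"
```

The first grouping corresponds to the "**P" result in the example output (two codons with gaps, then CCA), and the second to "MAT".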

frame

An integer specifying an offset from the first column of the alignment where the coding region starts. The default 1 means start at the beginning. If one.frame==FALSE, frame can be a vector of integers, one for each species. Otherwise it should be a single value.
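
The effect of the offset can be sketched in base R without rphast. The function codons_from below is a hypothetical helper, and it assumes that frame is a 1-based column index, which is how the "default 1 means start at the beginning" wording reads.

```r
# Hypothetical helper (not an rphast function): start reading at column
# `frame` (1-based) and return the complete triplets found from there on.
codons_from <- function(s, frame = 1) {
  in_frame <- substring(s, frame)
  regmatches(in_frame, gregexpr("...", in_frame))[[1]]
}

s <- "NNATGGCCACG"
print(codons_from(s))             # "NNA" "TGG" "CCA"
print(codons_from(s, frame = 3))  # "ATG" "GCC" "ACG"
```

With frame = 3 the two leading N characters are skipped, so the reading frame starts at the ATG; this mirrors the change from "*WP" to "MAT" in the example output.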

Value

A vector of character strings representing the translated alignment. The characters are amino acid codes, with '$' representing a stop codon, '*' denoting missing data or a codon with 1 or 2 gaps, and '-' denoting a codon that is all gaps.
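
The encoding can be illustrated with a small base R sketch. It assumes the standard genetic code and uppercase input; codon_table and encode_codons are hypothetical names introduced here, not rphast functions, and the package may treat edge cases (for example ambiguity codes) differently.

```r
# Standard genetic code in TCAG order, with "$" marking stop codons to
# match the encoding described above (an assumption for illustration).
bases <- c("T", "C", "A", "G")
grid <- expand.grid(third = bases, second = bases, first = bases,
                    stringsAsFactors = FALSE)
codon_table <- strsplit(paste0(
  "FFLLSSSSYY$$CC$W", "LLLLPPPPHHQQRRRR",
  "IIIMTTTTNNKKSSRR", "VVVVAAAADDEEGGGG"), "")[[1]]
names(codon_table) <- paste0(grid$first, grid$second, grid$third)

# Hypothetical helper: map a vector of codons to one encoded string.
encode_codons <- function(codons) {
  aa <- unname(codon_table[codons])  # NA for anything that is not a plain ACGT codon
  aa[is.na(aa)] <- "*"               # missing data or a partly gapped codon
  aa[codons == "---"] <- "-"         # a codon made up entirely of gaps
  paste(aa, collapse = "")
}

print(encode_codons(c("ATG", "GCC", "ACG")))  # "MAT"
print(encode_codons(c("NNA", "TGG", "CCA")))  # "*WP"
print(encode_codons(c("AT-", "---", "TGA")))  # "*-$"
```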

Author(s)

Melissa J. Hubisz

Examples

library(rphast)  # provides msa() and translate.msa()

# here is a little portion of the SOL1 gene
seqs <- c("ATGGCGACGAAGGCCGTGTGCGTGCTGAAGGGCGACGGCCCAGTGCAGG
           GCATCATCAATTTCGAGCAGAAGGCAAGGGCTGGGACGGAGGCTTGTTT
           GCGAGGCCGCTCCCACCCGCTCGTCCCCCCGCGCACCTTTGCTAGGAGC
           GGGTCGC----CCGCCAGGC-CTCGGGGCCGCCCTGGTCCAGCGCCCGG
           TCCCGGCCCGTGCCGCCCGGTCGGTGCCTTCGCCCCCAGCGGTGCGGTG
           CCCAAGTGCTGAGTCACCGGGCGGGCCCGGGC----GCGGGGCGTGGGA
           ---------CCGAGGCCGCCGCGGG",
          "ATGGCGACGAAGGCCGTGTGCGTGCTGAAGGGCGATGGCCCAGTGCAGG
           GCATCATCAATTTCGAGCAGAAGGCAAGGGCTGGGACGGAGGCTTGTTT
           GCGAGGCCGCTCCTACCCGCTCGTCCCCCCGCGCACCTTTGCTAGGAGC
           GGGTCGC----CCGCCAGGC-CTCGGGGCTGCCCTGGTCCAGCGCCCGG
           TCCCGGCCCGTGCCGCCCGGTCGGTGCCTTCGCCCCCAGCGGTGCGGTG
           CCCAAGTGCTGAGTCACCGGGCGGGCCCGGGC----GCGGGGTGTGGGA
           ---------CCGAGGCCGCCGCGGG",
          "ATGGCGATGAAAGCGGTGTGCGTGCTGAAGGGCGACGGTCCGGTGCAGG
           GAACCATCCACTTCGAGCAGAAGGCAAGGCCCGGGGC------------
           ----------------------------------------GCGGGGCGC
           AGGCCGCGGTGACGCGGCGCACCTGTGCGGGAGCACGCCACGCCCCCG-
           CCACGGCCTGAG----------------------CCCG-----------
           -CTAAGTGCTGAGTCACC--GTGGCCTGGGGCAGGGGCTGGGCGCCGGG
           AAGCGAGGCCCGGGGC-GCCGC***")
seqs <- gsub("\\s", "", seqs) #remove whitespace from seqs
align <- msa(seqs, names=c("hg19", "panTro2", "mm9"))

translate.msa(align)
translate.msa(msa(c("NNATGGCCACG")))
translate.msa(msa(c("NNATGGCCACG")), frame=3)
translate.msa(msa(c("NNATGGCCACG", "AT--GGCCACG")))
translate.msa(msa(c("NNATGGCCACG", "AT--GGCCACG")), one.frame=FALSE)
translate.msa(msa(c("NNATGGCCACG", "AT--GGCCACG")), one.frame=FALSE, frame=c(3,1))

Example output

[1] "MATKAVCVLKGDGPVQGIINFEQKARAGTEACLRGRSHPLVPPRTFARSGS**RQ*LGAALVQRPVPARAARSVPSPPAVRCPSAESPGGPG**RGVG---PRPPR"
[2] "MATKAVCVLKGDGPVQGIINFEQKARAGTEACLRGRSYPLVPPRTFARSGS**RQ*LGAALVQRPVPARAARSVPSPPAVRCPSAESPGGPG**RGVG---PRPPR"
[3] "MAMKAVCVLKGDGPVQGTIHFEQKARPG*-----------------AGRRPR$RGAPVREHATPP*TA$*------*P----LSAES**GLGQGLGAGKRGPG*P*"
[1] "*WP"
[1] "MAT"
[1] "*WP" "**P"
[1] "*WP" "MAT"
[1] "MAT" "MAT"

rphast documentation built on May 1, 2019, 9:26 p.m.