peptide_phychem_index | R Documentation |
This function applies numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids to protein-coding DNA or to amino acid sequences. As a result, the protein-coding DNA or amino acid sequences are represented as numerical vectors, which can be subjected to further downstream statistical analysis and digital signal processing.
peptide_phychem_index(aa, ...)

## S4 method for signature 'character'
peptide_phychem_index(
    aa,
    acc = NULL,
    aaindex = NA,
    userindex = NULL,
    alphabet = c("AA", "DNA"),
    genetic.code = getGeneticCode("1"),
    no.init.codon = FALSE,
    if.fuzzy.codon = "error",
    ...
)

## S4 method for signature 'DNAStringSet_OR_DNAMultipleAlignment'
peptide_phychem_index(
    aa,
    acc = NULL,
    aaindex = NA,
    userindex = NULL,
    alphabet = c("AA", "DNA"),
    genetic.code = getGeneticCode("1"),
    no.init.codon = FALSE,
    if.fuzzy.codon = "error",
    num.cores = 1L,
    tasks = 0L,
    verbose = FALSE,
    ...
)
aa |
A character string, a |
... |
Not in use. |
acc |
Accession id for a specified mutation or contact potential matrix. |
aaindex |
Database where the requested accession id is located and from which the amino acid indices can be obtained. The possible values are: "aaindex2" or "aaindex3". |
userindex |
User-provided amino acid indices. This can be a numerical vector or a numerical matrix (20 x 20). If a numerical matrix is provided, then the amino acid indices are computed as column averages. |
alphabet |
Whether the alphabet is the 20-letter amino acid alphabet (AA) or the four-letter DNA/RNA base alphabet (DNA). This prevents ambiguity: for example, the string "ACG" could be a base triplet in the DNA alphabet or simply the amino acid sequence alanine, cysteine, glycine. |
genetic.code , no.init.codon , if.fuzzy.codon |
The same as given in function translation. |
num.cores , tasks |
Parameters for parallel computation using package
|
verbose |
If TRUE, prints the function log to stdout. |
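To illustrate the matrix case of userindex described above, the following base-R sketch shows how a 20 x 20 matrix collapses to one index value per amino acid through column averaging. The helper name index_from_matrix, the residue ordering, and the random matrix are all assumptions made for illustration; none of them come from the package, whose internal handling may differ.

```r
# Hypothetical helper, not part of the package: collapse a 20 x 20
# matrix into one value per amino acid by averaging each column.
aa_letters <- c("A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
                "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

index_from_matrix <- function(m) {
    stopifnot(is.matrix(m), is.numeric(m), all(dim(m) == c(20L, 20L)))
    colMeans(m)  # keeps the column names as names of the result
}

set.seed(1)
# Placeholder values only; a real matrix would be supplied by the user.
m <- matrix(runif(400), nrow = 20, ncol = 20,
            dimnames = list(aa_letters, aa_letters))
idx <- index_from_matrix(m)

# Map a short peptide onto the index, one value per residue.
peptide <- strsplit("ACG", "")[[1]]
unname(idx[peptide])
```

Naming the matrix dimensions with one-letter residue codes keeps the lookup by residue explicit, which is the only design choice this sketch depends on.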
If a DNA sequence is given, then it is assumed to be a sequence of DNA base triplets (codons), i.e., the sequence length must be a multiple of 3.
Errors can arise if the given sequences carry letters that are not from the DNA or amino acid alphabet.
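Both conditions can be checked before calling the function. The sketch below is a minimal pre-check in base R; the helper name check_coding_dna is hypothetical, and restricting the alphabet to the four unambiguous bases A, C, G, T is an assumption of this example (the if.fuzzy.codon argument suggests the function itself may treat ambiguous codons differently).

```r
# Hypothetical pre-check, not part of the package: flag sequences whose
# length is not a multiple of 3 or that contain letters outside A/C/G/T.
check_coding_dna <- function(x) {
    x <- toupper(x)
    data.frame(
        seq        = x,
        length_ok  = nchar(x) %% 3L == 0L,   # whole number of codons
        letters_ok = !grepl("[^ACGT]", x),   # only unambiguous bases
        stringsAsFactors = FALSE
    )
}

res <- check_coding_dna(c("ACGTCATAAAGT",   # 12 bases, valid letters
                          "ACGTCATAAAG",    # 11 bases, incomplete codon
                          "ACGNCATAAAGT"))  # contains the ambiguity code N
res
```

Sequences failing either column can then be inspected or filtered out before computing the indices.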
Depending on the user specifications, a mutation or contact potential matrix, a list of available matrices (indices) ids or index names can be returned. More specifically:
Returns an aminoacid mutation matrix or a statistical protein contact potentials matrix.
Returns the specified aminoacid physicochemical indices.
Robersy Sanchez https://genomaths.com
## Let's create a DNAStringSet-class object
base <- DNAStringSet(x = c( seq1 ='ACGTCATAAAGT',
                            seq2 = 'GTGTAATACAGT',
                            seq3 = 'TCCTCATAAGGT'))
## The stop codon 'TAA' yields NA
aa <- peptide_phychem_index(base, acc = "EISD840101")
aa
## Description of the physicochemical index
slot(aa, 'phychem')
## Get the aminoacid sequences. The stop codon 'TAA' is replaced by '*'.
slot(aa, 'seqs')
aa <- peptide_phychem_index(base, acc = "MIYS850103", aaindex = "aaindex3")
aa
## Description of the physicochemical index
slot(aa, 'phychem')