EDSSMat | R Documentation |
The EDSSMat series of matrices was developed and described in
Trivedi and Nagarajaram (2019).
In short, these are substitution scoring matrices
for aligning proteins or regions that are intrinsically disordered.
The alignment blocks used to compute the matrix values were composed of
predicted intrinsically disordered regions. In the source article, EDSSMat
matrices gave significantly smaller E-values than more commonly used
substitution matrices (such as BLOSUM and PAM) when aligning regions of disorder.
EDSSMat62 was also shown to identify both close and distant
homologs of a specific IDP, while other matrices could identify only some
close homologs. See the source article for additional information
and for comparisons to other matrices.
Please cite the source article when using any EDSSMat matrix.
EDSSMat50 EDSSMat60 EDSSMat62 EDSSMat70 EDSSMat75 EDSSMat80 EDSSMat90
All matrices are symmetric. 24 residues are represented:
Each of the 20 standard amino acids
Four ambiguous residues:
B: Asparagine or Aspartic Acid (Asx)
Z: Glutamine or Glutamic Acid (Glx)
X: Unspecified or unknown amino acid
*: Stop
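The properties stated above (24 residues, symmetric scores) can be checked with a few lines of base R. The following is a minimal sketch and not part of idpr: `check_scoring_matrix` is a hypothetical helper, the toy matrix holds made-up scores so the code runs without idpr, and the optional last step assumes the data set can be reached as `idpr::EDSSMat62`.

```r
# Hypothetical helper (not part of idpr): checks that a scoring matrix is
# square with the expected number of residues, has matching row and column
# names, and is symmetric.
check_scoring_matrix <- function(m, n_residues = 24L) {
  stopifnot(is.matrix(m), is.numeric(m))
  same_dims  <- all(dim(m) == n_residues)
  named_same <- !is.null(rownames(m)) && identical(rownames(m), colnames(m))
  symmetric  <- isSymmetric(unname(m))
  c(dims = same_dims, names = named_same, symmetric = symmetric)
}

# Toy 3 x 3 matrix with made-up scores, used only so the sketch is runnable.
toy <- matrix(c( 4, -1, -2,
                -1,  5,  0,
                -2,  0,  6),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("A", "R", "N"), c("A", "R", "N")))
toy_check <- check_scoring_matrix(toy, n_residues = 3L)
print(toy_check)

# Optional: run the same check on a real matrix when idpr is installed.
# Assumes the data set is reachable as idpr::EDSSMat62.
if (requireNamespace("idpr", quietly = TRUE)) {
  real <- tryCatch(idpr::EDSSMat62, error = function(e) NULL)
  if (!is.null(real)) print(check_scoring_matrix(real))
}
```

With idpr attached, the same call should work for any of the matrices listed in the Usage section.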
An object of class matrix (inherits from array) with 24 rows and 24 columns.
There are 7 reported EDSSMat matrices. They differ in the percent
identity threshold used to cluster protein sequences:
EDSSMat50 clustered proteins with 50% identity or higher,
EDSSMat62 clustered proteins with 62% identity or higher, etc.
See the Usage section for the available matrices.
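Because the number in each matrix name is the clustering threshold, the thresholds can be recovered from the names. This is a small illustrative sketch in base R; the variable names `edssmat_names` and `identity_threshold` are chosen here and are not part of idpr.

```r
# Matrix names as listed in the Usage section.
edssmat_names <- c("EDSSMat50", "EDSSMat60", "EDSSMat62", "EDSSMat70",
                   "EDSSMat75", "EDSSMat80", "EDSSMat90")

# The trailing number is the percent identity threshold used for clustering.
identity_threshold <- as.integer(sub("^EDSSMat", "", edssmat_names))
names(identity_threshold) <- edssmat_names
print(identity_threshold)
```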
The gap parameters below were described in the source article and reported in
Supplemental Table S5. It is recommended to use them
for any alignment that uses the respective EDSSMat matrix. They were
determined for 3 categories of protein: Less Disorder (LD),
defined as [0-20%] disorder; Moderate Disorder (MD), defined as (20-40%]
disorder; and High Disorder (HD), defined as (40-100%] disorder.
Please see the source article for additional information.
Matrix Name | Gap Open (LD) | Gap Extension (LD) | Gap Open (MD) | Gap Extension (MD) | Gap Open (HD) | Gap Extension (HD) |
EDSSMat60 | -7 | -1 | -6 | -2 | -14 | -3 |
EDSSMat62 | -8 | -1 | -5 | -2 | -19 | -2 |
EDSSMat70 | -7 | -1 | -5 | -2 | -19 | -2 |
EDSSMat75 | -8 | -1 | -5 | -2 | -19 | -2 |
EDSSMat80 | -7 | -1 | -5 | -2 | -15 | -3 |
EDSSMat90 | -7 | -1 | -5 | -2 | -19 | -2 |
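The table can be turned into a lookup so that the gap parameters follow from the matrix name and the percent disorder of the protein. The following is a minimal base R sketch under stated assumptions: `edss_gaps`, `disorder_category`, and `edss_gap_penalties` are hypothetical names that are not part of idpr, and the values are copied from the table above.

```r
# Gap parameters copied from the table above. Only the matrices listed in
# the table are included.
edss_gaps <- data.frame(
  matrix  = c("EDSSMat60", "EDSSMat62", "EDSSMat70",
              "EDSSMat75", "EDSSMat80", "EDSSMat90"),
  open_LD = c(-7, -8, -7, -8, -7, -7),
  ext_LD  = c(-1, -1, -1, -1, -1, -1),
  open_MD = c(-6, -5, -5, -5, -5, -5),
  ext_MD  = c(-2, -2, -2, -2, -2, -2),
  open_HD = c(-14, -19, -19, -19, -15, -19),
  ext_HD  = c(-3, -2, -2, -2, -3, -2),
  stringsAsFactors = FALSE
)

# Map percent disorder to a category: [0-20%] is LD, (20-40%] is MD,
# and (40-100%] is HD.
disorder_category <- function(percent_disorder) {
  stopifnot(is.numeric(percent_disorder),
            all(percent_disorder >= 0 & percent_disorder <= 100))
  as.character(cut(percent_disorder,
                   breaks = c(0, 20, 40, 100),
                   labels = c("LD", "MD", "HD"),
                   include.lowest = TRUE, right = TRUE))
}

# Look up the gap open and gap extension values for one matrix and one protein.
edss_gap_penalties <- function(matrix_name, percent_disorder) {
  stopifnot(length(percent_disorder) == 1L)
  row <- edss_gaps[edss_gaps$matrix == matrix_name, ]
  if (nrow(row) != 1L) stop("No gap parameters tabulated for ", matrix_name)
  category <- disorder_category(percent_disorder)
  c(gap_open      = row[[paste0("open_", category)]],
    gap_extension = row[[paste0("ext_", category)]])
}

print(edss_gap_penalties("EDSSMat62", 55))
```

Alignment functions differ in whether they expect gap penalties as negative scores or as positive costs, so check the sign convention of the chosen aligner before passing these values on.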
Trivedi, R., Nagarajaram, H.A. Amino acid substitution scoring matrices specific to intrinsically disordered regions in proteins. Sci Rep 9, 16380 (2019). https://doi.org/10.1038/s41598-019-52532-8
Disordered Matrices Vignette within the idpr package
Other IDP-based Substitution Matrices: DUNMat, DisorderMat