DisorderMat: Disorder-based Substitution Matrices.

DisorderMat    R Documentation

Disorder-based Substitution Matrices.

Description

The Disorder40, Disorder60, and Disorder85 matrices were developed and described in Brown et al. (2009).
In short: These are substitution scoring matrices used to align proteins or regions that exhibit intrinsic disorder. The matrices were calculated using pairwise sequence alignments of protein families, which were identified from 287 experimentally confirmed Intrinsically Disordered Proteins (IDPs). Each IDP contained at least 30 sequential residues of intrinsic disorder, and protein families were found using BLAST.
The source did not include a comprehensive comparison to other frequently used substitution matrices (like BLOSUM and PAM) in terms of improving IDP sequence alignments. The authors note that these matrices were made to compare evolutionary characteristics of disordered and ordered proteins. Please see the source material for additional information.

Trivedi and Nagarajaram (2019) compared EDSSMat62 against all three Disorder matrices. On average, Disorder40 and Disorder85 attain lower E-values than EDSSMat62 for highly disordered proteins, whereas EDSSMat62 attains lower E-values than Disorder60 for the same class of proteins. EDSSMat62 performs better than all three Disorder matrices for IDPs enriched in ordered regions. Please see the referenced paper, specifically Supplementary Figures S18-20, for additional information and the original comparison.

Additionally, please cite the source article when using Disorder40, Disorder60, or Disorder85.

Usage

Disorder40

Disorder60

Disorder85
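
As a rough illustration of how a substitution matrix of this kind is applied, the sketch below sums per-position scores for two pre-aligned, ungapped sequences. The function name score_ungapped and the 2 x 2 toy matrix are hypothetical and are not part of idpr. The sketch assumes only that the matrix carries single-letter residue codes as row and column names; if the package's matrices follow that layout, one of them could be passed as sub_mat in place of the toy matrix.

```r
# Hypothetical helper: sum substitution scores over two aligned,
# equal-length sequences (no gaps are handled in this sketch).
score_ungapped <- function(seq1, seq2, sub_mat) {
  a <- strsplit(seq1, "")[[1]]
  b <- strsplit(seq2, "")[[1]]
  stopifnot(length(a) == length(b))
  # Every residue must appear in the matrix dimnames.
  stopifnot(all(a %in% rownames(sub_mat)), all(b %in% colnames(sub_mat)))
  # A two-column character matrix indexes by (row name, column name) pairs.
  sum(sub_mat[cbind(a, b)])
}

# Toy symmetric matrix with made-up scores, for illustration only.
toy <- matrix(c(2, -1, -1, 3), nrow = 2,
              dimnames = list(c("A", "G"), c("A", "G")))

score_ungapped("AGA", "AGG", toy)  # 2 + 3 + (-1) = 4
```

Real alignment tools also apply gap penalties and search over possible alignments; this sketch covers only the scoring step.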

Format

All matrices are symmetric. 24 residues are represented:

  • Each of the 20 standard amino acids

  • Four ambiguous residues:

    • B: Asparagine or Aspartic Acid (Asx)

    • Z: Glutamine or Glutamic Acid (Glx)

    • X: Unspecified or unknown amino acid

    • *: Stop

Each of the three matrices is an object of class matrix (inherits from array) with 24 rows and 24 columns.
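
The properties listed in this section (numeric matrix, 24 x 24, symmetric) can be checked programmatically. The sketch below uses a hypothetical helper, check_subst_matrix, which is not part of idpr, and exercises it on a placeholder identity matrix. The residue ordering used for the placeholder is an assumption made for illustration and may differ from the ordering in the package's matrices.

```r
# Hypothetical helper: TRUE if m is a numeric, square matrix of the
# expected size with matching row/column names and symmetric values.
check_subst_matrix <- function(m, expected_dim = 24L) {
  is.matrix(m) && is.numeric(m) &&
    all(dim(m) == expected_dim) &&
    identical(rownames(m), colnames(m)) &&
    isSymmetric(unname(m))
}

# Placeholder matrix: the residue order here is assumed, not taken from idpr.
residues <- strsplit("ARNDCQEGHILKMFPSTWYVBZX*", "")[[1]]
placeholder <- diag(length(residues))
dimnames(placeholder) <- list(residues, residues)

check_subst_matrix(placeholder)  # TRUE for this placeholder

# Breaking symmetry in one cell makes the check fail.
broken <- placeholder
broken["A", "R"] <- 5
check_subst_matrix(broken)       # FALSE
```

With idpr installed, the same helper could be applied to Disorder40, Disorder60, or Disorder85.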

Optimal Gap Parameters

As mentioned in the Description, the intended use of these matrices was not to improve sequence alignments. Therefore, no gap penalty values are provided.

However, a more recent work, Trivedi and Nagarajaram (2019), determined optimal gap parameters based on the disordered content of query sequences, as reported in the paper's Supplementary Table S5. The LD, MD, and HD columns below refer to the disorder-content classes used in that paper.

Matrix Name   Gap Open (LD)   Gap Extension (LD)   Gap Open (MD)   Gap Extension (MD)   Gap Open (HD)   Gap Extension (HD)
Disorder40    -20             -1                   -7              -1                   -7              -1
Disorder60    -20             -1                   -16             -1                   -11             -2
Disorder85    -20             -1                   -16             -1                   -7              -2
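
For scripted use, the values in the table above can be held in a small lookup structure. The sketch below transcribes the table into a data frame and adds a hypothetical accessor, get_gap_params. Neither object is provided by idpr, and the class labels are used exactly as they appear in the column headers.

```r
# Gap parameters transcribed from the table above.
gap_params <- data.frame(
  matrix        = rep(c("Disorder40", "Disorder60", "Disorder85"), each = 3),
  disorder      = rep(c("LD", "MD", "HD"), times = 3),
  gap_open      = c(-20,  -7,  -7,
                    -20, -16, -11,
                    -20, -16,  -7),
  gap_extension = c(-1, -1, -1,
                    -1, -1, -2,
                    -1, -1, -2),
  stringsAsFactors = FALSE
)

# Hypothetical accessor: return the gap open/extension pair for one
# matrix name and one disorder-content class.
get_gap_params <- function(matrix_name, disorder_class) {
  hit <- gap_params[gap_params$matrix == matrix_name &
                      gap_params$disorder == disorder_class, ]
  if (nrow(hit) != 1L) {
    stop("No unique entry for ", matrix_name, " / ", disorder_class)
  }
  c(gap_open = hit$gap_open, gap_extension = hit$gap_extension)
}

get_gap_params("Disorder60", "HD")
```

Alignment functions differ in whether they expect gap penalties as negative scores or as positive costs, so the sign convention should be checked against the documentation of whichever aligner is used.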

Please see the referenced paper for additional information and original reporting. Additionally, please see EDSSMat.

Additional Reference

Trivedi, R., Nagarajaram, H.A. Amino acid substitution scoring matrices specific to intrinsically disordered regions in proteins. Sci Rep 9, 16380 (2019). https://doi.org/10.1038/s41598-019-52532-8

Source

Brown, C. J., Johnson, A. K., & Daughdrill, G. W. (2009). Comparing Models of Evolution for Ordered and Disordered Proteins. Molecular Biology and Evolution, 27(3), 609-621. doi:10.1093/molbev/msp277

See Also

Disordered Matrices Vignette within the idpr package and EDSSMat62

Other IDP-based Substitution Matrices: DUNMat, EDSSMat


wmm27/idpr documentation built on Jan. 12, 2023, 8:45 a.m.