DUNMat: A Substitution Matrix for Aligning Intrinsically Disordered...

DUNMat R Documentation

A Substitution Matrix for Aligning Intrinsically Disordered Proteins

Description

This matrix was developed and described in Radivojac et al. (2002).
The name "DUNMat" is taken from Trivedi and Nagarajaram (2019). This is to keep naming consistent and distinct from other matrices named "Disorder".
In short: This is a substitution scoring matrix for aligning proteins or protein regions that are intrinsically disordered. The scores are derived from proteins with long disordered regions (LDRs), defined in the source paper as intrinsically disordered regions (IDRs) of at least 40 consecutive residues. Fifty-five protein families with LDRs were used to generate the data. Direct comparisons against BLOSUM or PAM matrices were not made in the source paper because of differences in scaling. However, when performance was ranked, this matrix performed best at aligning proteins with less than 50% sequence identity. Please see the source material, specifically Table 2, for additional information.

Trivedi and Nagarajaram (2019) compared EDSSMat62 and DUNMat and showed that DUNMat, on average, attained smaller E-values in the dataset of IDPs enriched in ordered regions, whereas EDSSMat62 attained smaller E-values in sets of highly disordered IDPs. Please see the referenced paper, specifically Supplementary Figure S21, for additional information and the original comparison.

Additionally, please cite the source article when using the "DUNMat" matrix.

Usage

DUNMat

Format

A symmetric 20x20 matrix representing the 20 standard amino acids.
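
As a minimal sketch of what this format implies, the base R check below tests whether an object is a symmetric 20x20 numeric matrix named by the 20 standard amino acids. The helper name is_valid_substitution_matrix and the toy matrix are illustrative assumptions and are not part of idpr. The check also assumes that the matrix carries one-letter amino acid codes as its row and column names.

```r
# The 20 standard amino acids as one-letter codes.
standard_aa <- c("A", "R", "N", "D", "C", "Q", "E", "G", "H", "I",
                 "L", "K", "M", "F", "P", "S", "T", "W", "Y", "V")

# Hypothetical helper (not an idpr function): TRUE if `m` is a symmetric
# 20x20 numeric matrix whose row and column names are the 20 standard
# amino acids, in any order.
is_valid_substitution_matrix <- function(m) {
  is.matrix(m) && is.numeric(m) &&
    identical(dim(m), c(20L, 20L)) &&
    !is.null(rownames(m)) &&
    identical(rownames(m), colnames(m)) &&
    setequal(rownames(m), standard_aa) &&
    isSymmetric(m)
}

# Placeholder matrix (NOT the DUNMat scores): +1 on the diagonal, -1 elsewhere.
toy <- matrix(-1, nrow = 20, ncol = 20,
              dimnames = list(standard_aa, standard_aa))
diag(toy) <- 1

is_valid_substitution_matrix(toy)
```

With idpr loaded, the same helper could be applied to DUNMat itself, provided the naming assumption holds.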

Optimal Gap Parameters

These values were described in the source article and reported in Table 2. After the optimal parameters were determined, the authors further refined the gap costs. Therefore, it is recommended to use these parameters for any alignment utilizing this matrix. These were:

DUNMat                  Gap Open   Gap Extension
Original Optimization   -3         -0.5
Further Refinement      -3.2       -0.1
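
To illustrate how the two parameter sets differ in practice, the sketch below scores gaps of several lengths under each set. It assumes the convention that a gap of k residues scores gap_open + k * gap_extension. The source article may define gap length differently (some tools charge k - 1 extensions), so treat the function gap_score and its convention as assumptions rather than the authors' method.

```r
# Gap parameters copied from the table above (negative values are penalties).
gap_params <- data.frame(
  set           = c("original", "refined"),
  gap_open      = c(-3.0, -3.2),
  gap_extension = c(-0.5, -0.1)
)

# Hypothetical helper. ASSUMED convention: a gap of k residues scores
# gap_open + k * gap_extension. Some tools use (k - 1) instead of k.
gap_score <- function(k, gap_open, gap_extension) {
  gap_open + k * gap_extension
}

# Score gaps of length 1, 5, and 20 under each parameter set.
gap_lengths <- c(1, 5, 20)
scores <- sapply(gap_lengths, function(k) {
  gap_score(k, gap_params$gap_open, gap_params$gap_extension)
})
dimnames(scores) <- list(gap_params$set, paste0("k=", gap_lengths))
scores
```

Under this assumed convention, a 20-residue gap scores -13 with the original parameters and -5.2 with the refined ones, so the refinement mainly makes long gaps cheaper.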

It should also be noted that a more recent work, Trivedi and Nagarajaram (2019), determined optimal parameters based on the disordered content of query sequences, as reported in the paper's Supplementary Table S5.

Matrix Name   Gap Open (LD)   Gap Extension (LD)   Gap Open (MD)   Gap Extension (MD)   Gap Open (HD)   Gap Extension (HD)
DUNMat        -6              -1                   -6              -1                   -16             -2
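
One way to use these values in a script is a small lookup keyed by disorder class, sketched below in base R. The names dunmat_gap_by_class and dunmat_gap_for are hypothetical and not part of idpr. LD, MD, and HD are the class labels from the table above. How a query sequence is assigned to a class is defined in the referenced paper and is not reproduced here, so the helper takes the label as input.

```r
# Values copied from the table above. LD, MD, and HD are the
# disorder-content classes of the query sequence.
dunmat_gap_by_class <- data.frame(
  gap_open      = c(-6, -6, -16),
  gap_extension = c(-1, -1, -2),
  row.names     = c("LD", "MD", "HD")
)

# Hypothetical helper: return the gap parameters for one class label.
# match.arg() stops with an error if the label is not LD, MD, or HD.
dunmat_gap_for <- function(disorder_class) {
  disorder_class <- match.arg(disorder_class, rownames(dunmat_gap_by_class))
  as.list(dunmat_gap_by_class[disorder_class, ])
}

dunmat_gap_for("HD")
```

Note that some alignment functions expect gap costs as positive magnitudes, so check the sign convention of the aligner before passing these values on.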

Please see the referenced paper for additional information and original reporting. Additionally, please see EDSSMat.

Additional Reference

Trivedi, R., Nagarajaram, H.A. Amino acid substitution scoring matrices specific to intrinsically disordered regions in proteins. Sci Rep 9, 16380 (2019). https://doi.org/10.1038/s41598-019-52532-8

Source

Radivojac, P., Obradovic, Z., Brown, C. J., & Dunker, A. K. (2001). Improving sequence alignments for intrinsically disordered proteins. In Biocomputing 2002 (pp. 589-600): World Scientific. https://doi.org/10.1142/9789812799623_0055

See Also

EDSSMat62 and the Disordered Matrices Vignette within idpr

Other IDP-based Substitution Matrices: DisorderMat, EDSSMat


wmm27/idpr documentation built on Jan. 12, 2023, 8:45 a.m.