protr-package: Generating Various Numerical Representation Schemes for...

Description Details Note References


The protr package is a comprehensive toolkit for generating various numerical representation schemes of protein sequence. The descriptors are extensively utilized in bioinformatics and chemogenomics research. The commonly used descriptors include amino acid composition, autocorrelation, CTD, conjoint traid, quasi-sequence order, pseudo amino acid composition, and profile-based descriptors derived by Position-Specific Scoring Matrix (PSSM). The descriptors for proteochemometric (PCM) modeling include the scales-based descriptors derived by principal components analysis, factor analysis, multidimensional scaling, amino acid properties (AAindex), 20+ classes of 2D and 3D molecular descriptors (Topological, WHIM, VHSE, etc.), and BLOSUM/PAM matrix-derived descriptors. The protr package also integrates the function of parallelized similarity computation derived by pairwise protein sequence alignment and Gene Ontology (GO) semantic similarity measures.


Package: protr
Type: Package
License: BSD_3_clause


The package vignette can be opened with vignette('protr').

The web server for this package, ProtrWeb is located at:

Bug reports and feature requests should be sent to


Xiao, N., Cao, D.-S., Zhu, M.-F., and Xu, Q.-S. (2015). protr/ProtrWeb: R package and web server for generating various numerical representation schemes of protein sequences. Bioinformatics 31 (11), 1857–1859.

Search within the protr package
Search all R packages, documentation and source code

Questions? Problems? Suggestions? or email at

Please suggest features or report bugs with the GitHub issue tracker.

All documentation is copyright its authors; we didn't write any of that.