extractPCMBLOSUM: Generalized BLOSUM and PAM Matrix-Derived Descriptors

extractPCMBLOSUM R Documentation

Description

Calculates generalized BLOSUM and PAM matrix-derived descriptors for a protein sequence.

Usage

extractPCMBLOSUM(x, submat = "AABLOSUM62", k, lag, scale = TRUE, silent = TRUE)

Arguments

x

A character vector giving the input protein sequence.

submat

Substitution matrix for the 20 amino acids. Should be one of AABLOSUM45, AABLOSUM50, AABLOSUM62, AABLOSUM80, AABLOSUM100, AAPAM30, AAPAM40, AAPAM70, AAPAM120, AAPAM250. Default is 'AABLOSUM62'.

k

Integer. The number of selected scales (i.e. the first k scales) derived from the substitution matrix. It can be chosen according to the printed relative importance values (see silent).

lag

The lag parameter. Must be less than the number of amino acids in the sequence.

scale

Logical. Whether to auto-scale the substitution matrix (submat) before performing the eigendecomposition. Default is TRUE.

silent

Logical. If FALSE, the relative importance of each scale (the diagonal values of the eigendecomposition result matrix B) is printed. Default is TRUE (nothing is printed).

Details

This function calculates the generalized BLOSUM and PAM matrix-derived descriptors. For users' convenience, Rcpi provides the BLOSUM45, BLOSUM50, BLOSUM62, BLOSUM80, BLOSUM100, PAM30, PAM40, PAM70, PAM120, and PAM250 matrices for the 20 amino acids to choose from.
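
The arguments above mention scales obtained by eigendecomposition of the substitution matrix. The following is a minimal sketch of that idea in base R, not the package's internal code: it uses a small hypothetical symmetric matrix (toy values, not a real BLOSUM or PAM matrix), skips the optional auto-scaling step, and assumes the scales are simply the leading eigenvectors.

```r
# Toy symmetric 4 x 4 "substitution matrix" (hypothetical values, NOT BLOSUM/PAM)
aa <- c("A", "R", "N", "D")
toy <- matrix(c( 4, -1, -2,  0,
                -1,  5, -3, -2,
                -2, -3,  6, -1,
                 0, -2, -1,  4),
              nrow = 4, byrow = TRUE, dimnames = list(aa, aa))

# Eigendecomposition; eigenvalues are returned in decreasing order
eig <- eigen(toy, symmetric = TRUE)

# Eigenvalues as a rough indicator of how much each candidate scale contributes
importance <- eig$values
print(importance)

# Keep the first k eigenvectors as k per-residue scales (assumption of this sketch)
k <- 2
scales <- eig$vectors[, 1:k, drop = FALSE]
rownames(scales) <- aa
print(scales)
```

Each row of scales then gives a k-dimensional numeric encoding of one residue, which is the kind of encoding the printed relative importance values are meant to help choose k for.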

Value

A named vector of length lag * p^2, where p is the number of scales selected (the argument k).
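
To see where a length of lag * p^2 can come from, here is a hedged sketch that assumes the descriptors are auto and cross covariances between every pair of the p scales at lags 1 to lag. The helper toy_acc is hypothetical (it is not an Rcpi function), and the encoded sequence is random stand-in data rather than a real protein.

```r
# Hypothetical helper (not part of Rcpi): auto/cross covariance between
# every pair of p scales at lags 1..lag
toy_acc <- function(mat, lag) {
  p <- nrow(mat)  # number of scales
  n <- ncol(mat)  # sequence length
  stopifnot(lag < n)  # the lag must be shorter than the sequence
  centered <- mat - rowMeans(mat)  # center each scale along the sequence
  out <- numeric(0)
  for (l in seq_len(lag)) {
    # p x p matrix of lagged products, averaged over the n - l valid positions
    left  <- centered[, 1:(n - l), drop = FALSE]
    right <- centered[, (1 + l):n, drop = FALSE]
    out <- c(out, as.vector(left %*% t(right) / (n - l)))
  }
  out
}

set.seed(1)
encoded <- matrix(rnorm(5 * 30), nrow = 5)  # p = 5 scales, 30 residues (random data)
desc <- toy_acc(encoded, lag = 7)
length(desc)  # 7 * 5^2 = 175
```

Under this assumption, each lag contributes one p x p block of values, which also shows why lag has to be smaller than the sequence length.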

References

Georgiev, A. G. (2009). Interpretable numerical descriptors of amino acid space. Journal of Computational Biology, 16(5), 703–723.

Examples

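# Read the example sequence (P00750) shipped with Rcpi
x = readFASTA(system.file('protseq/P00750.fasta', package = 'Rcpi'))[[1]]
# First 5 scales, lag 7; silent = FALSE prints the relative importance of the scales
blosum = extractPCMBLOSUM(x, submat = 'AABLOSUM62', k = 5, lag = 7, scale = TRUE, silent = FALSE)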


nanxstats/Rcpi documentation built on Sept. 24, 2024, 9:36 a.m.