SOCNumber: Sequence Order Coupling Number (SOCNumber)


View source: R/SOCNumber.R

Description

This function uses the Grantham and Schneider dissimilarity matrices to compute the dissimilarity between pairs of amino acids. The distance between the two amino acids in a pair is given by d, which ranges from 1 to nlag. For each d, the function sums the dissimilarities of all amino acid pairs that are d positions apart in the sequence; this sum is the value of tau for that d. The feature vector contains the tau values for both matrices, so its length is nlag*2.
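The summation described above can be sketched in base R. This is only an illustration under stated assumptions: the helper tau() and the matrix toy_dis are hypothetical and not part of ftrCOOL, the dissimilarity values are made up (they are not Grantham or Schneider values), and the sketch sums the raw dissimilarities exactly as the description reads, which may differ in detail from what the package does internally.

```r
# Toy 3-letter alphabet with a made-up symmetric dissimilarity matrix.
# Illustrative numbers only; NOT Grantham or Schneider values.
aa <- c("A", "C", "D")
toy_dis <- matrix(c(0, 1, 2,
                    1, 0, 3,
                    2, 3, 0),
                  nrow = 3, byrow = TRUE, dimnames = list(aa, aa))

# tau for one distance d: sum of dissimilarities over all residue
# pairs (i, i + d). Hypothetical helper, not an ftrCOOL function.
tau <- function(seq, d, dis) {
  chars <- strsplit(seq, "")[[1]]
  n <- length(chars)
  if (d >= n) return(NA_real_)      # no pairs exist at this distance
  first  <- chars[1:(n - d)]        # residue at position i
  second <- chars[(1 + d):n]        # residue at position i + d
  sum(dis[cbind(first, second)])    # look up each pair by residue name
}

# One tau per distance d = 1, 2 for a single toy sequence.
taus <- sapply(1:2, function(d) tau("ACDA", d, toy_dis))
```

With two real matrices instead of one toy matrix, repeating this for d = 1, ..., nlag would give the nlag*2 values per sequence mentioned above.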

Usage

SOCNumber(seqs, nlag = 30, label = c())

Arguments

seqs

is a FASTA file containing amino acid sequences, in which each sequence entry starts with a '>' character. Alternatively, seqs can be a string vector in which each element is a peptide/protein sequence.

nlag

is a numeric value that sets the maximum distance between two amino acids. Distances can be 1, 2, ..., or nlag. Default is 30.

label

is an optional parameter. It is a vector whose length is equal to the number of sequences. It gives the class of each entry (i.e., sequence).

Value

It returns a feature matrix. The number of rows is equal to the number of sequences and the number of columns is (nlag*2). For each distance d, there are two values: one for the Grantham matrix and one for the Schneider matrix.

Note

When d=1, the two amino acids in a pair are adjacent (no gap between them). When d=2, there is one residue between the two amino acids in the sequence. The same pattern continues for larger values of d.
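To make the gap convention concrete, the following base R sketch lists the residue pairs that are compared at a given d. The helper pairs_at() and the sequence "MKVLA" are hypothetical and used only for illustration; they are not part of ftrCOOL.

```r
# List the residue pairs compared at distance d.
# Hypothetical helper for illustration, not an ftrCOOL function.
pairs_at <- function(seq, d) {
  chars <- strsplit(seq, "")[[1]]
  n <- length(chars)
  if (d >= n) return(character(0))            # sequence too short for this d
  paste0(chars[1:(n - d)], chars[(1 + d):n])  # residue i joined with residue i + d
}

p1 <- pairs_at("MKVLA", 1)  # adjacent residues: "MK" "KV" "VL" "LA"
p2 <- pairs_at("MKVLA", 2)  # one residue skipped: "MV" "KL" "VA"
```

A sequence of length n yields n - d pairs at distance d, so larger values of d contribute fewer terms to each sum.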

Examples

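# Path to the example FASTA file shipped with ftrCOOL
filePrs <- system.file("extdata/proteins.fasta", package = "ftrCOOL")
# Feature matrix with one row per sequence and 25 * 2 = 50 columns
mat <- SOCNumber(seqs = filePrs, nlag = 25)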

ftrCOOL documentation built on Nov. 30, 2021, 1:07 a.m.
