APR.Feature: Adjacent position relationship feature.

View source: R/APR.Feature.R

Description

This feature was proposed by Li et al. (2012) and is similar to the PN.FDTF encoding scheme (Huang et al., 2006). The encoding takes the correlation between adjacent nucleotides into account. For a nucleotide sequence of length n, every pair of consecutive positions, i.e., (1, 2), (2, 3), ..., (n-1, n), constitutes the APR feature set. For each pair of positions, the frequencies of the 16 dinucleotides are first computed separately for the positive and negative datasets. The difference matrix is then obtained by subtracting the 16*(n-1) dinucleotide frequency matrix of the positive set from that of the negative set. This difference matrix is used to encode nucleotide sequences: the dinucleotide observed at each position pair is replaced by the corresponding entry of the matrix, so a sequence with n nucleotides is encoded into a vector of (n-1) numeric values.
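The procedure described above can be sketched in base R as follows. This is only an illustration, not the EncDNA source: the helper names dinuc_freq and apr_sketch are hypothetical, sequences are plain character strings rather than DNAStringSet objects, and the sign convention (negative minus positive) and the use of relative frequencies are assumptions taken from the description.

```r
# The 16 possible dinucleotides over the alphabet A, C, G, T.
dinucs <- as.vector(outer(c("A", "C", "G", "T"), c("A", "C", "G", "T"), paste0))

# Relative frequency of each dinucleotide at each adjacent position pair.
# seqs: character vector of equal-length sequences (hypothetical helper).
dinuc_freq <- function(seqs) {
  n <- nchar(seqs[1])
  stopifnot(all(nchar(seqs) == n), n >= 2)
  freq <- matrix(0, nrow = 16, ncol = n - 1, dimnames = list(dinucs, NULL))
  for (j in seq_len(n - 1)) {
    pairs <- factor(substr(seqs, j, j + 1), levels = dinucs)
    freq[, j] <- as.vector(table(pairs)) / length(seqs)
  }
  freq
}

# Encode test sequences with the 16 x (n-1) difference matrix.
apr_sketch <- function(pos, neg, test) {
  # Assumed convention: negative-set frequencies minus positive-set frequencies.
  diff_mat <- dinuc_freq(neg) - dinuc_freq(pos)
  n <- nchar(test[1])
  stopifnot(all(nchar(test) == n), ncol(diff_mat) == n - 1)
  enc <- matrix(0, nrow = length(test), ncol = n - 1)
  for (j in seq_len(n - 1)) {
    # Look up the entry for the dinucleotide seen at positions (j, j + 1).
    enc[, j] <- diff_mat[substr(test, j, j + 1), j]
  }
  enc
}

# Toy data, made up for illustration only.
pos  <- c("ACGT", "ACGA", "ACTT")
neg  <- c("TTGA", "CAGT", "GGCA")
test <- c("ACGT", "TTGA")
enc  <- apr_sketch(pos, neg, test)
dim(enc)  # one row per test sequence, n - 1 columns
```

In this toy example every positive sequence starts with "AC" and no negative sequence does, so the first encoded value of a test sequence starting with "AC" is 0 - 1 = -1.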

Usage

APR.Feature(positive_class, negative_class, test_seq)

Arguments

positive_class

Nucleotide sequence dataset of positive class, must be an object of class DNAStringSet.

negative_class

Nucleotide sequence dataset of negative class, must be an object of class DNAStringSet.

test_seq

Nucleotide sequences to be encoded into numeric feature vectors, must be an object of class DNAStringSet.

Details

An object of class DNAStringSet can be obtained with the function readDNAStringSet, available in the Bioconductor package Biostrings. The sequences must be supplied in FASTA format. Both positive and negative datasets are required for this encoding scheme.
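As a minimal sketch of reading FASTA input, assuming the Biostrings package is installed, the snippet below writes two made-up sequences to a temporary file and reads them back with readDNAStringSet. The file and the sequences are purely illustrative.

```r
library(Biostrings)

# Write a tiny FASTA file to a temporary location (made-up sequences).
fasta_path <- tempfile(fileext = ".fa")
writeLines(c(">seq1", "ACGTACGT", ">seq2", "TTGACCGA"), fasta_path)

# Read it back as a DNAStringSet, the class expected by APR.Feature.
seqs <- readDNAStringSet(fasta_path, format = "fasta")

# Check the class and the sequence lengths; the position-wise
# encoding assumes all sequences have the same length.
is(seqs, "DNAStringSet")
width(seqs)
```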

Value

A numeric matrix of order m*(n-1), where m is the number of sequences in test_seq and n is the length of each sequence.

Author(s)

Prabina Kumar Meher, Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA

References

Li, J.L., Wang, L.F., Wang, H.Y., Bai, L.Y. and Yuan, Z.M. (2012). High-accuracy splice sites prediction based on sequence component and position features. Genetics and Molecular Research, 11(3): 3432-3451.

See Also

PN.Fdtf.Feature, WAM.Feature

Examples

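library(EncDNA)

# Load the example dataset droso shipped with the package
data(droso)
positive <- droso$positive
negative <- droso$negative
test <- droso$test

# Use the first 200 sequences of each class to build the difference matrix
pos <- positive[1:200]
neg <- negative[1:200]
tst <- test

# Encode the test sequences; each row of enc corresponds to one test sequence
enc <- APR.Feature(positive_class=pos, negative_class=neg, test_seq=tst)
enc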

EncDNA documentation built on May 28, 2019, 9 a.m.
