View source: R/MN.Fdtf.Feature.R
In this encoding procedure, the frequency of each nucleotide at each position is first computed separately for the positive and negative class datasets. The frequency matrix of the positive set is then subtracted from that of the negative set, and each sequence is encoded into a numeric vector by looking up its nucleotides, position by position, in this difference matrix. Both positive and negative datasets are therefore required to encode sequences. The concept was introduced by Huang et al. (2006) and was later used by Pashaei et al. (2016), alongside other features, for the prediction of splice sites. It is similar to the Bayes kernel encoding (Zhang et al., 2006), in which both frequency matrices are used for encoding instead of the difference matrix.
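The procedure described above can be sketched in base R as follows. This is an illustrative sketch, not the package's implementation: the helper names `position_freq` and `encode_with_diff` are hypothetical, frequencies are taken as proportions, the sign convention (negative minus positive) follows the wording of the description, and all sequences are assumed to be uppercase, of equal length, and made up only of A, C, G and T.

```r
nuc <- c("A", "C", "G", "T")

# Hypothetical helper: 4 x L matrix of per-position nucleotide proportions.
position_freq <- function(seqs) {
  chars <- do.call(rbind, strsplit(seqs, ""))   # one row per sequence
  freq <- matrix(0, nrow = 4, ncol = ncol(chars), dimnames = list(nuc, NULL))
  for (b in nuc) freq[b, ] <- colMeans(chars == b)
  freq
}

# Hypothetical helper: replace each nucleotide by its entry in the
# difference matrix at the same position.
encode_with_diff <- function(seqs, diff_mat) {
  chars <- do.call(rbind, strsplit(seqs, ""))
  out <- matrix(NA_real_, nrow = nrow(chars), ncol = ncol(chars))
  for (j in seq_len(ncol(chars))) out[, j] <- diff_mat[chars[, j], j]
  out
}

# Toy data (made up for illustration only).
positive <- c("ACGT", "ACGA", "ACTT")
negative <- c("TGCA", "AGCA", "TGCT")
test_seq <- c("ACGT", "TGCA")

# Assumed sign convention: negative-set frequencies minus positive-set ones.
diff_mat <- position_freq(negative) - position_freq(positive)
encoded  <- encode_with_diff(test_seq, diff_mat)
print(encoded)
```

Each row of `encoded` corresponds to one test sequence and each column to one position. Reversing the subtraction only flips the sign of every entry.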
MN.Fdtf.Feature(positive_class, negative_class, test_seq)
positive_class: Sequence dataset of the positive class; must be an object of class DNAStringSet.
negative_class: Sequence dataset of the negative class; must be an object of class DNAStringSet.
test_seq: Sequences to be encoded into numeric vectors; must be an object of class DNAStringSet.
To obtain an object of class DNAStringSet, read the sequence dataset in FASTA format with the function readDNAStringSet, available in the Biostrings package of Bioconductor (https://bioconductor.org/packages/release/bioc/html/Biostrings.html).
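Reading a FASTA file into a DNAStringSet might look like the sketch below. The file is a temporary one written only for illustration, and the sequence names and contents are made up. The Biostrings call is guarded so that the snippet still runs when the package is not installed.

```r
# Write a small, made-up FASTA file to a temporary location.
fasta_path <- tempfile(fileext = ".fa")
writeLines(c(">seq1", "ACGTAC", ">seq2", "TTGACA"), fasta_path)

# Read it as a DNAStringSet if Biostrings is available.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  seqs <- Biostrings::readDNAStringSet(fasta_path, format = "fasta")
  print(class(seqs))
  print(names(seqs))
  print(length(seqs))
}
```

In practice, `fasta_path` would point to the positive, negative, or test sequence file.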
A numeric matrix of order m x n, where m is the number of sequences in test_seq and n is the sequence length.
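A call to the function could look like the following sketch. It assumes that the function is provided by a package named EncDNA (the package name is not stated on this page) and that Biostrings is installed. The toy sequences are made up, and the snippet does nothing if either package is missing.

```r
# Made-up toy sequences of equal length (6 nucleotides each).
pos_chr <- c("ACGTAC", "ACGAAC", "ACTTAC")
neg_chr <- c("TGCATG", "AGCATG", "TGCTTG")
tst_chr <- c("ACGTAC", "TGCATG")

# Assumption: MN.Fdtf.Feature lives in a package called EncDNA.
if (requireNamespace("Biostrings", quietly = TRUE) &&
    requireNamespace("EncDNA", quietly = TRUE)) {
  pos <- Biostrings::DNAStringSet(pos_chr)
  neg <- Biostrings::DNAStringSet(neg_chr)
  tst <- Biostrings::DNAStringSet(tst_chr)
  enc <- EncDNA::MN.Fdtf.Feature(positive_class = pos,
                                 negative_class = neg,
                                 test_seq = tst)
  # Per the description above, one row per test sequence and
  # one column per position are expected.
  print(dim(enc))
}
```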
This feature does not take into consideration the dependencies among nucleotides in the sequence.
Prabina Kumar Meher, Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA
Zhang, Y., Chu, C., Chen, Y., Zha, H. and Ji, X. (2006). Splice site prediction using support vector machines with a Bayes kernel. Expert Systems with Applications, 30: 73-81.
Huang, J., Li, T., Chen, K. and Wu, J. (2006). An approach of encoding for prediction of splice sites using SVM. Biochimie, 88(7): 923-929.
Pashaei, E., Yilmaz, A., Ozen, M. and Aydin, N. (2016). Prediction of splice site using AdaBoost with a new sequence encoding approach. In Systems, Man, and Cybernetics (SMC), IEEE International Conference, pp 3853-3858.
WMM.Feature, Bayes.Feature, PN.Fdtf.Feature