View source: R/Predoss.Feature.R
This encoding considers not only the dependencies between adjacent nucleotides but also the associations that exist among non-adjacent nucleotides. In the MM1 and PN.FDTF features, only the dependencies between adjacent nucleotides are taken into account. All possible pair-wise dependencies were first introduced by Meher et al. (2014) for predicting splice sites through a probabilistic approach; the same authors later used this association to encode the splice site dataset for prediction using machine learning classifiers (Meher et al., 2016).
Predoss.Feature(positive_class, negative_class, test_seq)
positive_class |
Sequence dataset of the positive class, must be an object of class |
negative_class |
Sequence dataset of the negative class, must be an object of class |
test_seq |
Sequences to be encoded into numeric vectors, must be an object of class |
This encoding transforms nucleotide sequences into numeric feature vectors, which can subsequently be used as input to supervised learning models for classification.
A numeric matrix of order m*n^{2}, where m is the number of sequences in test_seq and n is the length of each sequence.
The dimension of the feature space increases quadratically (as n^{2}) with the length of the sequence.
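To make the m*n^{2} shape concrete, the following base R sketch builds a toy pair-wise encoding. It is an illustration only and not the computation performed by Predoss.Feature: the helper names pair_freq and toy_pairwise_encode, the plain character-vector inputs, and the scoring rule (frequency of the observed nucleotide pair in the positive class minus its frequency in the negative class) are all assumptions made for this example. It also assumes that all sequences have the same length.

```r
# Toy sketch, NOT the EncDNA implementation. All names below are hypothetical.

# Fraction of sequences that carry nucleotide `a` at position i
# and nucleotide `b` at position j.
pair_freq <- function(seqs, i, j, a, b) {
  mean(substr(seqs, i, i) == a & substr(seqs, j, j) == b)
}

# Encode each test sequence as one value per ordered position pair (i, j),
# including non-adjacent pairs, which gives n^2 features per sequence.
toy_pairwise_encode <- function(positive, negative, test) {
  n <- nchar(test[1])  # assumes equal-length sequences
  out <- matrix(0, nrow = length(test), ncol = n^2)
  for (s in seq_along(test)) {
    k <- 1
    for (i in seq_len(n)) {
      for (j in seq_len(n)) {
        a <- substr(test[s], i, i)
        b <- substr(test[s], j, j)
        # Assumed scoring rule: class-wise difference in pair frequency.
        out[s, k] <- pair_freq(positive, i, j, a, b) -
          pair_freq(negative, i, j, a, b)
        k <- k + 1
      }
    }
  }
  out
}

positive <- c("ACGT", "ACGA", "ACGT")
negative <- c("TTGA", "CAGT", "GTCA")
test_seq <- c("ACGT", "TTGA")

features <- toy_pairwise_encode(positive, negative, test_seq)
dim(features)  # m = 2 sequences, n^2 = 16 features
```

With sequences of length 4 the sketch produces 16 columns; doubling the length to 8 would give 64, which is why long sequences lead to a large feature space.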
Prabina Kumar Meher, Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA
Meher, P.K., Sahu, T.K., Rao, A.R. and Wahi, S.D. (2014). A statistical approach for 5' splice site prediction using short sequence motifs and without encoding sequence data. BMC Bioinformatics, 15(1), 362.
Meher, P.K., Sahu, T.K., Rao, A.R. and Wahi, S.D. (2016). A computational approach for prediction of donor splice sites with improved accuracy. Journal of Theoretical Biology, 404, 285-294.