View source: R/Sparse.Feature.R
In this encoding approach, A, T, G and C are encoded as (1,1,1), (1,0,0), (0,1,0) and (0,0,1), respectively, as introduced by Bari et al. (2014). Alternatively, each nucleotide can be encoded with four bits, i.e., A as (1,0,0,0), T as (0,1,0,0), G as (0,0,1,0) and C as (0,0,0,1), as in Meher et al. (2016).
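The two schemes described above can be sketched in base R as follows. This is only an illustration of the encoding, not the package's implementation: `sparse_encode` and its `scheme` argument are hypothetical names, and the sketch takes a plain character string rather than the sequence object that `Sparse.Feature` expects.

```r
# Hypothetical helper (not part of the package): sparse-encode one DNA string.
sparse_encode <- function(seq, scheme = c("four_bit", "three_bit")) {
  scheme <- match.arg(scheme)
  # Lookup table mapping each nucleotide to its bit pattern.
  codes <- if (scheme == "four_bit") {
    list(A = c(1, 0, 0, 0), T = c(0, 1, 0, 0),
         G = c(0, 0, 1, 0), C = c(0, 0, 0, 1))
  } else {
    list(A = c(1, 1, 1), T = c(1, 0, 0),
         G = c(0, 1, 0), C = c(0, 0, 1))
  }
  # Split the string into single characters.
  bases <- strsplit(toupper(seq), "")[[1]]
  if (!all(bases %in% names(codes))) {
    stop("sequence must contain only A, T, G and C")
  }
  # Concatenate the per-nucleotide codes into one 0/1 vector.
  unlist(codes[bases], use.names = FALSE)
}

sparse_encode("ATGC")                       # four bits per nucleotide, length 16
sparse_encode("AT", scheme = "three_bit")   # three bits per nucleotide, length 6
```

With the four-bit scheme, a sequence of n nucleotides yields a vector of length 4*n, so the feature dimension grows linearly with sequence length.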
Sparse.Feature(test_seq)
| test_seq | Sequence dataset to be encoded into a numeric vector of 0s and 1s; must be an object of class  | 
Each sequence is encoded independently, without requiring positive-class and negative-class datasets.
For each sequence of n nucleotides in test_seq, a vector of length 4*n.
For longer sequences, a high-dimensional feature vector is generated.
Prabina Kumar Meher, Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA
Bari, A.T.M.G., Reaz, M.R. and Jeong, B.S. (2014). Effective DNA encoding for splice site prediction using SVM. MATCH Commun. Math. Comput. Chem., 71: 241-258.
Meher, P.K., Sahu, T.K., Rao, A.R. and Wahi, S.D. (2016). A computational approach for prediction of donor splice sites with improved accuracy. Journal of Theoretical Biology, 404: 285-294.
# Load the example dataset
data(droso)
# Extract the test sequences
test <- droso$test
tst <- test
# Encode each sequence as a sparse 0/1 feature vector
enc <- Sparse.Feature(test_seq=tst)
enc