MM1.Feature: Transforming nucleotide sequences into numeric vectors using...
In EncDNA: Encoding of Nucleotide Sequences into Numeric Feature Vectors

Description Usage Arguments Details Value Author(s) References See Also Examples

Sequence encoding with a first-order Markov model was introduced by Rajapakse and Ho (2005) for prediction of splice sites, and was later used extensively by Baten et al. (2006) for the same purpose. The encoding accounts for first-order dependencies between adjacent nucleotides in a sequence. Only the positive class dataset is used to estimate these dependencies as probabilities, which are then used to encode the sequences.
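As a rough sketch of what such an encoding can look like, assume position-specific transition probabilities estimated from the positive class by plain relative frequencies. The symbols N_i, \hat{p}_i and x_i are introduced here for illustration only, and the exact estimator used inside the package (for example, whether pseudocounts are added) is not stated in this documentation.

```latex
\hat{p}_i(b \mid a) = \frac{N_i(a, b)}{\sum_{b'} N_i(a, b')}, \qquad i = 2, \dots, n,
\qquad
(x_1, \dots, x_n) \;\longmapsto\; \bigl(\hat{p}_2(x_2 \mid x_1),\; \dots,\; \hat{p}_n(x_n \mid x_{n-1})\bigr)
```

Here N_i(a, b) is the number of positive-class sequences with nucleotide a at position i-1 and nucleotide b at position i. Under this assumption a sequence of length n yields n-1 values, one per adjacent pair of positions.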

MM1.Feature(positive_class, test_seq)

`positive_class`	Sequence dataset of the positive class, must be an object of class `DNAStringSet`.
`test_seq`	Sequences to be encoded into numeric vectors, must be an object of class `DNAStringSet`.

The FASTA sequences should be read into R using the function readDNAStringSet from the Biostrings package. With respect to the dependency among nucleotides in a sequence, this encoding is similar to the PN.FDTF feature; the only difference is that it uses the positive class alone instead of both the positive and negative classes. It is also similar to the WAM features (Meher et al. 2016), in which dinucleotide dependencies are considered.

A numeric matrix of order m*(n-1), where m is the number of sequences in test_seq and n is the length of each sequence.
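To make the m*(n-1) shape concrete, here is a minimal base-R sketch of a first-order encoding. It is not the package's implementation: mm1_sketch is a hypothetical helper, it takes plain character vectors rather than DNAStringSet objects, it assumes equal-length sequences over the uppercase alphabet A, C, G, T, and it uses unsmoothed relative frequencies estimated separately at each position.

```r
# Hypothetical helper, not part of EncDNA: encode equal-length sequences with
# position-specific first-order transition probabilities from a positive set.
mm1_sketch <- function(positive, test) {
  bases <- c("A", "C", "G", "T")
  pos <- do.call(rbind, strsplit(positive, ""))  # one row per positive sequence
  tst <- do.call(rbind, strsplit(test, ""))      # one row per test sequence
  n <- ncol(pos)
  enc <- matrix(0, nrow = nrow(tst), ncol = n - 1)
  for (i in 2:n) {
    # Counts of (previous base, current base) pairs at position i
    counts <- unclass(table(factor(pos[, i - 1], levels = bases),
                            factor(pos[, i], levels = bases)))
    # Row-normalise to conditional probabilities; unseen previous bases give 0
    probs <- counts / pmax(rowSums(counts), 1)
    # Look up P(current | previous) for every test sequence at this position
    enc[, i - 1] <- probs[cbind(tst[, i - 1], tst[, i])]
  }
  enc
}

# Toy data (made up for illustration): 3 positive and 2 test sequences, n = 4
positive <- c("ACGT", "ACGA", "AGGT")
test <- c("ACGT", "AGGA")
enc <- mm1_sketch(positive, test)
dim(enc)  # 2 rows (test sequences), 3 columns (n - 1)
```

Each row corresponds to one test sequence and each column to one adjacent pair of positions, which is why a sequence of length n produces n-1 features.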

Prabina Kumar Meher, Indian Agricultural Statistics Research Institute, New Delhi-110012, INDIA

Rajapakse, J. and Ho, L.S. (2005). Markov encoding for detecting signals in genomic sequences. IEEE/ACM Trans Comput Biol Bioinf., 2(2): 131-142.
Baten, A., Chang, B., Halgamuge, S. and Li, J. (2006). Splice site identification using probabilistic parameters and SVM classification. BMC Bioinformatics, 7(Suppl 5): S15.
Meher, P.K., Sahu, T.K., Rao, A.R. and Wahi, S.D. (2016). Identification of donor splice sites using support vector machine: a computational approach based on positional, compositional and dependency features. Algorithms for Molecular Biology, 11(1): 16.

PN.Fdtf.Feature, WAM.Feature

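library(EncDNA)
# Load the example dataset shipped with the package
data(droso)
positive <- droso$positive
test <- droso$test
# Use the first 200 positive sequences to estimate the dependencies
pos <- positive[1:200]
tst <- test
# Encode the test sequences into a numeric matrix
enc <- MM1.Feature(positive_class=pos, test_seq=tst)
enc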

EncDNA documentation built on May 28, 2019, 9 a.m.
