Mismatch_DNA: Mismatch_DNA (Mismatch_DNA)


View source: R/Mismatch_DNA.R

Description

This function calculates the frequencies of all k-mers in the sequence, but allows a maximum of m mismatches per k-mer, where m < k.
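As a rough illustration of the idea, the sketch below counts k-mers with up to m mismatches in base R. The helper name mismatch_counts is hypothetical and is not part of ftrCOOL. The sketch assumes that each window of length k adds one to every candidate k-mer within Hamming distance m. Whether Mismatch_DNA normalizes the counts or orders its columns this way is not stated on this page.

```r
# Hypothetical helper (not part of ftrCOOL): count k-mers allowing up to m mismatches.
mismatch_counts <- function(seq, k = 3, m = 2) {
  stopifnot(m < k)
  alphabet <- c("A", "C", "G", "T")
  # All 4^k candidate k-mers: one row per k-mer, one column per position
  grid <- as.matrix(expand.grid(rep(list(alphabet), k), stringsAsFactors = FALSE))
  grid <- grid[, k:1, drop = FALSE]  # reverse columns so rows are in lexicographic order
  kmers <- apply(grid, 1, paste, collapse = "")
  counts <- setNames(numeric(nrow(grid)), kmers)

  chars <- strsplit(toupper(seq), "")[[1]]
  if (length(chars) < k) return(counts)

  for (i in seq_len(length(chars) - k + 1)) {
    window <- chars[i:(i + k - 1)]
    # Hamming distance from this window to every candidate k-mer
    dist <- rowSums(grid != matrix(window, nrow(grid), k, byrow = TRUE))
    # Every candidate within m mismatches of the window is counted once
    counts <- counts + (dist <= m)
  }
  counts
}

res <- mismatch_counts("ACGT", k = 2, m = 1)
res[c("AC", "AG")]
```

With m = 0 the same helper reduces to plain exact k-mer counting, which makes the effect of the mismatch allowance easy to compare.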

Usage

Mismatch_DNA(seqs, k = 3, m = 2, label = c())

Arguments

seqs

A FASTA file containing nucleotide sequences, in which each sequence entry starts with '>'. Alternatively, seqs can be a string vector in which each element is a nucleotide sequence.

k

A number giving the length k of the k-mers (default 3).

m

The maximum number of mismatches allowed (default 2). It must be less than k.

label

An optional parameter. It is a vector whose length equals the number of sequences, giving the class of each entry (i.e., sequence).

Value

This function returns a feature matrix. The number of rows is equal to the number of sequences, and the number of columns is 4^k, one for each possible k-mer of length k.
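The arithmetic behind the matrix width can be checked with a short base R sketch. Both helper names are hypothetical and not part of ftrCOOL. The first gives the 4^k column count for a single k. The second gives how many k-mers lie within m mismatches of a given k-mer, assuming mismatches are measured by Hamming distance.

```r
# Hypothetical helper: number of feature columns for k-mers over A, C, G, T
n_columns <- function(k) 4^k

# Hypothetical helper: number of k-mers within Hamming distance <= m of a given k-mer
# (choose j positions to change, with 3 alternative bases at each)
n_neighbours <- function(k, m) sum(choose(k, 0:m) * 3^(0:m))

n_columns(3)        # 64
n_neighbours(3, 2)  # 37
```

With the defaults k = 3 and m = 2, each window in a sequence would therefore contribute to 37 of the 64 columns under this assumption.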

References

Liu, B., Gao, X. and Zhang, H. BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches. Nucleic Acids Res (2019).

Examples

library(ftrCOOL)
# Path to the example lncRNA FASTA file shipped with ftrCOOL
fileLNC <- system.file("extdata/Athaliana_LNCRNA.fa", package = "ftrCOOL")
mat <- Mismatch_DNA(seqs = fileLNC)

ftrCOOL documentation built on Nov. 30, 2021, 1:07 a.m.