kmer: The Basic Kmer Descriptor


Description

The Basic Kmer Descriptor

Usage

kmer(x, k = 2, upto = FALSE, normalize = FALSE, reverse = FALSE)

Arguments

x

the input data, which should be a list or file type.

k

the length k of the kmers; it should be an integer larger than 0.

upto

generate all the kmers: 1mer, 2mer, ..., kmer. The output feature vector is the combination of all these kmers. The default value of this parameter is FALSE.

normalize

with this option, the final feature vector is normalized by the total number of occurrences of all kmers, so the elements of the feature vector represent kmer frequencies. The default value of this parameter is FALSE.

reverse

merge each kmer and its reverse complement into a single feature. The default value of this parameter is FALSE. If reverse is TRUE, this method returns the reverse complement kmer feature vector.
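To illustrate what merging reverse complements means, here is a minimal base R sketch. It is not the rDNAse implementation: the helpers rev_comp and collapse_rc are hypothetical, and the choice to label each merged feature with the alphabetically smaller of the two kmers is an assumption. The input counts are the 2mer counts shown under Example output.

```r
# Hypothetical helper, not part of rDNAse: reverse complement of each kmer.
rev_comp <- function(kmers) {
  # chartr swaps every base for its complement; each string is then reversed.
  comp <- chartr("ACGT", "TGCA", kmers)
  vapply(strsplit(comp, ""),
         function(ch) paste(rev(ch), collapse = ""),
         character(1))
}

# Hypothetical helper: sum the counts of each kmer and its reverse complement.
collapse_rc <- function(counts) {
  kmers <- names(counts)
  rc <- rev_comp(kmers)
  # Assumption: the merged feature is named after the alphabetically smaller kmer.
  canon <- ifelse(kmers < rc, kmers, rc)
  merged <- tapply(counts, canon, sum)
  setNames(as.vector(merged), names(merged))
}

# 2mer counts taken from the example output of kmer(x).
counts <- c(AA = 1, AC = 3, AG = 0, AT = 3, CA = 2, CC = 0, CG = 0, CT = 4,
            GA = 2, GC = 2, GG = 1, GT = 1, TA = 2, TC = 2, TG = 4, TT = 7)
merged <- collapse_rc(counts)
merged
```

Under this assumption the 16 2mer features collapse to 10, because AT, CG, GC and TA are their own reverse complements and the other 12 kmers pair up; the total count of 34 is unchanged.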

Details

This function calculates the basic kmer descriptor: the number of occurrences of each possible kmer (subsequence of length k) in the input sequence.
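The counting can be sketched in base R with a sliding window of width k. This is an illustration only and not the code used by rDNAse; the helper name count_kmers, the fixed alphabet A, C, G, T, and the use of overlapping windows are assumptions.

```r
# Hypothetical helper, not part of rDNAse: count overlapping kmers in one DNA string.
count_kmers <- function(s, k = 2) {
  stopifnot(nchar(s) >= k)
  alphabet <- c("A", "C", "G", "T")
  # Enumerate all 4^k possible kmers and put them in alphabetical order.
  grid <- expand.grid(rep(list(alphabet), k), stringsAsFactors = FALSE)
  all_kmers <- sort(do.call(paste0, grid))
  # Extract every overlapping window of width k.
  starts <- seq_len(nchar(s) - k + 1)
  observed <- substring(s, starts, starts + k - 1)
  # Tabulate against the full kmer set so that absent kmers get a zero.
  counts <- table(factor(observed, levels = all_kmers))
  setNames(as.integer(counts), all_kmers)
}

x <- "GACTGAACTGCACTTTGGTTTCATATTATTTGCTC"
counts <- count_kmers(x, k = 2)
counts
```

For the example sequence and k = 2, this sketch yields the same 16 counts as those listed under Example output.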

Value

A named vector with one element per kmer, holding its count (or its frequency when normalize = TRUE).

Note

If the parameters normalize and upto are both TRUE, the feature vector is the combination of the individually normalized kmer vectors, e.g. the combination of the normalized 1mer and normalized 2mer vectors when k = 2, normalize = TRUE, upto = TRUE.
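A minimal base R sketch of this combination follows, assuming that each block of kmers is divided by its own total before the blocks are concatenated. The helper normalize_blocks is hypothetical and not part of rDNAse. The 2mer counts are those in the example output; the 1mer counts were tallied by hand from the same example sequence.

```r
# Hypothetical helper, not part of rDNAse: normalize each kmer block by its own
# total, then concatenate the blocks into a single feature vector.
normalize_blocks <- function(count_blocks) {
  unlist(lapply(count_blocks, function(counts) counts / sum(counts)))
}

# 1mer counts tallied by hand from 'GACTGAACTGCACTTTGGTTTCATATTATTTGCTC'.
one_mers <- c(A = 7, C = 7, G = 6, T = 15)
# 2mer counts taken from the example output of kmer(x).
two_mers <- c(AA = 1, AC = 3, AG = 0, AT = 3, CA = 2, CC = 0, CG = 0, CT = 4,
              GA = 2, GC = 2, GG = 1, GT = 1, TA = 2, TC = 2, TG = 4, TT = 7)

features <- normalize_blocks(list(one_mers, two_mers))
features
```

Under this assumption each block sums to 1, so the combined vector of 4 + 16 = 20 elements sums to 2 rather than 1.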

Author(s)

Min-feng Zhu <wind2zhu@163.com>

References

Noble W S, Kuehn S, Thurman R, et al. Predicting the in vivo signature of human gene regulatory sequences. Bioinformatics, 2005, 21 Suppl 1, i338-i343.

Lee D, Karchin R, Beer M A. Discriminative prediction of mammalian enhancers from DNA sequence. Genome Research, 2011, 21, 2167-2180.

See Also

See make_kmer_index

Examples

x = 'GACTGAACTGCACTTTGGTTTCATATTATTTGCTC'
kmer(x)

Example output

AA AC AG AT CA CC CG CT GA GC GG GT TA TC TG TT 
 1  3  0  3  2  0  0  4  2  2  1  1  2  2  4  7 

rDNAse documentation built on May 2, 2019, 4:16 a.m.