kmers | R Documentation |
Generates genome kmers
kmers(
x,
k = 3L,
simplify = FALSE,
canonical = TRUE,
squeeze = FALSE,
anchor = TRUE,
clean_up = TRUE,
key_as_int = FALSE,
starting_index = 1L
)
x |
genome in string format |
k |
kmer length |
simplify |
returns a numeric vector of kmer counts, without the associated kmer strings. This is useful to save memory, but should always be used with anchor = TRUE, so that each position in the vector corresponds to the same kmer across genomes. |
canonical |
only record canonical kmers (i.e., the lexicographically smaller of a kmer and its reverse complement) |
squeeze |
remove non-canonical kmers |
anchor |
includes unobserved kmers (with counts of 0). This is useful when generating a dense matrix where kmers of different genomes align. |
clean_up |
only count kmers made up of valid bases (A, C, T, G); kmers containing other characters, such as N, are excluded |
key_as_int |
return kmer index (as "kmer_index") rather than the full kmer string. Useful for index-coded data structures such as libsvm. |
starting_index |
the starting index, only used if key_as_int = TRUE. |
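To make the key_as_int and starting_index arguments more concrete, the base R sketch below shows one possible way to turn a kmer string into an integer index and then into libsvm-style "index:value" pairs. The helper name kmer_to_index and the base-4 encoding with the order A, C, G, T are assumptions made for illustration; the indexing scheme actually used by kmers() is not specified on this page and may differ.

```r
# Illustrative sketch only; not the package's implementation.
# `kmer_to_index` is a hypothetical helper name.

# Map a kmer to an integer by reading it as a base-4 number
# (assumed order: A = 0, C = 1, G = 2, T = 3), then shift by starting_index.
kmer_to_index <- function(kmer, starting_index = 1L) {
  digits <- match(strsplit(kmer, "")[[1]], c("A", "C", "G", "T")) - 1L
  # Characters outside A, C, G, T have no index in this scheme.
  if (anyNA(digits)) return(NA_integer_)
  k <- length(digits)
  as.integer(sum(digits * 4^((k - 1L):0L)) + starting_index)
}

# Made-up counts for two kmers, formatted as libsvm-style sparse features.
counts <- c(ACG = 2L, TTT = 1L)
idx <- vapply(names(counts), kmer_to_index, integer(1), USE.NAMES = FALSE)
libsvm_features <- paste(paste0(idx, ":", counts), collapse = " ")
```

Under these assumptions, a starting_index of 1 suits formats whose feature indices begin at 1, while 0 suits zero-based data structures.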
list of kmer values, either as a list of a single vector (if simplify = TRUE), or as a named list containing "kmer_string" and "kmer_value".
kmers("ATCGCAGT")
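The effect of canonical, anchor and clean_up can also be illustrated without the package. The following base R sketch counts canonical kmers for the same example string; rev_comp and count_canonical_kmers are hypothetical helper names, and the code is a rough approximation of the behaviour described in the arguments above, not the function's actual implementation.

```r
# Illustrative sketch only; not the package's implementation.
# `rev_comp` and `count_canonical_kmers` are hypothetical helper names.

# Reverse complement of a DNA string (A <-> T, C <-> G).
rev_comp <- function(s) {
  paste(rev(strsplit(chartr("ACGT", "TGCA", s), "")[[1]]), collapse = "")
}

count_canonical_kmers <- function(x, k = 3L, anchor = TRUE) {
  n <- nchar(x)
  # All overlapping windows of length k.
  windows <- if (n >= k) substring(x, 1:(n - k + 1L), k:n) else character(0)
  # Keep only windows made of A, C, G, T (similar in spirit to clean_up = TRUE).
  windows <- windows[grepl("^[ACGT]+$", windows)]
  # Canonical form: the lexicographically smaller of a kmer and its reverse complement.
  rc <- vapply(windows, rev_comp, character(1), USE.NAMES = FALSE)
  canon <- windows
  swap <- rc < windows
  canon[swap] <- rc[swap]
  if (anchor) {
    # Enumerate every possible kmer in lexicographic order, keep the canonical
    # ones, and use them as factor levels so unobserved kmers get a count of 0.
    grid <- expand.grid(rep(list(c("A", "C", "G", "T")), k), stringsAsFactors = FALSE)
    all_kmers <- do.call(paste0, rev(grid))
    rc_all <- vapply(all_kmers, rev_comp, character(1), USE.NAMES = FALSE)
    canon <- factor(canon, levels = all_kmers[all_kmers <= rc_all])
  }
  tab <- table(canon)
  setNames(as.integer(tab), names(tab))
}

dense <- count_canonical_kmers("ATCGCAGT", k = 3L, anchor = TRUE)
sparse <- count_canonical_kmers("ATCGCAGT", k = 3L, anchor = FALSE)
```

In this sketch, sparse holds only the six canonical 3-mers observed in the string, while dense has one entry for each of the 32 canonical 3-mers, so vectors from different genomes line up position by position.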