genome_to_libsvm | R Documentation |
This function converts a single genome to a libsvm file containing kmer counts. The libsvm format will be as follows:
label 1:count 2:count 3:count ...
The label is optional and defaults to 0. Each count is keyed by its kmer index, which is the position of the kmer in lexicographically sorted order. Libsvm is a sparse format, so features with a count of zero can be left out.
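To make the indexing concrete, here is a minimal base R sketch of how a line in this format could be built. It is not the package's implementation: the helper names (`kmer_index`, `to_libsvm_line`) are hypothetical, the alphabet order A < C < G < T and the 1-based index are assumptions, and canonical-kmer handling is ignored, so the output may differ from what genome_to_libsvm() writes.

```r
# Hypothetical helper: 1-based lexicographic index of a kmer,
# assuming the alphabet order A < C < G < T.
kmer_index <- function(kmer) {
  digits <- match(strsplit(kmer, "")[[1]], c("A", "C", "G", "T")) - 1L
  # Treat the kmer as a base-4 number, most significant letter first.
  # as.integer() keeps the index printable without scientific notation
  # (safe for k up to 15).
  as.integer(sum(digits * 4^rev(seq_along(digits) - 1L)) + 1)
}

# Hypothetical helper: build one libsvm line from a genome string.
to_libsvm_line <- function(genome, k = 3L, label = "0") {
  stopifnot(nchar(genome) >= k)
  starts <- seq_len(nchar(genome) - k + 1L)
  kmers <- substring(genome, starts, starts + k - 1L)
  idx <- vapply(kmers, kmer_index, integer(1))
  counts <- table(idx)  # sorted by index; only observed kmers appear
  features <- paste0(names(counts), ":", as.integer(counts))
  paste(c(label, features), collapse = " ")
}

line <- to_libsvm_line("ATCGCAGT")
print(line)
```

Because only observed kmers appear in the table, the line is sparse by construction.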
genome_to_libsvm(
x,
target_path,
label = as.character(c("0")),
k = 3L,
canonical = TRUE,
squeeze = FALSE
)
x | genome in string format |
target_path | path to store libsvm file (.txt) |
label | libsvm label |
k | kmer length |
canonical | only record canonical kmers (i.e., the lexicographically smaller of a kmer and its reverse complement) |
squeeze | remove non-canonical kmers |
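The definition of a canonical kmer given for the canonical argument can be illustrated with a short base R sketch. The helper names below are hypothetical and are not part of the package; the sketch assumes uppercase input restricted to A, C, G, and T.

```r
# Hypothetical helper: reverse complement of a DNA string (uppercase ACGT only).
reverse_complement <- function(kmer) {
  complemented <- chartr("ACGT", "TGCA", kmer)
  paste(rev(strsplit(complemented, "")[[1]]), collapse = "")
}

# Hypothetical helper: the lexicographically smaller of a kmer and its
# reverse complement.
canonical_kmer <- function(kmer) {
  rc <- reverse_complement(kmer)
  if (rc < kmer) rc else kmer
}

canonical_kmer("TCG")  # reverse complement is "CGA", which sorts first
canonical_kmer("CAG")  # reverse complement is "CTG", so "CAG" is kept
```

Under this definition, a kmer and its reverse complement always map to the same canonical form, so counts from both strands are merged.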
boolean indicating success
To process multiple genomes in a directory in parallel, see genomes_to_kmer_libsvm().
For more details on libsvm format, see https://xgboost.readthedocs.io/en/stable/tutorials/input_format.html
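# Write kmer counts for a short example genome to a temporary file
temp_libsvm_path <- tempfile(fileext = ".txt")
genome_to_libsvm("ATCGCAGT", temp_libsvm_path)
# Inspect the contents of the resulting libsvm file
readLines(temp_libsvm_path)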
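If the lines returned by readLines() need to be inspected programmatically, a libsvm line can be split back into its label and index:count pairs. The following is a minimal sketch in base R; `parse_libsvm_line` is a hypothetical helper, not a package function, and the sample line is made up for illustration rather than taken from actual output.

```r
# Hypothetical helper: split one libsvm line into label, indices, and values.
parse_libsvm_line <- function(line) {
  fields <- strsplit(trimws(line), "[[:space:]]+")[[1]]
  pairs <- strsplit(fields[-1], ":", fixed = TRUE)
  list(
    label = fields[1],
    index = as.integer(vapply(pairs, `[`, character(1), 1)),
    value = as.numeric(vapply(pairs, `[`, character(1), 2))
  )
}

# Illustrative input only; not claimed to be real genome_to_libsvm() output.
parsed <- parse_libsvm_line("0 12:1 14:1 19:1")
str(parsed)
```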