set_diamond | R Documentation |
diamond makedb
This function reads a file storing a specific sequence type, such as "cds", "protein", or
"dna", in a standard sequence file format such as "fasta" and, depending on the makedb
parameter, either creates a DIAMOND2 database or returns the corresponding protein sequences
as a data.table object for further DIAMOND2 searches.
set_diamond(
file,
seq_type = "cds",
format = "fasta",
makedb = FALSE,
delete_corrupt_cds = TRUE,
path = NULL,
makedb_type = "protein",
comp_cores = 1,
quiet = TRUE,
...
)
file |
a character string specifying the path to the file storing the sequences of interest. |
seq_type |
a character string specifying the sequence type stored in the input file.
Options are: "cds", "protein", or "dna". In case of "cds", sequences are translated to protein sequences;
in case of "dna", CDS prediction is performed on the corresponding sequences, which are subsequently
translated to protein sequences. Default is seq_type = "cds". |
format |
a character string specifying the file format used to store the genome, e.g. "fasta", "gbk". |
makedb |
a logical value (TRUE or FALSE) indicating whether a database should be created. Default is makedb = FALSE. |
delete_corrupt_cds |
a logical value indicating whether sequences with corrupt base triplets should be removed from the input file. |
path |
a character string specifying the path to the DIAMOND2 program (in case you don't use the default path). |
makedb_type |
a character string specifying the sequence type stored in the DIAMOND2 database
that is generated using 'diamond makedb'. Currently, the only option is "protein". Default is makedb_type = "protein". |
comp_cores |
a numeric value specifying the number of cores to be used for multicore 'diamond makedb' computations. |
quiet |
a logical value indicating whether |
... |
additional arguments that are used by the seqinr::read.fasta() function. |
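To make the delete_corrupt_cds argument more concrete, here is a minimal base R sketch of such a filter. It assumes that a coding sequence counts as "corrupt" when its length is not a multiple of 3, which is an interpretation and not necessarily the exact criterion set_diamond() applies. The sequences and the names cds, is_corrupt, and clean_cds are hypothetical and exist only for illustration.

```r
# Hypothetical toy input: named coding sequences (not taken from the package data).
cds <- c(
  gene_ok      = "ATGGCTTGA",  # 9 bases: three complete triplets
  gene_corrupt = "ATGGCTTG"    # 8 bases: the last triplet is incomplete
)

# Assumption: a CDS is "corrupt" when its length is not a multiple of 3.
is_corrupt <- nchar(cds) %% 3 != 0

# Mimic delete_corrupt_cds = TRUE by dropping the flagged sequences.
clean_cds <- cds[!is_corrupt]

print(is_corrupt)
print(names(clean_cds))
```

With delete_corrupt_cds = FALSE the flagged sequences would stay in the input, so this kind of check can be useful for inspecting a file beforehand.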
A list storing two elements. The first element [[1]] is the data.table storing the gene ids in the first column, the corresponding DNA (CDS) sequence in the second column, and the amino acid sequence in the third column. The second element [[2]] stores the name of the protein database that was created by 'diamond makedb'.
Jaruwatana Sodai Lotharukpong
Buchfink, B., Reuter, K., & Drost, H. G. (2021) "Sensitive protein alignments at tree-of-life scale using DIAMOND." Nature Methods, 18(4), 366-368.
https://github.com/bbuchfink/diamond/wiki/3.-Command-line-options
diamond_best, diamond_rec, diamond, set_blast
## Not run:
# running the set function to see an example output
head(set_diamond(file = system.file('seqs/ortho_thal_cds.fasta', package = 'orthologr'))[[1]], 2)
## End(Not run)
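A further sketch, using only the arguments listed in the usage above, shows how a database might be built with makedb = TRUE and how both elements of the returned list could be inspected. It assumes that the orthologr package is installed and that a DIAMOND2 binary named diamond is found on the system PATH (the case path = NULL is meant to cover). The guard variable can_run and the names cds_file and db are introduced here for illustration only.

```r
# Run only when both the orthologr package and a 'diamond' binary on the PATH
# are available (assumption: path = NULL relies on that default location).
can_run <- requireNamespace("orthologr", quietly = TRUE) &&
  nzchar(Sys.which("diamond"))

if (can_run) {
  cds_file <- system.file("seqs/ortho_thal_cds.fasta", package = "orthologr")

  # Translate the CDS file and build a protein database using two cores.
  db <- orthologr::set_diamond(
    file       = cds_file,
    seq_type   = "cds",
    format     = "fasta",
    makedb     = TRUE,
    comp_cores = 2
  )

  print(head(db[[1]], 2))  # data.table with gene ids, CDS, and amino acid sequences
  print(db[[2]])           # name of the database created by 'diamond makedb'
}
```

When either requirement is missing, the block does nothing, so it can be pasted into a script without causing an error.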