prot_vec | R Documentation |
Amino acids are mapped to vectors of real numbers using the word2vec model. Conceptually, this is a mathematical embedding from a space with many dimensions per amino acid to a continuous vector space of much lower dimension.
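The dimension reduction described above can be illustrated with a minimal base R sketch. It does not call this package: the 20-letter alphabet, the random `toy_embedding` matrix and the short sequence are hypothetical stand-ins for an embedding that word2vec would learn from real data.

```r
# Hypothetical toy setup: the 20 standard amino acids and a random
# embedding matrix (a real one would be learned by word2vec).
amino_acids <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]
embedding_dim <- 8
set.seed(1)
toy_embedding <- matrix(rnorm(length(amino_acids) * embedding_dim),
                        nrow = length(amino_acids),
                        dimnames = list(amino_acids, NULL))

# Split an illustrative sequence into single residues.
residues <- strsplit("MTAIIKEIVS", "")[[1]]

# High-dimensional representation: one indicator column per amino acid.
one_hot <- diag(length(amino_acids))[match(residues, amino_acids), , drop = FALSE]

# Low-dimensional representation: look up one embedding row per residue.
embedded <- toy_embedding[residues, , drop = FALSE]

dim(one_hot)   # 10 rows, 20 columns
dim(embedded)  # 10 rows, 8 columns
```

In this sketch the row lookup is equivalent to multiplying the one-hot matrix by the embedding matrix, which is why the embedding can be read as a linear map into the lower-dimensional space.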
prot2vec(prot_seq, embedding_dim, embedding_matrix = NULL, ...)
vec2prot(prot_vec, embedding_matrix)
prot_seq |
protein sequences |
prot_vec |
protein embedding vectors |
embedding_dim |
dimension of embedding vectors |
embedding_matrix |
embedding matrix (default: NULL) |
... |
additional arguments passed to "word2vec::word2vec", except dim, min_count and split |
prot_seq |
protein sequences |
prot_vec |
protein embedding vectors |
embedding_matrix |
embedding matrix |
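As a rough illustration of how sequences, embedding vectors and an embedding matrix relate, the base R sketch below encodes a sequence by row lookup and decodes it again by picking the most similar amino acid vector at each position. This is an assumption about one plausible decoding strategy, not a description of how vec2prot is implemented; `encode`, `decode` and `toy_embedding` are hypothetical names introduced only for this example.

```r
# Hypothetical random embedding matrix with one row per amino acid.
amino_acids <- strsplit("ACDEFGHIKLMNPQRSTVWY", "")[[1]]
set.seed(1)
toy_embedding <- matrix(rnorm(20 * 8), nrow = 20,
                        dimnames = list(amino_acids, NULL))

# Sequence -> matrix of per-residue vectors (one row per position).
encode <- function(seq_string, emb) {
  emb[strsplit(seq_string, "")[[1]], , drop = FALSE]
}

# Matrix of vectors -> sequence, using cosine similarity to find the
# closest amino acid vector for every position.
decode <- function(vectors, emb) {
  unit <- function(m) m / sqrt(rowSums(m^2))  # scale each row to length 1
  sim <- unit(vectors) %*% t(unit(emb))       # positions x amino acids
  paste(rownames(emb)[max.col(sim, ties.method = "first")], collapse = "")
}

vecs <- encode("MTAIIKEIVS", toy_embedding)
decoded <- decode(vecs, toy_embedding)
decoded
```

Because each encoded row is an exact copy of a row of `toy_embedding`, the round trip recovers the input sequence here. With vectors produced by a generative model the nearest amino acid would only be an approximation.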
Dongmin Jung
Mikolov, T., Sutskever, I., Chen, K., Corrado, G., & Dean, J. (2013). Distributed Representations of Words and Phrases and their Compositionality. arXiv:1310.4546.
Chang, M. (2020). Artificial intelligence for drug development, precision medicine, and healthcare.
word2vec::word2vec, word2vec::word2vec_similarity
prot_seq <- example_PTEN[1:10]
prot2vec_result <- prot2vec(prot_seq = prot_seq, embedding_dim = 8)
vec2prot_result <- vec2prot(prot_vec = prot2vec_result$prot_vec,
                            embedding_matrix = prot2vec_result$embedding_matrix)