View source: R/1_3_textEmbedReduce.R
textEmbedReduce | R Documentation |
Pre-trained dimension reduction (experimental)
textEmbedReduce(
  embeddings,
  n_dim = NULL,
  scalar = "fb20/scalar.csv",
  pca = "fb20/rpca_roberta_768_D_20.csv"
)
embeddings |
(list) Embedding(s), including tokens, texts and/or word_types. |
n_dim |
(numeric) Number of dimensions to reduce to. |
scalar |
(string or matrix) Name of or URL to the scalar used for standardizing the embeddings. If a URL, the function first checks whether it has been downloaded before. The string should point to a csv file containing the scalar matrix. For more information see the reference below. |
pca |
(string or matrix) Name of or URL to the pca weights. If a URL, the function first checks whether it has been downloaded before. The string should point to a csv file containing a matrix with the pca weights for matrix multiplication. For more information see the reference below. |
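The scalar and pca arguments describe a two-step operation: standardize the embeddings, then multiply them by a weight matrix. The base R sketch below illustrates that general idea only. It is not the package's implementation: the function name reduce_sketch, the use of per-dimension means and standard deviations as the "scalar", the toy data, and the choice to keep the first n_dim columns of the weight matrix are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the text package): standardize an
# embedding matrix and project it onto pre-trained weights.
reduce_sketch <- function(emb, center, scale, weights, n_dim = ncol(weights)) {
  # Standardize each column (dimension) with the supplied mean and sd.
  emb_std <- sweep(emb, 2, center, "-")
  emb_std <- sweep(emb_std, 2, scale, "/")
  # Assumption: keep only the first n_dim components of the weights.
  emb_std %*% weights[, seq_len(n_dim), drop = FALSE]
}

# Toy stand-ins: 5 texts with 8-dimensional embeddings, reduced to 2.
set.seed(1)
emb <- matrix(rnorm(5 * 8), nrow = 5, ncol = 8)
dim_mean <- colMeans(emb)            # assumed contents of the scalar
dim_sd <- apply(emb, 2, sd)          # assumed contents of the scalar
weights <- matrix(rnorm(8 * 3), nrow = 8, ncol = 3)  # assumed pca weights

reduced <- reduce_sketch(emb, dim_mean, dim_sd, weights, n_dim = 2)
dim(reduced)
```

In this sketch the result has one row per input text and n_dim columns. The csv files named in the defaults may be organized differently from the toy objects used here.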
To use this method please see and cite:
Ganesan, A. V., Matero, M., Ravula, A. R., Vu, H., & Schwartz, H. A. (2021, June).
Empirical evaluation of pre-trained transformers for human-level NLP: The role of sample size and dimensionality.
In Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting
(Vol. 2021, p. 4515).
NIH Public Access.
See GitHub: Empirical-Evaluation
Returns embeddings with reduced number of dimensions.
textEmbed
## Not run:
embeddings <- textEmbedReduce(word_embeddings_4$texts)
## End(Not run)