Spanishdicts | R Documentation |
Full Spanish dictionaries. Word embedding values are based on English data.
Spanishdicts
A data frame:
Spanish word, not stemmed but lightly preprocessed (e.g., symbols, spaces, and accents removed)
Stemmed version of Palabra
Words as obtained from the literature or WordNet. No preprocessing.
lower-case word values
lower-case word values, no spaces or symbols
lower-case word values, no spaces or symbols, lemmatized
lower-case word values, no spaces or symbols, lemmatized, with any final "s" removed (the results are not necessarily real words; these are the values averaged over in the final dictionaries)
variables ending in _dict indicate whether the word is (1) or is not (0) in the dictionary. When accompanied by _lo, the variable codes whether the word is both low and in the dictionary; when accompanied by _hi, it codes whether the word is both high and in the dictionary (i.e., it combines the _dict and _dir variables)
variables ending in _dir indicate whether the word is high (1), neutral (0), or low (-1) in the dictionary; e.g., friendly is high for sociability and unfriendly is low. Coded as NA if the word is not in the corresponding dictionary
variables starting in fasttext are the word embedding dimensions for fastText (2 million word vectors trained with subword information on Common Crawl; https://fasttext.cc/docs/en/english-vectors.html)
variables starting in Glove are the word embedding dimensions for GloVe trained on Common Crawl (840B tokens, 2.2M vocab, cased, 300d vectors; https://nlp.stanford.edu/projects/glove/)
variables starting in Word2vec are the word embedding dimensions for Word2vec trained on Google News (https://code.google.com/archive/p/word2vec/)
...
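The relationship between the _dict, _dir, _hi and _lo codings described above can be illustrated with a minimal sketch. The toy data frame and its column names (word, sociability_dict, sociability_dir, sociability_dict_hi, sociability_dict_lo) are hypothetical stand-ins that only follow the suffix convention; they are not taken from the actual data set, whose exact column names may differ.

```r
# Toy data frame; column names are hypothetical and only mimic the
# _dict / _dir suffix convention described in the format section.
toy <- data.frame(
  word             = c("friendly", "unfriendly", "table"),
  sociability_dict = c(1, 1, 0),    # 1 = in the dictionary, 0 = not
  sociability_dir  = c(1, -1, NA),  # 1 = high, -1 = low, NA = not in dictionary
  stringsAsFactors = FALSE
)

# _hi: the word is in the dictionary AND high.
# _lo: the word is in the dictionary AND low.
# %in% treats NA as "no match", so words outside the dictionary get 0.
toy$sociability_dict_hi <- as.integer(
  toy$sociability_dict == 1 & toy$sociability_dir %in% c(1)
)
toy$sociability_dict_lo <- as.integer(
  toy$sociability_dict == 1 & toy$sociability_dir %in% c(-1)
)

print(toy)
```

Using %in% rather than == keeps the NA direction codes from propagating into the combined indicators, which matches the description of _dir being NA for words outside the dictionary.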