Dictionaries: Full dictionary codings

Dictionaries R Documentation

Full dictionary codings

Description

Full dictionary codings for each word, including the word embedding (vector) codings.

Usage

Dictionaries

Format

A data frame with 13930 rows and 973 variables:

word

a word in one or more of the dictionaries

_dict

variables ending in _dict indicate whether the word is (1) or is not (0) in the dictionary. A variable accompanied by _lo codes whether the word is both in the dictionary and low; a variable accompanied by _hi codes whether the word is both in the dictionary and high (i.e., these variables combine the _dict and _dir variables)

_dir

variables ending in _dir indicate whether the word is high (1), neutral (0), or low (-1) in the dictionary; e.g., friendly is high for sociability and unfriendly is low. Coded as NA if the word is not in the corresponding dictionary
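As a rough illustration of how the _dict and _dir codings relate to the _hi and _lo codings, the sketch below rebuilds them on a small toy data frame. The column names (sociability_dict, sociability_dir, and so on) and the choice to code out-of-dictionary words as 0 rather than NA are assumptions made for the example, not taken from the package.

```r
# Toy stand-in for a few rows of Dictionaries.
# Column names here are hypothetical and chosen only for illustration.
toy <- data.frame(
  word             = c("friendly", "unfriendly", "table"),
  sociability_dict = c(1, 1, 0),    # 1 = in the dictionary, 0 = not
  sociability_dir  = c(1, -1, NA),  # 1 = high, -1 = low, NA = not in dictionary
  stringsAsFactors = FALSE
)

# TRUE only when the word is in the dictionary and its direction is known.
in_dict <- toy$sociability_dict == 1 & !is.na(toy$sociability_dir)

# Assumption: words outside the dictionary are coded 0 (not NA).
toy$sociability_dict_hi <- as.integer(in_dict & toy$sociability_dir == 1)
toy$sociability_dict_lo <- as.integer(in_dict & toy$sociability_dir == -1)

print(toy)
```

Here friendly ends up flagged only in the _hi column, unfriendly only in the _lo column, and table in neither.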

fasttext

variables starting with fasttext are the word embedding dimensions for fastText (2 million word vectors trained with subword information on Common Crawl; https://fasttext.cc/docs/en/english-vectors.html)

Glove

variables starting with Glove are the word embedding dimensions for GloVe trained on Common Crawl (840B tokens, 2.2M vocab, cased, 300d vectors; https://nlp.stanford.edu/projects/glove/)

Word2vec

variables starting with Word2vec are the word embedding dimensions for Word2vec trained on Google News (https://code.google.com/archive/p/word2vec/)

USE

variables starting with USE are the word embedding dimensions for Universal Sentence Encoder trained on Common Crawl (https://arxiv.org/abs/1803.11175)
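Because each embedding family is identified by a shared variable-name prefix, one way to pull out a single family is to select columns by prefix. The following is a minimal sketch on a toy data frame; the helper embedding_matrix and the exact column names (fasttext1, Glove1, ...) are hypothetical, and the real data has many more dimensions per family.

```r
# Toy stand-in with two embedding families.
# The numbering of the columns (fasttext1, fasttext2, ...) is an assumption.
toy <- data.frame(
  word      = c("friendly", "unfriendly"),
  fasttext1 = c(0.1, 0.2),
  fasttext2 = c(0.3, 0.4),
  Glove1    = c(0.5, 0.6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return all columns whose names start with `prefix`
# as a numeric matrix with one row per word.
embedding_matrix <- function(df, prefix) {
  cols <- names(df)[startsWith(names(df), prefix)]
  m <- as.matrix(df[, cols, drop = FALSE])
  rownames(m) <- df$word
  m
}

ft <- embedding_matrix(toy, "fasttext")
print(ft)
```

The same call with "Glove", "Word2vec", or another prefix would return the corresponding family, provided the prefix matches the actual column names in the data.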

...


gandalfnicolas/SADCAT documentation built on June 8, 2024, 6:26 a.m.