nas: Calculate Normalized Association Score
In chainsawriot/sweater: Speedy Word Embedding Association Test and Extras Using R


nas	R Documentation

Calculate Normalized Association Score

Description

This function quantifies the bias in a set of word embeddings using the normalized association score of Caliskan et al. (2017). Compared with WEAT, introduced in the same paper, this method is better suited to continuous ground-truth data. See Figures 1 and 2 of the original paper. If possible, please use query() instead.
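
To make the quantity concrete, the following base-R sketch computes a per-word score as the statistic is described in Caliskan et al. (2017): the difference between a word's mean cosine similarity to the A words and to the B words, divided by the standard deviation of its cosine similarities to all attribute words. It is an illustration under stated assumptions, not the package's implementation: the helper names `cosine()` and `association_score()` and the toy matrix are hypothetical, words are assumed to be stored as row names, and R's sample standard deviation `sd()` is used.

```r
# Cosine similarity between two numeric vectors.
cosine <- function(x, y) {
  sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

# Per-word association score (hypothetical helper, not part of sweater):
# (mean cosine to A - mean cosine to B) / sd of cosines to A and B combined.
# Assumptions: rownames(w) hold the words; sd() is the sample standard deviation.
association_score <- function(word, w, A_words, B_words) {
  cos_a <- vapply(A_words, function(a) cosine(w[word, ], w[a, ]), numeric(1))
  cos_b <- vapply(B_words, function(b) cosine(w[word, ], w[b, ]), numeric(1))
  (mean(cos_a) - mean(cos_b)) / sd(c(cos_a, cos_b))
}

# Toy 3-dimensional embeddings, invented for illustration only.
toy_w <- rbind(
  engineer = c(0.9, 0.1, 0.3),
  nurse    = c(0.1, 0.9, 0.3),
  he       = c(1.0, 0.0, 0.2),
  man      = c(0.8, 0.2, 0.1),
  she      = c(0.0, 1.0, 0.2),
  woman    = c(0.2, 0.8, 0.1)
)

# One score per target word; the sign shows which attribute set is closer.
scores <- vapply(
  c("engineer", "nurse"),
  association_score,
  numeric(1),
  w = toy_w, A_words = c("he", "man"), B_words = c("she", "woman")
)
print(scores)
```

In this toy data a positive score means the word sits closer to the A words and a negative score means it sits closer to the B words.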

Usage

nas(w, S_words, A_words, B_words, verbose = FALSE)

Arguments

`w`	a numeric matrix of word embeddings, e.g. from `read_word2vec()`
`S_words`	a character vector of target words. When studying gender stereotypes, for example, it can include occupations such as programmer, engineer, scientist.
`A_words`	a character vector of the first set of attribute words. When studying gender stereotypes, for example, it can include words such as man, male, he, his.
`B_words`	a character vector of the second set of attribute words. When studying gender stereotypes, for example, it can include words such as woman, female, she, her.
`verbose`	logical, whether to display information

Value

A list with class "nas" containing the following components:

`$P`	a vector of normalized association scores, one for every word in S_words
`$raw`	a list of raw results used for calculating the normalized association scores
`$S_words`	the input S_words
`$A_words`	the input A_words
`$B_words`	the input B_words
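
The call and the returned components can be tried on a small synthetic matrix, as in the sketch below. The embeddings are random numbers invented for illustration, so the resulting scores carry no meaning. The sketch assumes that the words are stored as row names of `w`, and it only calls `nas()` when the sweater package is installed.

```r
# Toy embeddings invented for illustration; real input would come from
# a pretrained model, e.g. via read_word2vec().
set.seed(42)
words <- c("programmer", "engineer", "scientist",
           "man", "male", "he", "his",
           "woman", "female", "she", "her")
w <- matrix(rnorm(length(words) * 5), nrow = length(words),
            dimnames = list(words, NULL))  # assumption: words are row names

S <- c("programmer", "engineer", "scientist")
A <- c("man", "male", "he", "his")
B <- c("woman", "female", "she", "her")

# Only run the call when the package is available.
if (requireNamespace("sweater", quietly = TRUE)) {
  res <- sweater::nas(w, S_words = S, A_words = A, B_words = B)
  print(res$P)        # normalized association scores for the words in S
  print(res$S_words)  # the target words as passed in
}
```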

References

Caliskan, A., Bryson, J. J., & Narayanan, A. (2017). Semantics derived automatically from language corpora contain human-like biases. Science, 356(6334), 183-186. doi:10.1126/science.aal4230

chainsawriot/sweater documentation built on Feb. 2, 2025, 3:53 a.m.