eval_multiple_choice: Evaluate DSM on Multiple Choice Task (wordspace)

eval.multiple.choice    R Documentation

Evaluate DSM on Multiple Choice Task (wordspace)

Description

Evaluates a DSM on a multiple choice task by selecting the answer option closest to the target term in distributional space. A typical example is the TOEFL Synonym Task (Landauer & Dumais 1997).

Usage


eval.multiple.choice(task, M, dist.fnc = pair.distances, ...,
                     details = FALSE, format = NA, taskname = NA,
                     target.name = "target", correct.name = "correct",
                     distractor.name = "^distract") 

Arguments

task

a data frame listing the target word, the correct answer, and one or more additional choices (distractors) for each test item

M

a scored DSM matrix, passed to dist.fnc

dist.fnc

a callback function used to compute distances between term pairs (or similarity scores, which must be marked with an attribute similarity=TRUE). See “Details” below for further information.

...

any further arguments are passed to dist.fnc and can be used e.g. to select a distance measure

details

if TRUE, a detailed report with information on each task item is returned (see “Value” below for details)

format

if the task definition specifies POS-disambiguated lemmas in CWB/Penn format, they can automatically be converted to a different notation convention; see convert.lemma for details

taskname

optional row label for the short report (details=FALSE)

target.name

the name of the column of task containing the target word

correct.name

the name of the column of task containing the correct choice

distractor.name

a regular expression matching columns of task containing the distractors. The regular expression is matched with perl=TRUE.
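As an illustration of the expected layout of task, the following sketch builds a small data frame that uses the default column names. The words are invented for illustration and do not come from any published data set; the grep() call only shows which columns the default distractor.name pattern would select.

```r
# Hypothetical two-item task (invented words, default column names)
task <- data.frame(
  target     = c("car_N", "boy_N"),
  correct    = c("automobile_N", "lad_N"),
  distract.1 = c("journey_N", "coast_N"),
  distract.2 = c("fruit_N", "gem_N"),
  distract.3 = c("noon_N", "string_N"),
  stringsAsFactors = FALSE
)

# Columns matched by the default pattern distractor.name = "^distract"
distractor.cols <- grep("^distract", colnames(task), perl = TRUE, value = TRUE)
print(distractor.cols)
```

A task that stores its distractors under other names (say, a hypothetical prefix "option") would need a matching pattern such as distractor.name = "^option".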

Details

The callback function dist.fnc will be invoked with character vectors containing the components of the term pairs as first and second argument, the DSM matrix M as third argument, plus any additional arguments (...) passed to eval.multiple.choice. The return value must be a numeric vector of appropriate length. If one of the terms in a pair is not represented in the DSM, the corresponding distance value should be set to Inf (or -Inf in the case of similarity scores). In most cases, the default callback pair.distances is sufficient if used with suitable parameter settings.
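The callback contract described above can be sketched in base R as follows. The function name euclid.pairs and the toy matrix are hypothetical and not part of the package; the sketch only demonstrates the argument order (w1, w2, M, ...) and the use of Inf for pairs with a missing term.

```r
# Toy DSM: rows are terms, columns are dimensions (values invented)
M <- rbind(
  cat  = c(1, 0),
  dog  = c(1, 1),
  tree = c(4, 4)
)

# Hypothetical callback: Euclidean distance for each pair (w1[i], w2[i])
euclid.pairs <- function(w1, w2, M, ...) {
  stopifnot(length(w1) == length(w2))
  known <- w1 %in% rownames(M) & w2 %in% rownames(M)
  d <- rep(Inf, length(w1))  # Inf marks pairs with a term missing from M
  if (any(known)) {
    delta <- M[w1[known], , drop = FALSE] - M[w2[known], , drop = FALSE]
    d[known] <- sqrt(rowSums(delta^2))
  }
  d
}

d <- euclid.pairs(c("cat", "cat", "cat"), c("dog", "tree", "unicorn"), M)
print(d)
```

A function of this shape could presumably be supplied as dist.fnc. A callback returning similarity scores would additionally have to mark its result with the similarity attribute and use -Inf for missing terms.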

For each task item, distances between the target word and the possible choices are computed. Then all choices are ranked according to their distances; in the case of a tie, all tied choices receive the higher (i.e. worse) rank number. A task item counts as a TP (true positive, i.e. a successful answer by the DSM) if the correct choice is ranked in first place. Note that if the correct choice is tied with another choice for the smallest distance, both are assigned rank 2, so the item does not count as a TP.
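The ranking rule can be illustrated with a small base R sketch. It assumes that rank() with ties.method = "max" reproduces the tie handling described above; the helper rank.item is hypothetical and the package's internal code may differ.

```r
# Rank the choices of one item; tied choices share the larger rank number.
# Convention in this sketch: the first element is the correct choice.
rank.item <- function(distances) {
  r <- rank(distances, ties.method = "max")
  list(ranks = r, tp = unname(r[1] == 1))
}

clear <- rank.item(c(correct = 0.4, d1 = 0.7, d2 = 0.9))  # unique minimum
tied  <- rank.item(c(correct = 0.4, d1 = 0.4, d2 = 0.9))  # tie for first place

print(clear$tp)    # correct choice has rank 1
print(tied$ranks)  # both tied choices get rank 2
print(tied$tp)     # so the item is not a TP
```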

If either the target word is missing from the DSM or none of the choices is found in the DSM, the result for this item is set to NA, which counts as a FP (false positive) in the accuracy computation.

With the default dist.fnc callback, additional arguments method and p can be used to select a distance measure (see dist.matrix for details). It is pointless to specify rank="fwd", as the neighbour ranks produce exactly the same candidate ranking as the distance values.

Value

The default short report (details=FALSE) is a data frame with a single row and the columns accuracy (percentage correct), TP (number of correct answers), FP (number of wrong answers) and missing (number of test items for which the distance between target and correct choice could not be computed because a term was not found in the DSM).
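How the counts in the short report relate to per-item outcomes can be sketched as follows. The outcome vector is invented, and the arithmetic is an assumption based on the description above (NA results count as FP); the missing column is not reproduced here.

```r
# Invented per-item outcomes: TRUE = correct, FALSE = wrong, NA = not evaluable
correct <- c(TRUE, FALSE, NA, TRUE)

TP <- sum(correct, na.rm = TRUE)   # successful items
FP <- length(correct) - TP         # wrong items, including NA results
accuracy <- 100 * TP / (TP + FP)   # percentage correct

print(data.frame(accuracy = accuracy, TP = TP, FP = FP))
```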

The detailed report (details=TRUE) is a data frame with one row for each task item and the following columns:

target

the target word (character)

correct

whether the model's choice is correct (logical or NA)

best.choice

best choice according to the DSM (character)

best.dist

distance of best choice from target (numeric)

correct.choice

correct answer (character)

correct.rank

rank of correct answer among choices (integer)

correct.dist

distance of correct answer from target (numeric)

Author(s)

Stephanie Evert (https://purl.org/stephanie.evert)

References

Landauer, Thomas K. and Dumais, Susan T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction and representation of knowledge. Psychological Review, 104(2), 211–240.

See Also

Suitable gold standard data sets in this package: TODO

Support functions: pair.distances, convert.lemma

Examples

## TODO
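A minimal usage sketch, under stated assumptions: the task below is invented, and it is assumed that these lemmas occur as row names of the DSM_Vectors matrix shipped with the package. Terms that are absent would simply lead to NA results for the affected items. The calls are guarded so that the code also runs where wordspace is not installed.

```r
# Invented task in CWB/Penn lemma notation (default column names)
task <- data.frame(
  target     = c("car_N", "boy_N"),
  correct    = c("automobile_N", "lad_N"),
  distract.1 = c("fruit_N", "coast_N"),
  distract.2 = c("noon_N", "gem_N"),
  stringsAsFactors = FALSE
)

if (requireNamespace("wordspace", quietly = TRUE)) {
  library(wordspace)

  # Short report: a single row labelled "toy"
  print(eval.multiple.choice(task, DSM_Vectors, taskname = "toy"))

  # Detailed report; method is passed on to the default callback pair.distances
  print(eval.multiple.choice(task, DSM_Vectors, method = "manhattan",
                             details = TRUE))
}
```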

wordspace documentation built on Aug. 23, 2022, 1:06 a.m.