terms_pairs_test: Test Pairs of Terms to Evaluate Embedding Quality

Description

An example dataset containing pairs of medical terms that form analogies or synonyms, used to evaluate the quality of term embeddings with the functions analogy_task and synonym_task.

Usage

terms_pairs_test

Format

A list with elements:

spec

Terms with relation: specialty – body part (e.g. "cardiologist" – "heart"); a list of two character vectors of length 2

person

Terms with relation: man – woman; a list of two character vectors of length 2

synonym

Synonym terms; a list of two character vectors of length 1
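The shape described above can be sketched with a mock object. This is not the packaged data: the name mock_pairs and every value except "cardiologist" and "heart" are placeholders, and it is an assumption that each length-2 vector holds one pair rather than one side of the relation.

```r
# Mock object mirroring the documented shape; NOT the packaged dataset.
# Assumption: each length-2 character vector is one related pair.
mock_pairs <- list(
  spec = list(
    c("cardiologist", "heart"),        # example taken from the Format text
    c("specialty_2", "body_part_2")    # placeholder values
  ),
  person = list(
    c("man_term_1", "woman_term_1"),   # placeholder values
    c("man_term_2", "woman_term_2")
  ),
  synonym = list("term_a", "synonym_of_term_a")  # placeholder values
)

# Print the nested structure
str(mock_pairs)

# Length of each character vector inside every element
vec_lengths <- lapply(mock_pairs, function(x) vapply(x, length, integer(1)))
```

Running the same str() and vapply() calls on terms_pairs_test itself should show whether the real object follows this layout.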

Details

The quality of embeddings trained on real data was assessed using 7 types of analogies, described in Dobrakowski et al. (2019).
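As a rough illustration of how analogy pairs can probe an embedding, the sketch below compares the offset vectors of two pairs that share a relation. It is one common approach and not necessarily what analogy_task does; the matrix emb, its numbers, and the placeholder term names are all invented for the example.

```r
# Toy 2-D embeddings; the values are invented for illustration only.
emb <- rbind(
  cardiologist = c(1.0, 2.0),
  heart        = c(1.0, 0.0),
  specialty_2  = c(3.0, 2.1),   # placeholder term
  body_part_2  = c(3.0, 0.0)    # placeholder term
)

# Cosine similarity between two numeric vectors
cosine <- function(u, v) sum(u * v) / sqrt(sum(u^2) * sum(v^2))

# Offset vectors for two pairs that share the same relation
offset_1 <- emb["cardiologist", ] - emb["heart", ]
offset_2 <- emb["specialty_2", ] - emb["body_part_2", ]

# Values close to 1 suggest the relation is encoded in a consistent direction
analogy_score <- cosine(offset_1, offset_2)
```

With real embeddings the rows would be looked up by the terms stored in the dataset, and the score would be aggregated over all pairs of a given analogy type.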

References

Dobrakowski, A., A. Mykowiecka, M. Marciniak, W. Jaworski, and P. Biecek 2019. Interpretable Segmentation of Medical Free-Text Records Based on Word Embeddings. arXiv preprint arXiv:1907.04152.


adamgdobrakowski/memr documentation built on Sept. 4, 2021, 3:45 a.m.