lemmalex: Lemmalex dictionary


Description

Lemmalex is based primarily on the SUBTLEXus subtitle corpus (American English subtitles, 51 million items in total), reduced to lemmas using a copyrighted database (Francis and Kučera, 1982). Pronunciations are taken from the CMU Pronouncing Dictionary.

Usage

lemmalex

Format

An object of class tbl_df (inherits from tbl, data.frame) with 17750 rows and 3 columns.

Details

References: Brysbaert, M., & New, B. (2009). Moving beyond Kučera and Francis: A critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English. Behavior Research Methods, 41(4), 977-990.

Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Brown University Press.

CMU Pronouncing Dictionary: http://www.speech.cs.cmu.edu/cgi-bin/cmudict

A table with 20,293 rows and 3 variables:

Item

SUBTLEXus dictionary reduced to lemmas

Frequency

Number of times the item appeared in the SUBTLEXus corpus

Pronunciation

ARPAbet transcription according to CMU

...
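As a rough illustration of how the three variables might be used together, here is a minimal sketch in base R. It builds a small toy data frame (`toy`, a hypothetical stand-in whose values are made up for illustration, not taken from lemmalex) and assumes that Pronunciation holds space-separated ARPAbet symbols. With the package installed, `LexFindR::lemmalex` could presumably be substituted for `toy`.

```r
# Toy stand-in with the three documented columns.
# Values are illustrative only and are NOT taken from lemmalex.
toy <- data.frame(
  Item = c("cat", "dog", "bird"),
  Frequency = c(120, 300, 45),
  Pronunciation = c("K AE T", "D AO G", "B ER D"),
  stringsAsFactors = FALSE
)

# Assumption: each transcription is a space-separated string of ARPAbet symbols.
# Split every transcription into its individual phoneme symbols.
phonemes <- strsplit(toy$Pronunciation, " ", fixed = TRUE)

# Count the phonemes in each item.
toy$n_phonemes <- lengths(phonemes)

# Order the items from most to least frequent.
toy_sorted <- toy[order(toy$Frequency, decreasing = TRUE), ]
print(toy_sorted)
```

Splitting the transcription into a list of symbols keeps each phoneme as a separate element, which is usually more convenient for comparing word forms than working with the raw string.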

Source

https://www.ugent.be/pp/experimentele-psychologie/en/research/documents/subtlexus


LexFindR documentation built on Oct. 29, 2021, 9:07 a.m.