get_lexical_coverage: Get Lexical Coverage with Specified Vocabulary
In chatRater: Rating and Evaluating Texts Using Large Language Models

View source: R/chatRater.R

Description

Uses an LLM to estimate the lexical coverage of a given text, expressed as a percentage, for a specified vocabulary size and the vocabulary test on which that size is based.

Usage

get_lexical_coverage(
  stimulus,
  vocab_size = 2000,
  vocab_test = "Vocabulary Levels Test",
  model = "gpt-3.5-turbo",
  api_key = "",
  top_p = 1,
  temp = 0
)

Arguments

`stimulus`	A character string containing the language material (text) to evaluate.
`vocab_size`	A numeric value indicating the size of the target vocabulary (e.g., 1000, 2000, 3000; default 2000).
`vocab_test`	A character string specifying the vocabulary test on which the vocabulary size is based (e.g., "Vocabulary Levels Test", "LexTALE"; default "Vocabulary Levels Test"). Users may provide any test name.
`model`	A character string specifying the LLM to use (default "gpt-3.5-turbo").
`api_key`	The API key, as a character string (default "", an empty string).
`top_p`	Numeric value for the nucleus-sampling probability mass (default 1).
`temp`	Numeric value for the sampling temperature (default 0).

Details

Default definition: "Lexical coverage is the proportion of words in a text that are included in a given vocabulary list. For this evaluation, assume a target vocabulary size of vocab_size words based on the vocab_test." Here vocab_size and vocab_test stand for the values of the corresponding arguments.
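To make the definition concrete, the following minimal sketch computes lexical coverage by direct lookup against an explicit word list. It is not how get_lexical_coverage works (that function asks an LLM for an estimate); the helper name lexical_coverage_local, the simple tokenizer, and the toy word list are all assumptions made for illustration.

```r
# Hypothetical helper, not part of chatRater: coverage by direct lookup.
lexical_coverage_local <- function(text, vocab) {
  # Lowercase the text and split on anything that is not a letter or apostrophe.
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens <- tokens[nzchar(tokens)]
  if (length(tokens) == 0) return(NA_real_)
  # Share of running words (tokens) found in the vocabulary, as a percentage.
  100 * mean(tokens %in% tolower(vocab))
}

# Toy word list standing in for a real frequency-based vocabulary list.
toy_vocab <- c("the", "quick", "brown", "over", "lazy", "dog")

coverage_local <- lexical_coverage_local(
  "The quick brown fox jumps over the lazy dog",
  toy_vocab
)
# 7 of the 9 tokens are in toy_vocab, so coverage_local is about 77.8.
```

A count of this kind requires an actual word list, whereas the LLM-based function only needs a vocabulary size and a test name; the two results are therefore not expected to agree exactly.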

Value

A numeric value giving the estimated lexical coverage as a percentage.
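Because the value comes from an LLM response, it may be worth checking that it is a single percentage between 0 and 100 before using it in further analysis. The sketch below shows one way to do this; check_coverage is a hypothetical helper, not part of chatRater, and treating out-of-range values as missing is an assumption about what a caller might want.

```r
# Hypothetical helper, not part of chatRater: sanity-check a returned value.
check_coverage <- function(x) {
  # Coerce to numeric; non-numeric strings become NA (warning suppressed).
  x <- suppressWarnings(as.numeric(x))
  # Accept only a single, non-missing value within the 0-100 range.
  if (length(x) != 1 || is.na(x) || x < 0 || x > 100) {
    return(NA_real_)
  }
  x
}

check_coverage(95)      # returns 95
check_coverage(150)     # returns NA (outside the percentage range)
check_coverage("n/a")   # returns NA (not a number)
```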

Examples

## Not run: 
  coverage <- get_lexical_coverage("The quick brown fox jumps over the lazy dog",
                                   vocab_size = 2000,
                                   vocab_test = "Vocabulary Levels Test",
                                   model = "gpt-3.5-turbo",
                                   api_key = "your_api_key")
  cat("Lexical Coverage (%):", coverage, "\n")

## End(Not run)

chatRater documentation built on April 4, 2025, 1:03 a.m.