compute_lexdiv_stats: Compute lexical diversity from a dfm or tokens

compute_lexdiv_stats    R Documentation

Compute lexical diversity from a dfm or tokens

Description

Internal functions used in textstat_lexdiv() for computing lexical diversity measures on dfm or tokens objects.

Usage

compute_lexdiv_dfm_stats(x, measure = NULL, log.base = 10)

compute_lexdiv_tokens_stats(
  x,
  measure = c("MATTR", "MSTTR"),
  MATTR_window,
  MSTTR_segment
)

Arguments

x

a dfm object (for compute_lexdiv_dfm_stats) or a tokens object (for compute_lexdiv_tokens_stats)

measure

a character vector naming the lexical diversity measures to compute

log.base

a numeric value defining the base of the logarithm (for measures using logs)

MATTR_window

a numeric value defining the size of the moving window for computation of the Moving-Average Type-Token Ratio (Covington & McFall, 2010)

MSTTR_segment

a numeric value defining the size of each segment for the computation of the Mean Segmental Type-Token Ratio (Johnson, 1944)

Details

compute_lexdiv_dfm_stats is an internal function that computes the lexical diversity measures from a dfm input.

compute_lexdiv_tokens_stats is an internal function that computes the lexical diversity measures from a tokens input.
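
The two token-based measures depend on token order, which is why they need a tokens input rather than a dfm. The base R sketch below illustrates the general idea behind MATTR (mean type-token ratio over overlapping windows) and MSTTR (mean type-token ratio over consecutive non-overlapping segments). The helpers ttr, mattr_sketch and msttr_sketch are hypothetical names introduced for illustration and are not part of quanteda.textstat. Dropping a trailing partial segment in MSTTR is an assumption of this sketch, not a statement about the package's behaviour.

```r
# Hypothetical helpers for illustration only; not the package implementation.

# Type-token ratio: number of unique tokens divided by number of tokens.
ttr <- function(toks) length(unique(toks)) / length(toks)

# MATTR sketch: average the TTR over every window of `window` adjacent tokens.
mattr_sketch <- function(toks, window) {
  stopifnot(window >= 1, window <= length(toks))
  starts <- seq_len(length(toks) - window + 1)
  mean(vapply(starts, function(i) ttr(toks[i:(i + window - 1)]), numeric(1)))
}

# MSTTR sketch: average the TTR over consecutive non-overlapping segments.
# Assumption: tokens left over after the last full segment are dropped.
msttr_sketch <- function(toks, segment) {
  stopifnot(segment >= 1, segment <= length(toks))
  n_seg <- length(toks) %/% segment
  starts <- (seq_len(n_seg) - 1) * segment + 1
  mean(vapply(starts, function(i) ttr(toks[i:(i + segment - 1)]), numeric(1)))
}

toks <- c("a", "b", "a", "c", "a", "b", "d")
mattr_value <- mattr_sketch(toks, window = 3)   # five overlapping windows
msttr_value <- msttr_sketch(toks, segment = 3)  # two full segments, "d" dropped
```

With this seven-token input, the window version averages five ratios and the segment version averages two, so the two measures can differ even when the window and segment sizes are equal.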

Value

a data.frame with a document column containing the input document name, followed by columns with the lexical diversity statistic, in the order in which they were supplied as the measure argument.
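
To make the shape of the return value concrete, here is a minimal base R sketch that builds a data.frame with a document column followed by one column per requested measure, in the order requested. The function lexdiv_sketch is a hypothetical stand-in, not the internal function itself. It uses the textbook formulas for TTR (V / N) and Dugast's U ((log N)^2 / (log N - log V)), and treating these as the formulas used by the package is an assumption. U is included because its value changes with log.base.

```r
# Hypothetical sketch; it mirrors the documented column layout only.
lexdiv_sketch <- function(docs, measure = c("TTR", "U"), log.base = 10) {
  stats <- lapply(docs, function(toks) {
    n <- length(toks)          # N: number of tokens
    v <- length(unique(toks))  # V: number of types
    log_n <- log(n, base = log.base)
    log_v <- log(v, base = log.base)
    # Note: U is infinite when every token is unique (V equals N).
    c(TTR = v / n, U = log_n^2 / (log_n - log_v))
  })
  out <- data.frame(document = names(docs), stringsAsFactors = FALSE)
  # Append one column per measure, preserving the order of `measure`.
  for (m in measure) {
    out[[m]] <- unname(vapply(stats, function(s) s[[m]], numeric(1)))
  }
  out
}

docs <- list(d1 = c("a", "b", "a", "c"),
             d2 = c("x", "y", "x", "x", "z", "y"))
res <- lexdiv_sketch(docs, measure = c("U", "TTR"))
```

Because measure was given as c("U", "TTR"), the columns of res are document, U, TTR in that order.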


quanteda/quanteda.textstat documentation built on Sept. 9, 2024, 7:41 p.m.