compute_lexdiv_stats: Compute lexical diversity from a dfm or tokens

compute_lexdiv_stats    R Documentation

Compute lexical diversity from a dfm or tokens

Description

Internal functions used by textstat_lexdiv() to compute lexical diversity measures on dfm or tokens objects.

Usage

compute_lexdiv_dfm_stats(x, measure = NULL, log.base = 10)

compute_lexdiv_tokens_stats(
  x,
  measure = c("MATTR", "MSTTR"),
  MATTR_window,
  MSTTR_segment
)

Arguments

x

a dfm object (for compute_lexdiv_dfm_stats) or a tokens object (for compute_lexdiv_tokens_stats)

measure

a list of lexical diversity measures.

log.base

a numeric value defining the base of the logarithm (for measures using logs)

MATTR_window

a numeric value defining the size of the moving window for computation of the Moving-Average Type-Token Ratio (Covington & McFall, 2010)

MSTTR_segment

a numeric value defining the size of each segment for the computation of the Mean Segmental Type-Token Ratio (Johnson, 1944)
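To make the roles of MATTR_window and MSTTR_segment concrete, here is a minimal base R sketch of the two measures on a single document's tokens. The helper names (ttr, mattr_sketch, msttr_sketch) are hypothetical and are not part of quanteda.textstats. The sketch assumes that a trailing segment shorter than the segment size is dropped, which may differ from what the package does.

```r
# Hypothetical helpers for illustration; not the package's internal code.
# `tokens` is a character vector holding one document's tokens in order.

# Type-token ratio: number of distinct types divided by number of tokens.
ttr <- function(tokens) length(unique(tokens)) / length(tokens)

# MATTR: mean TTR over every window of `window` consecutive tokens,
# sliding one token at a time.
mattr_sketch <- function(tokens, window) {
  stopifnot(window >= 1, window <= length(tokens))
  starts <- seq_len(length(tokens) - window + 1)
  mean(vapply(starts,
              function(i) ttr(tokens[i:(i + window - 1)]),
              numeric(1)))
}

# MSTTR: mean TTR over consecutive, non-overlapping segments of
# `segment` tokens.
# Assumption: a trailing segment shorter than `segment` is dropped.
msttr_sketch <- function(tokens, segment) {
  stopifnot(segment >= 1, segment <= length(tokens))
  n_seg <- length(tokens) %/% segment
  starts <- (seq_len(n_seg) - 1) * segment + 1
  mean(vapply(starts,
              function(i) ttr(tokens[i:(i + segment - 1)]),
              numeric(1)))
}

toks <- c("a", "b", "a", "c", "a", "b", "a")
mattr_value <- mattr_sketch(toks, window = 3)
msttr_value <- msttr_sketch(toks, segment = 3)
```

The only difference between the two sketches is how the spans are chosen: MATTR uses overlapping windows that advance by one token, while MSTTR uses disjoint segments.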

Details

compute_lexdiv_dfm_stats is an internal function that computes the lexical diversity measures from a dfm input.

compute_lexdiv_tokens_stats is an internal function that computes the lexical diversity measures from a tokens input.

Value

a data.frame with a document column containing the input document name, followed by one column per lexical diversity statistic, in the order in which they were supplied in the measure argument.
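The return shape described above can be illustrated with a small base R sketch. It uses a plain count matrix as a stand-in for a dfm and computes only two count-based measures, TTR (V / N) and Herdan's C (log V / log N). The function lexdiv_sketch is hypothetical and is not the package's implementation; it is meant only to show a document column followed by statistics in the requested order.

```r
# Hypothetical sketch for illustration; not the package's internal code.
# `counts` is a documents-by-features matrix of term counts, standing in
# for a dfm.
lexdiv_sketch <- function(counts, measure = c("TTR", "C"), log.base = 10) {
  n_tokens <- rowSums(counts)      # N: total tokens per document
  n_types  <- rowSums(counts > 0)  # V: distinct types per document

  all_stats <- list(
    TTR = n_types / n_tokens,
    C   = log(n_types, base = log.base) / log(n_tokens, base = log.base)
  )
  stopifnot(all(measure %in% names(all_stats)))

  # Document names first, then one column per measure in the order requested.
  out <- data.frame(document = rownames(counts), stringsAsFactors = FALSE)
  for (m in measure) out[[m]] <- unname(all_stats[[m]])
  out
}

counts <- matrix(c(2, 1, 1, 0,
                   1, 1, 1, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("doc1", "doc2"), c("a", "b", "c", "d")))

# Requesting C before TTR puts the C column first in the result.
res <- lexdiv_sketch(counts, measure = c("C", "TTR"))
```

Because Herdan's C is a ratio of two logarithms, the choice of log.base cancels out for that particular measure; other log-based measures can depend on the base.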


quanteda.textstats documentation built on Nov. 2, 2023, 5:07 p.m.