segment.optimizer: A function to optimize MSTTR segment sizes
In koRpus: Text Analysis with Emphasis on POS Tagging, Readability, and Lexical Diversity

Description Usage Arguments Details Value See Also Examples

This function calculates an optimized segment size for MSTTR.

segment.optimizer(txtlgth, segment = 100, range = 20, favour.min = TRUE)

`txtlgth`	Integer value, size of text in tokens.
`segment`	Integer value, start value of the segment size.
`range`	Integer value, range around `segment` to search for better fitting sizes.
`favour.min`	Logical, whether smaller or larger segment sizes should be preferred as a last resort, if in doubt.

When calculating the mean segmental type-token ratio (MSTTR), tokens are divided into segments of a given size and analyzed. If text is left over at the end that won't fill another full segment, it is discarded, i.e., information is lost. For interpretation, it is debatable which is worse: dropping more or less actual token material, or variance in segment size between analyzed texts. If you would rather accept varying segment sizes than lose tokens, this function might prove helpful.
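To make the trade-off concrete: with a fixed segment size, the number of discarded tokens is the remainder of the text length divided by the segment size. The following is a minimal base R sketch using the text length from the example on this page; the variable names `full.segments` and `dropped` are illustrative only and not part of koRpus.

```r
# Illustrative values; only base R is used here, no koRpus functions.
txtlgth <- 2014   # text length in tokens
segment <- 100    # fixed segment size

full.segments <- txtlgth %/% segment  # number of complete segments
dropped <- txtlgth %% segment         # leftover tokens that would be discarded

full.segments  # 20
dropped        # 14
```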

Starting with a given text length, segment size and range to investigate, `segment.optimizer` iterates through possible segment values. It returns the segment size which would drop the fewest tokens (zero, if you're lucky). Should more than one value fulfill this demand, the one nearest to the segment start value is taken. If two values are still equally far away from the start value, the setting of `favour.min` decides whether the smaller or the larger segment size is returned.
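The search described above can be sketched in a few lines of base R. This is a hypothetical re-implementation for illustration, not the koRpus source code. The function name `sketch.optimizer` is made up. It also assumes that the candidates are all integers from `segment - range` to `segment + range` inclusive, and that `favour.min = TRUE` selects the smaller size.

```r
# Hypothetical sketch of the search; NOT the actual koRpus implementation.
# Assumption: candidates are the integers segment - range ... segment + range.
sketch.optimizer <- function(txtlgth, segment = 100, range = 20, favour.min = TRUE) {
  # candidate segment sizes around the start value (at least 1 token each)
  candidates <- seq(max(1, segment - range), segment + range)
  # tokens that would be dropped for each candidate size
  dropped <- txtlgth %% candidates
  # keep only the candidates that drop the fewest tokens
  best <- candidates[dropped == min(dropped)]
  # of those, keep the ones nearest to the start value
  distance <- abs(best - segment)
  nearest <- best[distance == min(distance)]
  # at most two candidates remain (one below, one above the start value);
  # assumption: favour.min = TRUE picks the smaller one
  seg <- if (favour.min) min(nearest) else max(nearest)
  c(seg = seg, drop = txtlgth %% seg)
}

sketch.optimizer(2014, favour.min = FALSE)
```

For 2014 tokens, 106 is the only size between 80 and 120 that divides the text length evenly, so `favour.min` has no effect there. A text of 9999 tokens shows the tie-break: both 99 and 101 drop nothing and are equally far from 100, so under the assumptions above the sketch returns 99 with `favour.min = TRUE` and 101 with `favour.min = FALSE`.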

A numeric vector with two elements:

`seg`	The optimized segment size
`drop`	The number of tokens that would be dropped using this segment size

lex.div, MSTTR

segment.optimizer(2014, favour.min=FALSE)

 seg drop 
 106    0

koRpus documentation built on May 18, 2021, 1:13 a.m.
