lma_meta: Calculate Text-Based Metastatistics

lma_meta    R Documentation

Calculate Text-Based Metastatistics

Description

Calculate simple descriptive statistics from text.

Usage

lma_meta(text)

Arguments

text

A character vector of texts.

Value

A data.frame:

  • characters: Total number of characters.

  • syllables: Total number of syllables, estimated as the number of pieces left after splitting on the regular expression
    'a+[eu]*|e+a*|i+|o+[ui]*|u+|y+[aeiou]*', minus 1.

  • words: Total number of words (raw word count).

  • unique_words: Number of unique words (binary word count).

  • clauses: Number of clauses, as marked by commas, colons, semicolons, dashes, or brackets within sentences.

  • sentences: Number of sentences, as marked by periods, question marks, exclamation points, or new line characters.

  • words_per_clause: Average number of words per clause.

  • words_per_sentence: Average number of words per sentence.

  • sixltr: Number of words 6 or more characters long.

  • characters_per_word: Average number of characters per word (characters / words).

  • syllables_per_word: Average number of syllables per word (syllables / words).

  • type_token_ratio: Ratio of unique to total words: unique_words / words.

  • reading_grade: Flesch-Kincaid grade level: 0.39 * words / sentences + 11.8 * syllables / words - 15.59.

  • numbers: Number of terms starting with numbers.

  • punct: Number of terms starting with non-alphanumeric characters.

  • periods: Number of periods.

  • commas: Number of commas.

  • qmarks: Number of question marks.

  • exclams: Number of exclamation points.

  • quotes: Number of quotation marks (single and double).

  • apostrophes: Number of apostrophes, defined as any modifier letter apostrophe, or any backtick or single straight or curly quote that is surrounded by letters.

  • brackets: Number of bracketing characters (including parentheses, and square, curly, and angle brackets).

  • orgmarks: Number of characters used for organization or structuring (including dashes, forward slashes, colons, and semicolons).
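
The syllables and reading_grade entries above can be sketched in base R. This is only an illustration of the two formulas as they are written in this list, not the package's internal code: the helper names estimate_syllables and fk_grade are hypothetical, lowercasing the input is an assumption, and short words such as "me" come out as 0 under this literal reading, so lma_meta may handle such cases differently.

```r
# Pattern quoted in the syllables entry above.
syllable_pattern <- "a+[eu]*|e+a*|i+|o+[ui]*|u+|y+[aeiou]*"

# Hypothetical helper (not part of lingmatch): number of pieces left after
# splitting each word on vowel groups, minus 1.
estimate_syllables <- function(words) {
  pieces <- strsplit(tolower(words), syllable_pattern)
  lengths(pieces) - 1L
}

# Hypothetical helper: the reading_grade formula exactly as listed above.
fk_grade <- function(words, sentences, syllables) {
  0.39 * words / sentences + 11.8 * syllables / words - 15.59
}

# Words from the "bigwords" example text, treated as one sentence.
words <- c("object", "located", "thither")
syllables <- estimate_syllables(words)
grade <- fk_grade(
  words = length(words),
  sentences = 1,
  syllables = sum(syllables)
)
```

Because the grade depends on syllables per word, a short sentence made of long words can still receive a high reading grade.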

Examples

text <- c(
  succinct = "It is here.",
  verbose = "Hear me now. I shall tell you about it. It is here. Do you hear?",
  couched = "I might be wrong, but it seems to me that it might be here.",
  bigwords = "Object located thither.",
  excited = "It's there! It's there! It's there!",
  drippy = "It's 'there', right? Not 'here'? 'there'? Are you Sure?",
  struggly = "It's here -- in that place where it is. Like... the 1st place (here)."
)
lma_meta(text)
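
As a rough cross-check on the simpler punctuation columns, literal counts can be computed in base R for some of the example texts. The helper count_mark is hypothetical and counts every literal occurrence of a mark; lma_meta may count some cases differently (for instance, runs of periods), so treat this as a sketch rather than a reproduction of its output.

```r
# Hypothetical helper (not part of lingmatch): count literal occurrences of a
# single punctuation mark in each text.
count_mark <- function(x, mark) {
  hits <- gregexpr(mark, x, fixed = TRUE)
  # gregexpr() returns -1 when there is no match, so count positive positions only.
  out <- vapply(hits, function(h) sum(h > 0), integer(1))
  names(out) <- names(x)
  out
}

# A subset of the example texts above.
text <- c(
  succinct = "It is here.",
  excited = "It's there! It's there! It's there!",
  drippy = "It's 'there', right? Not 'here'? 'there'? Are you Sure?"
)

# One row per text, one column per mark, named after the matching columns.
marks <- data.frame(
  periods = count_mark(text, "."),
  commas = count_mark(text, ","),
  qmarks = count_mark(text, "?"),
  exclams = count_mark(text, "!")
)
marks
```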

lingmatch documentation built on Aug. 29, 2023, 1:09 a.m.