cTest-methods: Transform text into C-Test-like format
In koRpus: Text Analysis with Emphasis on POS Tagging, Readability, and Lexical Diversity


If you feed a tagged text object to this function, its text will be transformed into a format used for C-Tests:

the first and last sentence are left untouched (unless the start and end values of the intact parameter are changed)
in all other sentences, the second half of every 2nd word (or as specified by every) is replaced by a line
words must have at least min.length characters, otherwise they are skipped
in words with an uneven number of characters, the replacement starts after the middle character, i.e., a word with five characters keeps its first three and has the last two replaced
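
The word-level rule described above can be sketched in base R. This is only an illustration, not the package's implementation: `blank_second_half()` is a hypothetical helper, and it assumes that each removed letter is replaced by one copy of `replace.by`.

```r
# Hypothetical helper, not part of koRpus: blanks the second half of one word.
# Assumption: one replacement character is written per removed letter.
blank_second_half <- function(word, min.length = 3, replace.by = "_") {
  n <- nchar(word)
  # words shorter than min.length are skipped
  if (n < min.length) {
    return(word)
  }
  # with an uneven number of characters, the middle character is kept
  keep <- ceiling(n / 2)
  paste0(substr(word, 1, keep), strrep(replace.by, n - keep))
}

blank_second_half("words")  # five characters: first three kept
blank_second_half("text")   # four characters: first two kept
blank_second_half("is")     # shorter than min.length: unchanged
```

With the defaults, the three calls yield "wor__", "te__" and "is".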

cTest(obj, ...)

## S4 method for signature 'kRp.text'
cTest(
  obj,
  every = 2,
  min.length = 3,
  intact = c(start = 1, end = 1),
  replace.by = "_"
)

`obj`	An object of class `kRp.text`.
`...`	Additional arguments to the method (as described in this document).
`every`	Integer numeric, setting the frequency of words to be manipulated. By default, every other word is transformed.
`min.length`	Integer numeric, setting the minimum length of words to be considered (in letters).
`intact`	Named vector with the elements `start` and `end`. Both must be integer values and define how many sentences are left untouched, counted from the beginning and the end of the text, respectively. By default, the first and the last sentence are left untouched.
`replace.by`	Character, will be used as the replacement for the removed word halves.
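
To illustrate how `intact` narrows down the sentences that can be transformed, here is a minimal base R sketch. `transformable_sentences()` is a hypothetical helper and not part of koRpus; it only reflects the description of `start` and `end` given above.

```r
# Hypothetical helper, not part of koRpus: returns the indices of the
# sentences that remain open to transformation for a given `intact` setting.
transformable_sentences <- function(n_sentences, intact = c(start = 1, end = 1)) {
  first <- intact[["start"]] + 1
  last <- n_sentences - intact[["end"]]
  # nothing is left to transform if the intact ranges cover the whole text
  if (first > last) {
    return(integer(0))
  }
  seq(first, last)
}

# a text with five sentences and the default setting
transformable_sentences(5)
# keep the first two sentences and the last one intact
transformable_sentences(5, intact = c(start = 2, end = 1))
```

Under these assumptions, the default leaves sentences 2 to 4 open to transformation, and the second call leaves sentences 3 and 4.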

An object of class kRp.text with the added feature diff.

# code is only run when the English language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  tokenized.obj <- cTest(tokenized.obj)
  pasteText(tokenized.obj)

  # diff stats are now part of the object
  hasFeature(tokenized.obj)
  diffText(tokenized.obj)
} else {}

koRpus documentation built on May 18, 2021, 1:13 a.m.
