cTest-methods: Transform text into C-Test-like format


Description

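If you feed a tagged text object to this function, its text will be transformed into a format used for C-Tests. With the default settings, the first and last sentences are left untouched, and in all other sentences the second half of every second word with at least three letters is replaced by underscores.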

Usage

cTest(obj, ...)

## S4 method for signature 'kRp.text'
cTest(
  obj,
  every = 2,
  min.length = 3,
  intact = c(start = 1, end = 1),
  replace.by = "_"
)

Arguments

obj

An object of class kRp.text.

...

Additional arguments to the method (as described in this document).

every

Integer numeric, setting the frequency of words to be manipulated. By default, every other word is transformed.

min.length

Integer numeric, sets the minimum length of words to be considered (in letters).

intact

Named vector with the elements start and end. Both must be integer values and define how many sentences are to be left untouched, counted from the beginning and the end of the text, respectively. The default is to leave the first and last sentence intact.

replace.by

Character, will be used as the replacement for the removed word halves.
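
The interplay of these arguments can be illustrated with a small base R sketch. It is not the koRpus implementation: the helpers blank_word_halves() and editable_sentences() are hypothetical names made up for this illustration, they work on plain character vectors instead of kRp.text objects, and two details are assumptions, namely that words shorter than min.length are skipped when counting and that words with an odd number of letters keep the extra letter.

```r
# Hypothetical helper, not part of koRpus: replace the second half of
# selected words in a character vector.
blank_word_halves <- function(words, every = 2, min.length = 3, replace.by = "_") {
  # only words with at least min.length characters are candidates
  candidates <- which(nchar(words) >= min.length)
  # assumption: counting runs over the candidates only, so short words
  # do not use up a slot
  chosen <- candidates[seq_along(candidates) %% every == 0]
  for (i in chosen) {
    n <- nchar(words[i])
    # assumption: odd-length words keep the larger first part
    keep <- ceiling(n / 2)
    words[i] <- paste0(substr(words[i], 1, keep), strrep(replace.by, n - keep))
  }
  words
}

# Hypothetical helper: indices of the sentences that may be changed,
# given how many sentences stay intact at the start and at the end.
editable_sentences <- function(n.sentences, intact = c(start = 1, end = 1)) {
  first <- intact[["start"]] + 1
  last <- n.sentences - intact[["end"]]
  if (first > last) integer(0) else seq(first, last)
}

words <- c("The", "quick", "brown", "fox", "jumps", "over", "a", "lazy", "dog")
blank_word_halves(words)
# "The" "qui__" "brown" "fo_" "jumps" "ov__" "a" "lazy" "do_"

editable_sentences(5)
# 2 3 4
```

In a text of five sentences, the defaults for intact would therefore leave sentences 1 and 5 unchanged, and only sentences 2 to 4 would be passed on to the word-level step.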

Value

An object of class kRp.text with the added feature diff.

Examples

# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  tokenized.obj <- cTest(tokenized.obj)
  pasteText(tokenized.obj)

  # diff stats are now part of the object
  hasFeature(tokenized.obj)
  diffText(tokenized.obj)
} else {}

koRpus documentation built on May 18, 2021, 1:13 a.m.