correct-methods: Methods to correct koRpus objects


Description

The method correct.tag can be used to alter objects of class kRp.text.

Usage

correct.tag(
  obj,
  row,
  tag = NULL,
  lemma = NULL,
  check.token = NULL,
  quiet = TRUE
)

## S4 method for signature 'kRp.text'
correct.tag(
  obj,
  row,
  tag = NULL,
  lemma = NULL,
  check.token = NULL,
  quiet = TRUE
)

Arguments

obj

An object of class kRp.text.

row

Integer, the row number of the entry to be changed. Can be an integer vector to change several rows in one go.

tag

A character string with a valid POS tag to replace the current tag entry. If NULL (the default) the entry remains unchanged.

lemma

A character string naming the lemma to replace the current lemma entry. If NULL (the default) the entry remains unchanged.

check.token

A character string naming the token you expect to be in this row. If not NULL, correct.tag will stop with an error if the values don't match.

quiet

If FALSE, messages about all applied changes are shown.

Details

Although automatic POS tagging and lemmatization are remarkably accurate, the algorithms usually do produce some errors. If you want to correct these flaws, this method can help, because it reduces the risk of introducing new errors: it runs some sanity checks before the object is actually manipulated and returned.

correct.tag will read the lang slot from the given object and check whether the tag provided is actually valid. If so, it will not only change the tag field in the object, but also update wclass and desc accordingly.

If check.token is set, it must also match the token in the given row(s); otherwise the call stops with an error. Note that no check is done on the lemmata.
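The checks described above can be illustrated with a small base R sketch. This is not the koRpus implementation: correct_tag_sketch, toy_tags and the toy token table are hypothetical names invented for illustration, and the real method works on kRp.text objects and takes its valid tags from the language package.

```r
# Toy tag table (assumption); koRpus looks up valid tags for the object's language.
toy_tags <- data.frame(
  tag    = c("NN", "VB", "JJ"),
  wclass = c("noun", "verb", "adjective"),
  desc   = c("Noun, singular or mass", "Verb, base form", "Adjective"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of koRpus: validate first, then modify.
correct_tag_sketch <- function(tokens, row, tag = NULL, lemma = NULL,
                               check.token = NULL, valid = toy_tags) {
  # stop if the expected token is not found in the given row(s)
  if (!is.null(check.token) && !all(tokens$token[row] == check.token)) {
    stop("token mismatch in row(s) ", paste(row, collapse = ", "))
  }
  if (!is.null(tag)) {
    hit <- match(tag, valid$tag)
    if (is.na(hit)) stop("invalid tag: ", tag)
    # update the tag together with its word class and description
    tokens$tag[row]    <- tag
    tokens$wclass[row] <- valid$wclass[hit]
    tokens$desc[row]   <- valid$desc[hit]
  }
  # lemmata are replaced without any validity check
  if (!is.null(lemma)) tokens$lemma[row] <- lemma
  tokens
}

tokens <- data.frame(
  token  = c("The", "run", "ended"),
  tag    = c("DT", "VB", "VBD"),
  lemma  = c("the", "run", "end"),
  wclass = c("determiner", "verb", "verb"),
  desc   = c("Determiner", "Verb, base form", "Verb, past tense"),
  stringsAsFactors = FALSE
)

# "run" was tagged as a verb; retag it as a noun, guarding against a wrong row
fixed <- correct_tag_sketch(tokens, row = 2, tag = "NN", check.token = "run")
```

Passing an unknown tag or a check.token that does not match the row stops with an error before anything is changed, which is the property that makes manual corrections safer.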

Value

An object of the same class as obj.

See Also

kRp.text, treetag, kRp.POS.tags.

Examples

# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  tokenized.obj <- correct.tag(tokenized.obj, row=6, tag="NN")
} else {}
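A further sketch, under the same assumptions as the example above, shows check.token and quiet = FALSE in use. The expected token is read from the object itself via taggedText(), so no token value is guessed here; in practice you would pass the token you expect to find in that row.

```r
# code is only run when the english language package can be loaded
if (require("koRpus.lang.en", quietly = TRUE)) {
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(txt = sample_file, lang = "en")

  # look up the token currently stored in row 6
  expected <- as.character(taggedText(tokenized.obj)[6, "token"])

  # the call stops with an error if row 6 does not hold the expected token;
  # quiet = FALSE reports the applied change
  corrected <- correct.tag(
    tokenized.obj,
    row = 6,
    tag = "NN",
    check.token = expected,
    quiet = FALSE
  )

  # inspect the changed row
  taggedText(corrected)[6, c("token", "tag", "wclass", "desc")]
}
```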

koRpus documentation built on May 18, 2021, 1:13 a.m.