read.corp.custom: Apply read.corp.custom() to all texts in kRp.corpus objects
In tm.plugin.koRpus: Full Corpus Support for the 'koRpus' Package

This method calls read.corp.custom on all tagged text objects inside the given corpus object.

## S4 method for signature 'kRp.corpus'
read.corp.custom(corpus, caseSens = TRUE, log.base = 10,
      keep_dtm = FALSE, ...)

`corpus`	An object of class `kRp.corpus`.
`caseSens`	Logical. If `FALSE`, all tokens will be matched in their lower case form.
`log.base`	A numeric value defining the base of the logarithm used for inverse document frequency (idf). See `log` for details.
`keep_dtm`	Logical. If `TRUE` and `corpus` does not yet provide a `document term matrix`, the one generated during calculation will be added to the resulting object.
`...`	Options to pass through to the `read.corp.custom` method for objects of the class union `kRp.text`.
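
To make the roles of `caseSens` and `log.base` concrete, here is a minimal base R sketch. It does not use the package's own code. The helpers `toy_dtm()` and `toy_idf()` and the toy documents are hypothetical, and the idf formula shown (logarithm of the number of documents divided by the number of documents containing the term) is the common textbook definition, which may differ in detail from what `read.corp.custom` computes.

```r
# Toy tokenized documents (illustrative only, not the package's example corpus)
docs <- list(
  doc1 = c("The", "cat", "sat"),
  doc2 = c("the", "dog", "sat"),
  doc3 = c("A", "cat", "ran")
)

# Hypothetical helper: build a documents-by-terms count matrix
toy_dtm <- function(docs, caseSens = TRUE) {
  if (!caseSens) {
    # match all tokens in their lower case form
    docs <- lapply(docs, tolower)
  }
  terms <- sort(unique(unlist(docs)))
  dtm <- t(vapply(
    docs,
    function(tokens) as.integer(table(factor(tokens, levels = terms))),
    integer(length(terms))
  ))
  colnames(dtm) <- terms
  dtm
}

# Hypothetical helper: textbook idf, log(N / document frequency)
toy_idf <- function(dtm, log.base = 10) {
  doc_freq <- colSums(dtm > 0)
  log(nrow(dtm) / doc_freq, base = log.base)
}

dtm_cs <- toy_dtm(docs, caseSens = TRUE)   # "The" and "the" are separate terms
dtm_ci <- toy_dtm(docs, caseSens = FALSE)  # both are counted as "the"

idf_cs <- toy_idf(dtm_cs, log.base = 10)
idf_ci <- toy_idf(dtm_ci, log.base = 10)
idf_ci_log2 <- toy_idf(dtm_ci, log.base = 2)
```

In this sketch, turning case sensitivity off merges "The" and "the" into one column, which raises that term's document frequency and lowers its idf. Changing `log.base` only rescales the idf values.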

Because the analysis is based on a document term matrix, an existing matrix stored as a feature of the corpus object is used if it matches the case sensitivity setting. Otherwise a new matrix is generated, but it does not replace the existing one. If the corpus has no document term matrix yet, one is generated and, with `keep_dtm = TRUE`, kept as an additional feature of the resulting object.
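
The decision logic described above can be sketched in base R as follows. This is an illustration only: the corpus is represented by a plain list rather than a `kRp.corpus` object, and `choose_dtm()`, the `dtm` element, and its `caseSens` field are hypothetical names, not part of the package.

```r
# Hypothetical stand-in: decide which document term matrix an analysis would use
choose_dtm <- function(corpus, caseSens = TRUE, keep_dtm = FALSE) {
  existing <- corpus$dtm
  if (!is.null(existing) && identical(existing$caseSens, caseSens)) {
    # an existing matrix with matching case sensitivity is reused
    used <- existing
    origin <- "existing"
  } else {
    # placeholder for a newly generated matrix
    used <- list(caseSens = caseSens, matrix = "<newly generated>")
    origin <- "generated"
    if (is.null(existing) && keep_dtm) {
      # only stored when the corpus had no matrix before;
      # a mismatching existing matrix is never replaced
      corpus$dtm <- used
    }
  }
  list(corpus = corpus, used = used, origin = origin)
}

# Corpus with a case-sensitive matrix already attached
with_dtm <- list(dtm = list(caseSens = TRUE, matrix = "<existing>"))
reuse    <- choose_dtm(with_dtm, caseSens = TRUE)
mismatch <- choose_dtm(with_dtm, caseSens = FALSE, keep_dtm = TRUE)

# Corpus without any matrix yet
without_dtm <- list()
kept    <- choose_dtm(without_dtm, caseSens = TRUE, keep_dtm = TRUE)
dropped <- choose_dtm(without_dtm, caseSens = TRUE, keep_dtm = FALSE)
```

In the mismatch case the new matrix is used for the calculation, but the stored one stays untouched, even with `keep_dtm = TRUE`.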

An object of the same class as corpus.

# use readCorpus() to create an object of class kRp.corpus
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  myCorpus <- readCorpus(
    dir=file.path(
      path.package("tm.plugin.koRpus"), "examples", "corpus", "Edwards"
    ),
    hierarchy=list(
      Source=c(
        Wikipedia_prev="Wikipedia (old)",
        Wikipedia_new="Wikipedia (new)"
      )
    ),
    # use tokenize() so examples run without a TreeTagger installation
    tagger="tokenize",
    lang="en"
  )

  myCorpus <- read.corp.custom(myCorpus)
  corpusCorpFreq(myCorpus)
} else {}