Read data from a custom corpus into a valid object of class kRp.corp.freq.
read.corp.custom(corpus, caseSens = TRUE, log.base = 10, ...)

## S4 method for signature 'kRp.text'
read.corp.custom(
  corpus,
  caseSens = TRUE,
  log.base = 10,
  dtm = docTermMatrix(obj = corpus, case.sens = caseSens),
  as.feature = FALSE
)
corpus | An object of class
caseSens | Logical. If
log.base | A numeric value defining the base of the logarithm used for inverse document frequency (idf). See
... | Additional options for methods of the generic.
dtm | A document term matrix of the
as.feature | Logical, whether the output should be just the analysis results or the input object with the results added as a feature. Use
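To illustrate what log.base changes, here is a minimal base-R sketch of the textbook inverse document frequency, idf = log(N / df), computed on a tiny hand-made document term matrix. The matrix toy_dtm and the helper toy_idf() are hypothetical names used only for illustration; the exact formula used internally by read.corp.custom() is not shown on this page and may differ.

```r
# hypothetical toy document term matrix: 3 documents (rows), 4 terms (columns)
toy_dtm <- matrix(
  c(2, 0, 1, 1,
    0, 1, 1, 0,
    1, 0, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("doc", 1:3), c("the", "cat", "a", "dog"))
)

# textbook idf (an assumption, not necessarily koRpus' internal formula):
# log of (number of documents / number of documents containing the term)
toy_idf <- function(dtm, log.base = 10) {
  n_docs <- nrow(dtm)
  doc_freq <- colSums(dtm > 0)
  log(n_docs / doc_freq, base = log.base)
}

idf_10 <- toy_idf(toy_dtm, log.base = 10)
idf_2 <- toy_idf(toy_dtm, log.base = 2)
```

Under this definition, changing the base only rescales all idf values by a constant factor, so the ranking of terms stays the same. A term that occurs in every document (here "a") gets an idf of 0 regardless of the base.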
These methods enable you to perform a basic frequency analysis of a text corpus. That is, rather than only importing existing analysis results such as LCC files, they import the corpus material itself. The resulting object is of class kRp.corp.freq, so it can be used for frequency analysis by other functions and methods of this package.
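As a rough picture of what a basic corpus frequency analysis involves, the following base-R sketch counts how often each token occurs in a small token vector. It is not the package's implementation: toy_tokens and toy_freq() are hypothetical, and the handling of caseSens (lower-casing all tokens when it is FALSE) is an assumption made for this illustration.

```r
# hypothetical token vector standing in for a tokenized text
toy_tokens <- c("The", "cat", "saw", "the", "dog", ".", "The", "dog", "ran", ".")

# count how often each distinct token occurs, optionally ignoring case
toy_freq <- function(tokens, caseSens = TRUE) {
  if (!caseSens) {
    # assumption: case-insensitive counting compares lower-cased tokens
    tokens <- tolower(tokens)
  }
  counts <- sort(table(tokens), decreasing = TRUE)
  data.frame(
    word = names(counts),
    freq = as.integer(counts),
    stringsAsFactors = FALSE
  )
}

freq_cs <- toy_freq(toy_tokens, caseSens = TRUE)   # "The" and "the" counted separately
freq_ci <- toy_freq(toy_tokens, caseSens = FALSE)  # "The" and "the" counted together
```

In the case-sensitive table, "The" and "the" are separate entries, while the case-insensitive table merges them into one. An object of class kRp.corp.freq holds more than such raw counts, so treat this only as a conceptual sketch.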
Depending on as.feature, either an object of class kRp.corp.freq (as.feature = FALSE, the default), or an object of class kRp.text with the added feature corp_freq containing it (as.feature = TRUE).
# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  # call read.corp.custom() on a tokenized text
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )
  # if you call read.corp.custom() without setting as.feature
  # (it defaults to FALSE), you will get its results directly
  en_corp <- read.corp.custom(
    tokenized.obj,
    caseSens=FALSE
  )
  # alternatively, you can also store those results as a
  # feature in the object itself
  tokenized.obj <- read.corp.custom(
    tokenized.obj,
    caseSens=FALSE,
    as.feature=TRUE
  )
  # results are now part of the object
  hasFeature(tokenized.obj)
  corpusCorpFreq(tokenized.obj)
} else {}