These classes are no longer used by the koRpus package and will be removed in a later version.
They are kept here for the time being so you can still load old objects and convert them into new objects using the fixObject method.
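The conversion described above can be sketched roughly as follows. This is a minimal sketch, not the package's documented workflow: the file name is hypothetical, and it assumes the old object was stored with save() and that fixObject can be called with the old object as its only argument.

```r
# Hypothetical file name; replace it with the path to your own saved object.
old_file <- "old_tagged_object.RData"

# Only attempt the conversion if koRpus is installed and the file exists.
if (requireNamespace("koRpus", quietly = TRUE) && file.exists(old_file)) {
  library(koRpus)
  loaded <- load(old_file)       # load() returns the names of the restored objects
  old_obj <- get(loaded[1])      # assumption: the first restored object is the old one
  new_obj <- fixObject(old_obj)  # convert the deprecated object to the new class
  print(class(new_obj))
}
```

The guard keeps the snippet harmless on systems where koRpus or the file is missing.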
These functions will be removed soon and should no longer be used.
...
Parameters to be passed to the replacement of the function.
lang
A character string, naming the language that is assumed for the tokenized text in this object.
desc
Descriptive statistics of the tagged text.
TT.res
Results of the called tokenizer and POS tagger. The data.frame usually has eleven columns:
doc_id: Factor, optional document identifier.
token: Character, the tokenized text.
tag: Factor, POS tags for each token.
lemma: Character, lemma for each token.
lttr: Integer, number of letters.
wclass: Factor, word class.
desc: Factor, a short description of the POS tag.
stop: Logical, TRUE if token is a stopword.
stem: Character, stemmed token.
idx: Integer, index number of token in this document.
sntc: Integer, number of sentence in this document.
This data.frame structure adheres to the "Text Interchange Formats" guidelines set out by rOpenSci[1].
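To make the column layout concrete, here is a minimal base R sketch of a data.frame with the eleven columns and types listed above. All values (tokens, tags, word classes, descriptions) are invented for illustration and are not output of treetag or tokenize.

```r
# Illustrative values only; a real TT.res is produced by the tokenizer/tagger.
tokens <- c("This", "is", "fine", ".")

tt_res <- data.frame(
  doc_id = factor(rep("doc1", length(tokens))),          # optional document identifier
  token  = tokens,                                       # the tokenized text
  tag    = factor(c("DT", "VBZ", "JJ", "SENT")),         # POS tag per token (made up)
  lemma  = c("this", "be", "fine", "."),                 # lemma per token
  lttr   = c(4L, 2L, 4L, 1L),                            # illustrative letter counts
  wclass = factor(c("determiner", "verb", "adjective", "fullstop")),
  desc   = factor(c("Determiner", "Verb", "Adjective", "Sentence ending")),
  stop   = c(TRUE, TRUE, FALSE, FALSE),                  # TRUE if token is a stopword
  stem   = c("this", "is", "fine", "."),                 # stemmed token
  idx    = seq_along(tokens),                            # token index in the document
  sntc   = rep(1L, length(tokens)),                      # sentence number in the document
  stringsAsFactors = FALSE
)

str(tt_res)
```

Calling str() shows one row per token, with the factor, character, integer, and logical column types described in the list.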
freq.analysis
A list with information on the word frequencies of the analyzed text.
diff
A list of mostly atomic vectors, describing the differences between the two text variants (as percentages):
all.tokens: Percentage of all tokens, including punctuation, that were altered.
words: Percentage of altered words only.
all.chars: Percentage of all characters, including punctuation, that were altered.
letters: Percentage of altered letters in words only.
transfmt: Character vector documenting the transformation(s) done to the tokens.
transfmt.equal: Data frame documenting which token was changed in which transformational step. Only available if more than one transformation was done.
transfmt.normalize: A list documenting steps of normalization that were done to the object, one element per transformation. Each entry holds the name of the method, the query parameters, and the effective replacement value.
lex.div
Information on lexical diversity
kRp.tagged
This was used for objects returned by treetag or tokenize.
It was replaced by kRp.text.
kRp.txt.freq
This was used for objects returned by freq.analysis.
It was replaced by kRp.text.
kRp.txt.trans
This was used for objects returned by textTransform, clozeDelete, cTest, and jumbleWords.
It was replaced by kRp.text.
kRp.analysis
This was used for objects returned by kRp.text.analysis.
That function is also deprecated; its functionality can be replicated by combining treetag, freq.analysis, and lex.div.
[1] Text Interchange Formats (https://github.com/ropensci/tif)