kRp.text_get-methods: Getter/setter methods for koRpus objects

Description

These methods should be used to get or set values of tagged text objects generated by koRpus functions like treetag or tokenize.

Usage

taggedText(obj, add.desc = FALSE, doc_id = FALSE)

## S4 method for signature 'kRp.text'
taggedText(obj, add.desc = FALSE, doc_id = FALSE)

taggedText(obj) <- value

## S4 replacement method for signature 'kRp.text'
taggedText(obj) <- value

doc_id(obj, ...)

## S4 method for signature 'kRp.text'
doc_id(obj, has_id = NULL)

hasFeature(obj, feature = NULL, ...)

## S4 method for signature 'kRp.text'
hasFeature(obj, feature = NULL)

hasFeature(obj, feature) <- value

## S4 replacement method for signature 'kRp.text'
hasFeature(obj, feature) <- value

feature(obj, feature, ...)

## S4 method for signature 'kRp.text'
feature(obj, feature, doc_id = NULL)

feature(obj, feature) <- value

## S4 replacement method for signature 'kRp.text'
feature(obj, feature) <- value

corpusReadability(obj, ...)

## S4 method for signature 'kRp.text'
corpusReadability(obj, doc_id = NULL)

corpusReadability(obj) <- value

## S4 replacement method for signature 'kRp.text'
corpusReadability(obj) <- value

corpusHyphen(obj, ...)

## S4 method for signature 'kRp.text'
corpusHyphen(obj, doc_id = NULL)

corpusHyphen(obj) <- value

## S4 replacement method for signature 'kRp.text'
corpusHyphen(obj) <- value

corpusLexDiv(obj, ...)

## S4 method for signature 'kRp.text'
corpusLexDiv(obj, doc_id = NULL)

corpusLexDiv(obj) <- value

## S4 replacement method for signature 'kRp.text'
corpusLexDiv(obj) <- value

corpusFreq(obj, ...)

## S4 method for signature 'kRp.text'
corpusFreq(obj)

corpusFreq(obj) <- value

## S4 replacement method for signature 'kRp.text'
corpusFreq(obj) <- value

corpusCorpFreq(obj, ...)

## S4 method for signature 'kRp.text'
corpusCorpFreq(obj)

corpusCorpFreq(obj) <- value

## S4 replacement method for signature 'kRp.text'
corpusCorpFreq(obj) <- value

corpusStopwords(obj, ...)

## S4 method for signature 'kRp.text'
corpusStopwords(obj)

corpusStopwords(obj) <- value

## S4 replacement method for signature 'kRp.text'
corpusStopwords(obj) <- value

## S4 method for signature 'kRp.text,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

## S4 replacement method for signature 'kRp.text,ANY,ANY,ANY'
x[i, j, ...] <- value

## S4 method for signature 'kRp.text'
x[[i, doc_id = NULL, ...]]

## S4 replacement method for signature 'kRp.text'
x[[i, doc_id = NULL, ...]] <- value

## S4 method for signature 'kRp.text'
describe(obj, doc_id = NULL, simplify = TRUE, ...)

## S4 replacement method for signature 'kRp.text'
describe(obj, doc_id = NULL, ...) <- value

## S4 method for signature 'kRp.text'
language(obj)

## S4 replacement method for signature 'kRp.text'
language(obj) <- value

diffText(obj, doc_id = NULL)

## S4 method for signature 'kRp.text'
diffText(obj, doc_id = NULL)

diffText(obj) <- value

## S4 replacement method for signature 'kRp.text'
diffText(obj) <- value

originalText(obj)

## S4 method for signature 'kRp.text'
originalText(obj)

is.taggedText(obj)

is.kRp.text(obj)

fixObject(obj, doc_id = NA)

## S4 method for signature 'kRp.text'
fixObject(obj, doc_id = NA)

tif_as_tokens_df(tokens)

## S4 method for signature 'kRp.text'
tif_as_tokens_df(tokens)

## S4 method for signature 'kRp.tagged'
fixObject(obj, doc_id = NA)

## S4 method for signature 'kRp.txt.freq'
fixObject(obj, doc_id = NA)

## S4 method for signature 'kRp.txt.trans'
fixObject(obj, doc_id = NA)

## S4 method for signature 'kRp.analysis'
fixObject(obj, doc_id = NA)

Arguments

obj

An arbitrary R object.

add.desc

Logical, determines whether the desc column should be re-written with descriptions for all POS tags.

doc_id

Logical (except for fixObject, feature, and [[/[[<-), if TRUE the doc_id column will be a factor with the respective value of the desc slot, i.e., the document ID will be preserved in the data.frame. If used with fixObject, can be a character string to set the document ID manually (the default NA will preserve existing values and not overwrite them). If used with feature or [[/[[<-, a character vector to limit the scope to one or more particular document IDs.

value

The new value to replace the current one.

...

Additional arguments for the generics.

has_id

A character vector with doc_ids to look for in the object. The return value is then a logical vector of the same length, indicating if the respective id was found or not.

feature

Character string naming the feature to look for. The return value is logical if a single feature name is given. If feature=NULL, a character vector is returned, naming all features found in the object.

x

An object of class kRp.text or kRp.hyphen.

i

Defines the row selector ([) or the name to match ([[).

j

Defines the column selector.

drop

Logical, whether the result should be coerced to the lowest possible dimension. See [ for more details.

simplify

Logical, if TRUE and the result is a list of length one (i.e., just a single doc_id), returns the contents of the single list entry.

tokens

An object of class kRp.text.
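
The has_id, feature and simplify arguments described above can be illustrated with a minimal sketch. It assumes the koRpus.lang.en package is installed and reuses the sample file from the Examples section; the ID "no_such_document" is made up for illustration, and "hyphen" is only an assumed feature name that may or may not be present in a given object.

```r
# Sketch only: runs when the English language package can be loaded,
# mirroring the guard used in the Examples section.
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(txt = sample_file, lang = "en")

  # document IDs actually present in the object
  known_ids <- as.character(doc_id(tokenized.obj))

  # has_id: one logical value per queried ID;
  # "no_such_document" is a hypothetical ID used for illustration
  id_found <- doc_id(
    tokenized.obj,
    has_id = c(known_ids, "no_such_document")
  )

  # feature = NULL (the default) names all features found in the object
  all_features <- hasFeature(tokenized.obj)

  # a single feature name yields a logical value;
  # "hyphen" is an assumed feature name
  has_hyphen <- hasFeature(tokenized.obj, feature = "hyphen")

  # simplify = FALSE asks describe() not to unwrap a single-document result
  desc_list <- describe(tokenized.obj, simplify = FALSE)
} else {}
```

Per the argument descriptions, id_found should be a logical vector with one entry per queried ID, while all_features is a character vector that can be empty for a freshly tokenized object.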

Details

References

[1] Text Interchange Formats (https://github.com/ropensci/tif)

Examples

# code is only run when the english language package can be loaded
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(
    txt=sample_file,
    lang="en"
  )

  doc_id(tokenized.obj)

  describe(tokenized.obj)

  language(tokenized.obj)

  taggedText(tokenized.obj)
  tokenized.obj[["token"]]
  tokenized.obj[1:3, "token"]

  tif_as_tokens_df(tokenized.obj)

  # example for originalText()
  tokenized.obj <- jumbleWords(tokenized.obj)
  # now compare the jumbled words to the original
  tokenized.obj[["token"]]
  originalText(tokenized.obj)[["token"]]
} else {}
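
The examples above only read from the object. As a hedged sketch of the replacement side listed under Usage, the following extracts the token data frame and writes it back unchanged. It assumes that re-assigning an unmodified data frame is accepted by the replacement method; real use would modify tokens_df before writing it back.

```r
# Sketch only: runs when the English language package can be loaded.
if(require("koRpus.lang.en", quietly = TRUE)){
  sample_file <- file.path(
    path.package("koRpus"), "examples", "corpus", "Reality_Winner.txt"
  )
  tokenized.obj <- tokenize(txt = sample_file, lang = "en")

  # check that we are dealing with a tagged text object
  is_tagged <- is.taggedText(tokenized.obj)

  # getter: the tagged tokens as a data frame
  tokens_df <- taggedText(tokenized.obj)

  # getter with doc_id = TRUE: preserve the document ID in the doc_id column
  tokens_with_id <- taggedText(tokenized.obj, doc_id = TRUE)

  # setter: write the (here unmodified) data frame back into the object
  taggedText(tokenized.obj) <- tokens_df

  # the language getter and setter follow the same pattern
  current_lang <- language(tokenized.obj)
  language(tokenized.obj) <- current_lang
} else {}
```

Going through taggedText() and language() rather than accessing slots directly follows the Description, which recommends these methods for getting and setting values.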

koRpus documentation built on May 18, 2021, 1:13 a.m.