| kwic-class | R Documentation | 
S4 class for organizing information for kwic/concordance output. A set of
standard generics (show, as.character, as.data.frame,
length, sample, subset) as well as indexing is implemented to process
kwic class objects (see 'Usage'). See section 'Details' for the
enrich, view and knit_print methods.
## S4 method for signature 'kwic'
get_corpus(x)
## S4 method for signature 'kwic'
count(.Object, p_attribute = "word")
## S4 method for signature 'kwic'
as.DocumentTermMatrix(x, p_attribute, verbose = TRUE, ...)
## S4 method for signature 'kwic'
as.TermDocumentMatrix(x, p_attribute, verbose = TRUE, ...)
## S4 method for signature 'kwic'
show(object)
## S4 method for signature 'kwic'
knit_print(x, options = knitr::opts_chunk)
## S4 method for signature 'kwic'
as.character(x, fmt = "<i>%s</i>")
## S4 method for signature 'kwic,ANY,ANY,ANY'
x[i]
## S4 method for signature 'kwic'
subset(x, ...)
## S4 method for signature 'kwic'
as.data.frame(x)
## S4 method for signature 'kwic'
length(x)
## S4 method for signature 'kwic'
sample(x, size)
## S4 method for signature 'kwic_bundle'
merge(x)
## S4 method for signature 'kwic'
enrich(.Object, s_attributes = NULL, extra = NULL, table = FALSE, ...)
## S4 method for signature 'kwic'
format(
  x,
  node_color = "blue",
  align = TRUE,
  extra_color = "grey",
  lineview = getOption("polmineR.lineview")
)
## S4 method for signature 'kwic'
view(.Object)
| x | A  | 
| .Object | A  | 
| p_attribute | A length-one  | 
| verbose | A  | 
| ... | Used for backwards compatibility. | 
| object | A  | 
| options | Chunk options. | 
| fmt | A format string passed into  | 
| i | Single integer value, the kwic line for which the fulltext shall be inspected. | 
| size | An  | 
| s_attributes | Character vector of s-attributes with metainformation. | 
| extra | An  | 
| table | Logical, whether to turn cpos  | 
| node_color | If not  | 
| align | A  | 
| extra_color | If extra context has been generated using  | 
| lineview | A  | 
Applying the count-method on a kwic object will return
a count object with the evaluation of the left and right context of
the match.
The knit_print() method will be called by knitr to render
kwic objects as a DataTable  htmlwidget when rendering a R
Markdown document as html. It will usually be necessary to explicitly state
"render = knit_print" in the chunk options. The option
polmineR.pagelength controls the number of lines displayed in the
resulting htmlwidget. Note that including htmlwidgets in html documents
requires that pandoc is installed. To avoid an error, a formatted
data.table is returned by knit_print() if pandoc is not
available.
The as.character-method will return a list of character vectors,
concatenating the columns "left", "node" and "right" of the data.table in
the stat-slot of the input kwic-class object. Optionally, the node can
be formatted using a format string that is passed into sprintf.
The subset-method will apply subset to the table in the slot
stat, e.g. for filtering query results based on metadata (i.e.
s-attributes) that need to be present.
The enrich method is used to generate the actual output for
the kwic method. If param table is TRUE, corpus positions
will be turned into a data.frame with the concordance lines. If
param s_attributes is a character vector with s-attributes, the
respective s-attributes will be added as columns to the table with
concordance lines.
The format-method will return a data.table that can
serve as input for rendering a htmlwidget, for instance using
DT::datatable or rhandsontable::rhandsontable. It will
include html tags, so ensure that the rendering engine does not obfuscate
the html.
metadataA character vector with s-attributes of the metadata
that are to be displayed.
p_attributeThe p-attribute for which the context has been generated.
leftAn integer value, words to the left of the query match.
rightAn integer value, words to the right of the query match.
corpusLength-one character vector, the CWB corpus.
cposA data.table with the columns "match_id", "cpos", "position",
"word_id", "word" and "direction".
statA data.table, a table with columns "left", "node",
"right", and metadata, if the object has been enriched.
encodingA length-one character vector with the encoding of the
corpus.
nameA length-one character vector naming the object.
annotation_colsA character vector designating the columns of
the data.table in the slot table that are annotations.
The constructor for generating kwic objects is the
kwic method.
use("polmineR")
K <- kwic("GERMAPARLMINI", "Integration")
get_corpus(K)
length(K)
K_min <- K[1]
K_min <- K[1:5]
# using kwic_bundle class
queries <- c("oil", "prices", "barrel")
li <- lapply(queries, function(q) kwic("REUTERS", query = q))
kb <- as.bundle(li)
# use count-method on kwic object
coi <- kwic("REUTERS", query = "oil") %>%
  count(p_attribute = "word")
# features vs cooccurrences-method (identical results)
ref <- count("REUTERS", p_attribute = "word") %>%
  subset(word != "oil")
slot(ref, "size") <- slot(ref, "size") - count("REUTERS", "oil")[["count"]]
y_features <- features(coi, ref, method = "ll", included = TRUE)
y_cooc <- cooccurrences("REUTERS", query = "oil")
# extract node and left and right context as character vectors
oil <- kwic("REUTERS", query = "oil")
as.character(oil, fmt = NULL)
as.character(oil) # node wrapped into <i> tag by default
as.character(oil, fmt = "<b>%s</b>")
# subsetting kwic objects
oil <- corpus("REUTERS") %>%
  kwic(query = "oil") %>%
  subset(grepl("prices", right))
saudi_arabia <- corpus("REUTERS") %>%
  kwic(query = "Arabia") %>%
  subset(grepl("Saudi", left))
int_spd <- corpus("GERMAPARLMINI") %>%
  kwic(query = "Integration") %>%
  enrich(s_attribute = "party") %>%
  subset(grepl("SPD", party))
# turn kwic object into data.frame with html tags
int <- corpus("GERMAPARLMINI") %>%
  kwic(query = "Integration")
as.data.frame(int) # Without further metadata
enrich(int, s_attributes = c("date", "speaker", "party")) %>%
  as.data.frame()
  
# merge bundle of kwic objects into one kwic
reuters <- corpus("REUTERS")
queries <- c('"Saudi" "Arabia"', "oil", '"barrel.*"')
comb <- lapply(queries, function(qu) kwic(reuters, query = qu)) %>%
  as.bundle() %>%
  merge()
 
# enrich kwic object
i <- corpus("GERMAPARLMINI") %>%
  kwic(query = "Integration") %>%
  enrich(s_attributes = c("date", "speaker", "party"))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.