knitr::opts_chunk$set(widgetframe_self_contained = FALSE)
knitr::opts_chunk$set(widgetframe_isolate_widgets = TRUE)
knitr::opts_chunk$set(widgetframe_widgets_dir = 'widgets')
# Install helper packages that are not yet available
required_packages <- c("vembedr", "htmltools", "devtools", "widgetframe")
for (pkg in required_packages) {
  if (!pkg %in% rownames(installed.packages())) install.packages(pkg)
}
if (!"annolite" %in% rownames(installed.packages())) devtools::install_github("PolMine/annolite")
The analysis of the context of words and query terms can serve as an analytical link between quantitative counting operations and qualitative-interpretative approaches to content analysis. In linguistics, this analysis of word contexts is called 'concordance analysis'. In the social science tradition of context analysis, the approach is often called Keyword-in-Context analysis (or 'KWIC'). Since this name nicely conveys the central idea of the approach, the method is called kwic in polmineR. In the following, the term 'concordances' is also used for the qualitative examination of the word contexts of query terms.
An important decision to make in advance is how many words to the left and to the right of the query should be displayed. In linguistic analyses (e.g. in lexicographical approaches), a window of five words to the left and to the right is common. How much context do you need for your application? To satisfy specific disciplinary needs, more than five words might be necessary.
Sometimes it will not be enough to read and interpret just the small extract in the context window of a word. If necessary, the full text should be used for validation. This can be achieved with the read() method.
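To make the idea of a context window concrete, the following is a minimal base R sketch of what a keyword-in-context display does. It is not polmineR's implementation: the function name `kwic_sketch` and the example sentence are made up for illustration, and tokenization is simply splitting on spaces.

```r
# Minimal keyword-in-context sketch in base R (illustration only,
# not how polmineR implements kwic()).
kwic_sketch <- function(tokens, query, left = 5L, right = 5L) {
  # Positions of the query term, ignoring case
  hits <- which(tolower(tokens) == tolower(query))
  if (length(hits) == 0L) {
    return(data.frame(left = character(0), node = character(0),
                      right = character(0), stringsAsFactors = FALSE))
  }
  rows <- lapply(hits, function(i) {
    # Up to `left` tokens before the hit, up to `right` tokens after it
    left_idx <- tail(seq_len(i - 1L), left)
    right_idx <- if (i < length(tokens)) (i + 1L):length(tokens) else integer(0)
    right_idx <- head(right_idx, right)
    data.frame(
      left = paste(tokens[left_idx], collapse = " "),
      node = tokens[i],
      right = paste(tokens[right_idx], collapse = " "),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Made-up example sentence
text <- "The committee discussed immigration policy and the impact of immigration on labour markets"
tokens <- strsplit(text, " ", fixed = TRUE)[[1]]
kwic_sketch(tokens, query = "immigration", left = 2L, right = 2L)
```

Widening `left` and `right` shows more context per hit at the cost of longer lines, which is the trade-off discussed above.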
These slides use the polmineR package and the UNGA corpus. The installation is explained in a separate set of slides.
Please note: The functionality explained here is only available in polmineR version 0.7.10 or above. Install a suitable version of the package accordingly.
if (packageVersion("polmineR") < as.package_version("0.7.9.9010")) devtools::install_github("PolMine/polmineR", ref = "dev")
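The check above compares against the development version number 0.7.9.9010, while the text names release 0.7.10. The sketch below, using only base R, illustrates how such version comparisons behave. The helper `needs_update()` is hypothetical and not part of polmineR.

```r
# Version numbers are compared component by component, not as strings
minimum <- package_version("0.7.9.9010")  # threshold used in the check above
release <- package_version("0.7.10")      # release named in the text

# The third component decides: 10 > 9, so the release passes the check
release >= minimum

# Hypothetical helper: TRUE if the installed version is below the minimum
needs_update <- function(installed, minimum) {
  package_version(installed) < package_version(minimum)
}

needs_update("0.7.8", "0.7.9.9010")
needs_update("0.7.10", "0.7.9.9010")
```

A plain string comparison would get this wrong, because "0.7.10" sorts before "0.7.9" alphabetically.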
Load the polmineR package and activate the UNGA corpus.
library(polmineR)
use("UNGA")
The kwic() method: First steps
options("polmineR.pagelength" = 4L)
The kwic() method can be applied to objects of class character (the ID of an entire corpus) as well as to partition and context objects. The query term is defined by the argument query.
kwic("UNGA", query = "immigration")
vembedr::embed_youtube("F4UkFI0aolI", height = 400, width = 600)
options("polmineR.pagelength" = 3L)
In the polmineR package, the kwic() method can be applied not only to corpora but also to partition objects. The creation of partitions is described in another set of slides. Here, we apply the search above ('immigration') to a partition of debates from the year 2005.
unga_2005 <- partition("UNGA", year = 2005)
kwic(unga_2005, query = "immigration")
options("polmineR.pagelength" = 5L)
When a query uses CQP syntax, the argument cqp should be set to TRUE (if it is not explicitly set to TRUE, polmineR checks internally whether CQP syntax is used).
kwic(unga_2005, query = '[pos = "J.*"] "immigration"', cqp = TRUE)
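To clarify what the CQP query above asks for, here is a base R sketch that emulates it on a tiny hand-made token table: a token whose part-of-speech tag matches the regular expression `J.*`, immediately followed by the word "immigration". The token table and its tags are invented for illustration. The anchors `^` and `$` reflect that CQP matches a regular expression against the whole attribute value.

```r
# Invented toy data: one row per token, with a part-of-speech tag
tokens <- data.frame(
  word = c("We", "need", "legal", "immigration", "and", "safe",
           "borders", "not", "illegal", "immigration"),
  pos  = c("PRP", "VBP", "JJ", "NN", "CC", "JJ",
           "NNS", "RB", "JJ", "NN"),
  stringsAsFactors = FALSE
)

# First query position: tag must match J.* as a whole
is_adjective <- grepl("^J.*$", tokens$pos)

# Second query position: the next token must be "immigration"
next_is_query <- c(tokens$word[-1] == "immigration", FALSE)

# A match starts wherever both conditions hold
starts <- which(is_adjective & next_is_query)
matches <- paste(tokens$word[starts], tokens$word[starts + 1L])
matches
```

Note that "safe" is tagged as an adjective but is not a match, because the following token is not "immigration".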
options("polmineR.pagelength" = 4L)
To change the number of words displayed to the left and to the right of the query, use the arguments left and right.
kwic("UNGA", query = "border", left = 15, right = 15)
If left and right are not set explicitly, the values defined in the global options of polmineR are used. The current values can be looked up as follows:
getOption("polmineR.left")
getOption("polmineR.right")
The defaults can be changed for the session as follows:
options(polmineR.left = 10)
options(polmineR.right = 10)
options("polmineR.pagelength" = 5L)
options("polmineR.left" = 5L)
options("polmineR.right" = 5L)
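Global options stay changed until they are reset, which can be surprising later in a session. The following sketch shows one way to change the window size only temporarily, using base R's options() mechanism. The option names come from the text above. The helper `with_window()` is hypothetical and not part of polmineR.

```r
# Hypothetical helper: evaluate `code` with a temporary window size,
# then restore whatever was set before (including "not set at all").
with_window <- function(left, right, code) {
  old <- options(polmineR.left = left, polmineR.right = right)
  on.exit(options(old), add = TRUE)
  force(code)  # `code` is evaluated lazily, i.e. after the options are set
}

before <- getOption("polmineR.left")
inside <- with_window(10L, 10L, getOption("polmineR.left"))
after <- getOption("polmineR.left")

inside                    # value seen while the temporary setting was active
identical(before, after)  # the previous setting is back in place
```

With polmineR loaded, a call such as `with_window(15L, 15L, kwic("UNGA", query = "border"))` would be the intended use, assuming kwic() reads these options at call time as described above.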
Metadata for each concordance can be displayed by using the s_attributes argument.
kwic(unga_2005, query = "immigration", s_attributes = "state_organization", verbose = FALSE)
options("polmineR.pagelength" = 3L)
To display more than one s-attribute, pass them as a character vector created with c(), which here means 'combine'.
kwic(unga_2005, query = "immigration", s_attributes = c("state_organization", "date"), verbose = FALSE)
The s-attributes that are available can be listed with the s_attributes() method:
s_attributes(unga_2005)
The argument positivelist can be used to limit the results of a kwic() analysis to those concordances in which a particular term (or one of a list of terms) occurs in addition to the query term. Using the highlight() method, these terms can be highlighted.
K <- kwic(
  "UNGA", query = "Islam",
  s_attributes = c("state_organization", "date"),
  positivelist = "terror"
)
K <- highlight(K, yellow = "terror")
K
To read the full text of a concordance, first store the kwic result in a variable (here: K). Then apply the read() method to this object. The argument i indicates the index of the concordance for which the full text is wanted.
K <- kwic(unga_2005, query = "immigration")
read(K, i = 1)
i <- 1L
metadata <- c("speaker", "date", "state_organization")
K <- kwic(unga_2005, query = "immigration", s_attributes = metadata)
# Partition covering the text that contains concordance i
P <- partition(
  get_corpus(K),
  def = lapply(setNames(metadata, metadata), function(x) K@stat[[x]][i]),
  type = "plpr"
)
data <- annolite::as.fulltextdata(P, headline = "Cornelie Sonntag-Wolgast (2005-01-21)")
# Highlight left context, query match and right context of concordance i
data$annotations <- data.frame(
  text = c("", "", ""),
  code = c("yellow", "lightgreen", "yellow"),
  annotation = c("", "", ""),
  id_left = c(
    min(K@cpos[hit_no == i][direction == -1][["cpos"]]),
    min(K@cpos[hit_no == i][direction == 0][["cpos"]]),
    min(K@cpos[hit_no == i][direction == 1][["cpos"]])
  ),
  # max() for the match as well, so that multi-token matches are covered
  id_right = c(
    max(K@cpos[hit_no == i][direction == -1][["cpos"]]),
    max(K@cpos[hit_no == i][direction == 0][["cpos"]]),
    max(K@cpos[hit_no == i][direction == 1][["cpos"]])
  )
)
W <- annolite::fulltext(data, dialog = NULL, box = TRUE, width = 1000, height = 400)
Y <- widgetframe::frameWidget(W)
Y
The analysis of concordances can be used in tandem with statistical analyses of cooccurrences: cooccurrence analysis (cooccurrences() function) provides indications of statistically remarkable word usage, which can then be analysed and interpreted qualitatively by means of Keyword-in-Context analysis.
A useful way to systematize the interpretation of concordances is to categorize them or arrange them by type. To this end, it can be helpful to export the concordances as Excel files (mail() method), which can then be used for further categorisation.
Note that working with concordances is mainly interpretative research that requires hermeneutical intuition. Who says what at which point in time may be relevant, but the search for patterns in language use is what matters most for the method.
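The categorisation workflow mentioned above can be sketched with base R alone. The example below uses an invented table of concordance lines, adds an empty column for manual coding, writes a CSV file that any spreadsheet program can open, and reads it back for a simple tally. It does not use polmineR's own export. The file name, the column name `category` and the category labels are assumptions made for illustration.

```r
# Invented concordance lines standing in for an exported kwic result
concordances <- data.frame(
  left  = c("measures to control", "the economic benefits of", "concerns about illegal"),
  node  = c("immigration", "immigration", "immigration"),
  right = c("at the border", "for host countries", "and trafficking"),
  stringsAsFactors = FALSE
)

# Empty column to be filled in by hand during interpretation
concordances$category <- NA_character_

# Write to a temporary CSV file for coding in a spreadsheet program
outfile <- file.path(tempdir(), "concordances.csv")
write.csv(concordances, outfile, row.names = FALSE)

# After manual coding, read the file back in. Here the coding step is
# simulated by assigning illustrative categories directly.
coded <- read.csv(outfile, stringsAsFactors = FALSE)
coded$category <- c("security", "economy", "security")

# Tally of concordances per category
table(coded$category)
```

Keeping the categories in a separate column leaves the original concordance lines untouched, so the coding can be revised without losing the evidence it is based on.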