For a text or a collection of texts (in a quanteda corpus object), return every occurrence of a user-supplied keyword in its immediate context, identifying the source text and the word index number within the source text. (The index is a word position, not a line number, since the text may or may not be segmented using end-of-line delimiters.)
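The idea of a keyword-in-context lookup can be sketched in base R. This is only an illustration, not quanteda's implementation: simple_kwic() is a hypothetical helper, it splits on whitespace only, and it matches whole tokens case-insensitively.

```r
# Minimal base-R sketch of a keyword-in-context lookup.
# simple_kwic() is a hypothetical helper, not part of quanteda.
simple_kwic <- function(text, keyword, window = 2) {
  toks <- strsplit(text, "\\s+")[[1]]               # naive whitespace tokenisation
  hits <- which(tolower(toks) == tolower(keyword))  # word index positions of matches
  rows <- lapply(hits, function(i) {
    pre_idx  <- tail(seq_len(i - 1), window)        # up to `window` tokens before
    post_idx <- if (i < length(toks)) {
      head((i + 1):length(toks), window)            # up to `window` tokens after
    } else {
      integer(0)
    }
    data.frame(
      from    = i,
      pre     = paste(toks[pre_idx], collapse = " "),
      keyword = toks[i],
      post    = paste(toks[post_idx], collapse = " "),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)                              # NULL when there is no match
}

res <- simple_kwic("We must secure peace and secure liberty", "secure", window = 2)
print(res)
```

Each row reports the word index of the match rather than a line number, which mirrors the description above.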
x: a character, corpus, or tokens object
pattern: a character vector, list of character vectors, dictionary, or collocations object. See pattern for details.
window: the number of context words to be displayed around the keyword.
valuetype: the type of pattern matching: "glob", "regex", or "fixed".
separator: character to separate words in the output
case_insensitive: logical; if TRUE, ignore case when matching the pattern.
...: additional arguments passed to tokens, for applicable object types
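The arguments above might be combined as in the following sketch. It assumes the quanteda package is installed; the one-document text and the object names (toks, kw_glob, kw_fixed, kw_sep) are made up for illustration.

```r
library(quanteda)

# A small made-up document, tokenised first so that kwic() receives a tokens object
toks <- tokens(c(d1 = "Security matters. We secure the border and SECURE the peace."))

# Glob pattern, two context words on each side, default case-insensitive matching
kw_glob <- kwic(toks, pattern = "secur*", window = 2, valuetype = "glob")

# Fixed pattern with case-sensitive matching
kw_fixed <- kwic(toks, pattern = "secure", window = 2,
                 valuetype = "fixed", case_insensitive = FALSE)

# Custom separator placed between the context words in the output
kw_sep <- kwic(toks, pattern = "secure", window = 2, separator = "_")

print(kw_glob)
print(kw_fixed)
print(kw_sep)
```

Comparing the number of rows in kw_glob and kw_fixed is a quick way to see the effect of valuetype and case_insensitive on the same input.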
A kwic classed data.frame, with the document name (docname), the token index positions (from and to, which will be the same for single-word patterns, or a sequence equal in length to the number of elements for multi-word phrases), the context before (pre), the keyword in its original format (keyword, preserving case and attached punctuation), and the context after (post). The return object has its own print method, plus some special attributes that are hidden in the print view. If you want to turn this into a simple data.frame, simply wrap the result in data.frame.
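A short sketch of the conversion described above, assuming quanteda is installed and using its bundled data_corpus_inaugural. The column names are the ones listed in the text; other versions of the package may add further columns.

```r
library(quanteda)

kw <- kwic(tokens(data_corpus_inaugural), pattern = "provident*", window = 3)

# Wrap the result in data.frame() to obtain a plain data.frame
df <- data.frame(kw)

class(df)
names(df)  # expected to include docname, from, to, pre, keyword, post

# Ordinary data.frame operations then apply, e.g. selecting columns
df[, c("docname", "from", "to", "keyword")]
```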
pattern will be a keyword pattern or phrase, possibly multiple patterns, that may include punctuation. If a pattern contains whitespace, it is best to wrap it in phrase() to make this explicit. However, if pattern is a collocations or dictionary object, then the collocations or multi-word dictionary keys will automatically be considered phrases where each whitespace-separated element matches a token in sequence.
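The difference between an explicit phrase and a dictionary pattern might look like the sketch below. It assumes quanteda is installed; the dictionary key "conflict" and its entries are invented for illustration.

```r
library(quanteda)

toks <- tokens(data_corpus_inaugural)

# Explicit phrase: "war" followed by "against" as two consecutive tokens
kw_phrase <- kwic(toks, pattern = phrase("war against"), window = 3)

# Dictionary pattern: multi-word entries are treated as phrases without phrase()
dict <- dictionary(list(conflict = c("war against", "civil war")))
kw_dict <- kwic(toks, pattern = dict, window = 3)

# For a two-token match, `to` should be one position after `from`
df_phrase <- data.frame(kw_phrase)
head(df_phrase[, c("docname", "from", "to", "keyword")])
```

Wrapping whitespace-containing patterns in phrase() keeps the intent visible at the call site, which is why the note above recommends it even where a dictionary would be handled automatically.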
library(quanteda)

# match the same stem with glob, regex, and fixed patterns
head(kwic(data_corpus_inaugural, pattern = "secure*", window = 3, valuetype = "glob"))
head(kwic(data_corpus_inaugural, pattern = "secur", window = 3, valuetype = "regex"))
head(kwic(data_corpus_inaugural, pattern = "security", window = 3, valuetype = "fixed"))

# multi-word phrases, wrapped in phrase()
toks <- tokens(data_corpus_inaugural)
kwic(data_corpus_inaugural, pattern = phrase("war against"))
kwic(data_corpus_inaugural, pattern = phrase("war against"), valuetype = "regex")

# check whether an object is a kwic
kw <- kwic(data_corpus_inaugural, "provident*")
is.kwic(kw)
is.kwic("Not a kwic")
is.kwic(kw[, c("pre", "post")])