kwic: Locate keywords-in-context


View source: R/kwic.R

Description

For a text or a collection of texts (in a quanteda corpus object), return each occurrence of a user-supplied keyword in its immediate context, identifying the source text and the word index number within the source text. (The word index is used rather than the line number, since the text may or may not be segmented using end-of-line delimiters.)

Usage

kwic(x, pattern, window = 5, valuetype = c("glob", "regex", "fixed"),
  separator = " ", case_insensitive = TRUE, ...)

is.kwic(x)

Arguments

x

a character, corpus, or tokens object

pattern

a character vector, list of character vectors, dictionary, collocations, or dfm. See pattern for details.

window

the number of context words to be displayed around the keyword.

valuetype

the type of pattern matching: "glob" for "glob"-style wildcard expressions; "regex" for regular expressions; or "fixed" for exact matching. See valuetype for details.

separator

character to separate words in the output

case_insensitive

match without respect to case if TRUE

...

additional arguments passed to tokens, for applicable object types
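
The effect of valuetype is easiest to see on a small text. The following is a minimal sketch, assuming quanteda is installed; the example sentence and object names are made up for illustration, and the text is tokenized explicitly with tokens() before being passed as x.

```r
library(quanteda)

# Hypothetical one-document example text (not part of the package data).
txt <- c(d1 = "We must secure the border. Security matters; insecurity does not.")
toks <- tokens(txt)

# "glob": * is a wildcard, so this matches whole tokens beginning with "secur".
kw_glob <- kwic(toks, "secur*", window = 2, valuetype = "glob")

# "regex": matches any token that contains "secur", including "insecurity".
kw_regex <- kwic(toks, "secur", window = 2, valuetype = "regex")

# "fixed": matches only the whole token "security"; case is ignored because
# case_insensitive = TRUE is the default.
kw_fixed <- kwic(toks, "security", window = 2, valuetype = "fixed")

# Number of matches found under each matching type.
n_matches <- c(glob = nrow(kw_glob), regex = nrow(kw_regex), fixed = nrow(kw_fixed))
n_matches
```

The regex pattern is the most permissive of the three because it is not anchored to the start or end of a token, whereas glob and fixed patterns must match the token as a whole.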

Value

A kwic classed data.frame, with the document name (docname), the token index positions (from and to, which will be the same for single-word patterns, or will span the matched tokens for multi-word phrases), the context before (pre), the keyword in its original format (keyword, preserving case and attached punctuation), and the context after (post). The return object has its own print method, plus some special attributes that are hidden in the print view. To turn it into a plain data.frame, wrap the result in data.frame.
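
As a rough illustration of the return value, the sketch below converts a kwic result to a plain data.frame and looks at the columns described above. It assumes quanteda is installed; the two short documents are invented for the example.

```r
library(quanteda)

# Two hypothetical documents (not part of the package data).
txt <- c(d1 = "Peace is the best security.",
         d2 = "Security of the union.")
kw <- kwic(tokens(txt), "security", window = 2)

# Wrapping in data.frame() drops the kwic print method and exposes the columns.
df <- data.frame(kw)
names(df)

# For a single-word pattern, from and to point at the same token position.
df[, c("docname", "from", "to", "keyword")]
```

Because the keyword column keeps the original form of the token, the match in the second document should retain its capital letter even though matching was case-insensitive.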

Note

pattern is a keyword pattern or phrase, or several of them, and may include punctuation. If a pattern contains whitespace, it is best to wrap it in phrase to make this explicit. However, if pattern is a collocations or dictionary object, the collocations or multi-word dictionary keys are automatically treated as phrases, where each whitespace-separated element matches a token in sequence.
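
A short sketch of the phrase behaviour described in this note, assuming quanteda is installed; the sentence is invented for the example. Each whitespace-separated element of the phrase is matched against consecutive tokens, so from and to differ for every match.

```r
library(quanteda)

# Hypothetical one-document example text (not part of the package data).
txt <- c(d1 = "They wage war against poverty and war against ignorance.")
toks <- tokens(txt)

# phrase() marks "war against" as a two-token sequence rather than one token.
kw <- kwic(toks, phrase("war against"), window = 2)

# from and to give the positions of the first and last matched token.
df <- data.frame(kw)
df[, c("docname", "from", "to", "keyword")]
```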

Examples

# The three pattern-matching types, showing the first six matches of each
head(kwic(data_corpus_inaugural, "secure*", window = 3, valuetype = "glob"))
head(kwic(data_corpus_inaugural, "secur", window = 3, valuetype = "regex"))
head(kwic(data_corpus_inaugural, "security", window = 3, valuetype = "fixed"))

# Multi-word patterns wrapped in phrase(); tokenize once and reuse the result
toks <- tokens(data_corpus_inaugural)
kwic(toks, phrase("war against"))
kwic(toks, phrase("war against"), valuetype = "regex")

# Check whether an object is a kwic object
mykwic <- kwic(data_corpus_inaugural, "provident*")
is.kwic(mykwic)
is.kwic("Not a kwic")

Example output

quanteda version 0.99
Using 2 of 1 threads for parallel computing

Attaching package: 'quanteda'

The following object is masked from 'package:utils':

    View

                                                                      
      [1797-Adams, 479]  welfare, and | secure  | the blessings of    
     [1797-Adams, 1513]  nations, and | secured | immortal glory with 
 [1805-Jefferson, 2368]   , and shall | secure  | to you the          
    [1817-Monroe, 1755] cherished. To | secure  | us against these    
    [1817-Monroe, 1815] defense as to | secure  | our cities and      
    [1817-Monroe, 3012]      I can to | secure  | economy and fidelity
                                                                              
 [1789-Washington, 1497] government for the | security | of their union       
       [1797-Adams, 479]       welfare, and |  secure  | the blessings of     
      [1797-Adams, 1513]       nations, and | secured  | immortal glory with  
  [1805-Jefferson, 2368]        , and shall |  secure  | to you the           
     [1813-Madison, 321]       seas and the | security | of an important      
     [1817-Monroe, 1610]      may form some | security | against these dangers
                                                        
 [1789-Washington, 1497] government for the | security | of their union
     [1813-Madison, 321]       seas and the | security | of an important
     [1817-Monroe, 1610]      may form some | security | against these dangers
     [1817-Monroe, 3430]           and as a | security | against foreign dangers
      [1825-Adams, 1371]      that the best | security | for the beneficence
      [1825-Adams, 1443]   that the firmest | security | of peace is
                                                                        
  [1857-Buchanan, 2913:2914] advantage of the fortune of | war against | a sister republic, we
  [1901-McKinley, 2272:2273]         . We are not waging | war against | the inhabitants of the Philippine
  [1901-McKinley, 2287:2288]  portion of them are making | war against | the United States. By
  [1901-McKinley, 2401:2402]    used when those who make | war against | us shall make it no
 [1933-Roosevelt, 1849:1850]   Executive power to wage a | war against | the emergency, as great
                                                               
 [1801-Jefferson, 1271:1272] domestic concerns and the surest | bulwarks against | antirepublican tendencies; the preservation
        [1845-Polk, 690:691] domestic concerns and the surest | bulwark against  | antirepublican tendencies," and
  [1857-Buchanan, 2913:2914]      advantage of the fortune of |   war against    | a sister republic, we
  [1901-McKinley, 2272:2273]              . We are not waging |   war against    | the inhabitants of the Philippine
  [1901-McKinley, 2287:2288]       portion of them are making |   war against    | the United States. By
  [1901-McKinley, 2401:2402]         used when those who make |   war against    | us shall make it no
 [1933-Roosevelt, 1849:1850]        Executive power to wage a |   war against    | the emergency, as great
      [1977-Carter, 927:928]            and we will fight our |   wars against   | poverty, ignorance, and
[1] TRUE
[1] FALSE

quanteda documentation built on Nov. 20, 2018, 1:04 a.m.