searchQuery: Find tokens using a Lucene-like search query


View source: R/query.r

Description

Find tokens using a Lucene-like search query

Usage

searchQuery(tokens, indicator, condition = "", code = "",
  doc.col = getOption("doc.col", "doc_id"),
  position.col = getOption("position.col", "position"),
  word.col = getOption("word.col", "word"), default.window = NA,
  condition_once = FALSE, indicator_filter = rep(T, nrow(tokens)),
  presorted = F)

Arguments

tokens

a data frame of tokens containing columns for the document id (default 'doc_id'), the text position (default 'position') and the text string (default 'word'). The column names can be specified in doc.col, position.col and word.col.

doc.col

a character string giving the name of the document id column in the tokens data.frame

position.col

a character string giving the name of the word position column in the tokens data.frame

word.col

a character string giving the name of the word column in the tokens data.frame

condition_once

logical. If TRUE, an indicator that satisfies its conditions once in an article causes all indicators within that article to be coded.
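
The effect of condition_once can be illustrated in base R. This is only a sketch of the behaviour described above, not the package's implementation: the vectors is_indicator and condition_met are hypothetical stand-ins for the intermediate results of a query, and the reading of the FALSE case is an assumption.

```r
# Hypothetical per-token results for two documents
doc_id        <- c(1, 1, 1, 2, 2)
is_indicator  <- c(TRUE, FALSE, TRUE, TRUE, FALSE)    # token matches the indicator
condition_met <- c(TRUE, FALSE, FALSE, FALSE, FALSE)  # condition holds for this token

# Assumed behaviour for condition_once = FALSE:
# only indicators whose own condition holds are coded
coded_each <- is_indicator & condition_met

# condition_once = TRUE: if any indicator in a document satisfies its
# condition, every indicator in that document is coded
docs_hit   <- unique(doc_id[is_indicator & condition_met])
coded_once <- is_indicator & doc_id %in% docs_hit
```

In this made-up data the third token is an indicator whose own condition fails, so it is coded only in the condition_once variant.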

presorted

The data has to be sorted on order(doc_id, position). If this is already the case, presorted can be set to TRUE to save time (which is useful when testing many individual queries for large tokenlists).
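
The sorting requirement can be sketched in base R as follows. The example assumes the default column names doc_id and position; the token data and the helper is_presorted are made up for illustration and are not part of the package.

```r
# Hypothetical token data with the default column names
tokens <- data.frame(
  doc_id   = c(2, 1, 1, 2),
  position = c(1, 2, 1, 2),
  word     = c("world", "quick", "the", "news"),
  stringsAsFactors = FALSE
)

# Sort on document id, then on position within each document
sorted_tokens <- tokens[order(tokens$doc_id, tokens$position), ]
rownames(sorted_tokens) <- NULL

# Hypothetical helper: is the data frame already in the required order?
is_presorted <- function(d) {
  identical(order(d$doc_id, d$position), seq_len(nrow(d)))
}

is_presorted(tokens)         # FALSE
is_presorted(sorted_tokens)  # TRUE
```

Once the tokens are sorted this way, presorted = TRUE can be passed so that the sort is not repeated for every query.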

queries

a data frame containing the queries.

batchsize

This function is faster if multiple queries are searched together, but too many queries (with too many tokens) at once can eat up memory or crash R. Try lowering batchsize in case of issues.
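
To illustrate what batching amounts to, the sketch below splits a queries data frame into groups of at most batchsize rows using base R. It does not reproduce the package's internals; the column names code and indicator are assumptions chosen to mirror the searchQuery arguments.

```r
# Hypothetical queries data frame; the column names are assumptions
queries <- data.frame(
  code      = paste0("q", 1:5),
  indicator = c("tax", "war", "vote", "bank", "court"),
  stringsAsFactors = FALSE
)
batchsize <- 2

# Assign each query row to a batch of at most `batchsize` rows
batch_id <- ceiling(seq_len(nrow(queries)) / batchsize)
batches  <- split(queries, batch_id)

length(batches)        # 3
sapply(batches, nrow)  # 2 2 1
```

A smaller batchsize means more batches and less data held in memory at once, at the cost of some speed.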

Value

the annotated tokens data frame


kasperwelbers/tokenlist documentation built on May 20, 2019, 7:39 a.m.