codeTokens: Annotate a data frame of tokens with codes using Lucene-like...


Description

Annotate a data frame of tokens with codes using Lucene-like search queries

Usage

codeTokens(tokens, queries, text_var = "word", default.window = NA,
  indicator_filter = rep(T, nrow(tokens)), condition_once = FALSE,
  presorted = F)

Arguments

tokens

a data frame of tokens with columns for the document id (doc_id), the position of the token in the text (position) and the token string. The name of the string column is given by text_var and defaults to 'word'.

queries

a data frame containing the queries.

text_var

a character string giving the name of the column in tokens that holds the token strings. Defaults to 'word'.

condition_once

logical. If TRUE, then if an indicator satisfies its conditions once in an article, all indicators within that article are coded.

presorted

logical. The data must be sorted by order(doc_id, position). If this is already the case, presorted can be set to TRUE to save time (which is useful when testing many individual queries on large token lists).
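
As a minimal sketch of the input described above, the following base R code builds a small token data frame with the doc_id, position and word columns and sorts it by order(doc_id, position). The token values are made up for illustration, and is_sorted is a hypothetical helper variable, not part of the package. Once sorted this way, the data frame could be passed to codeTokens with presorted = TRUE, together with a queries data frame (whose layout is not described on this page and is therefore not shown).

```r
# Hypothetical toy token list, deliberately out of order
tokens <- data.frame(
  doc_id   = c(2, 1, 1, 2, 1),
  position = c(2, 3, 1, 1, 2),
  word     = c("dog", "fox", "the", "lazy", "quick"),
  stringsAsFactors = FALSE
)

# Sort by document, then by position within each document
tokens <- tokens[order(tokens$doc_id, tokens$position), ]
rownames(tokens) <- NULL

# Check that the rows are now in the required order
is_sorted <- identical(order(tokens$doc_id, tokens$position),
                       seq_len(nrow(tokens)))
```

Sorting once up front and reusing the result is the situation in which setting presorted = TRUE makes sense, for example when the same token list is coded repeatedly with different queries.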

Value

the annotated tokens data frame


kasperwelbers/tokenlist documentation built on May 20, 2019, 7:39 a.m.