amcat.gettokens: Get Tokens from AmCAT


View source: R/corpus.r

Description

Get tokens (part-of-speech tag, lemma, etc.) from AmCAT

Usage

amcat.gettokens(
  conn,
  project = NULL,
  articleset = NULL,
  module = "elastic",
  filters = NULL,
  page_size = 1,
  sentence = NULL,
  only_cached = F,
  ...
)

Arguments

conn

the connection object from amcat.connect

project

id of the project containing the tokens

articleset

id of the articleset to get tokens from. If not specified, pass sentence instead for ad hoc parsing

module

the NLP preprocessing module to get the tokens from

filters

additional filters, e.g. c(pos1="V", pos1="A") to select only verbs and adjectives

page_size

the number of features (articles?) to include per call

sentence

a sentence (string) to be parsed if articleset id is not given

only_cached

if TRUE, only get tokens that have already been preprocessed (recommended for large corpora)

Value

A data frame of tokens
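
Examples

The arguments above can be combined as in the following minimal sketch, which is not taken from the package itself. The helper name get_cached_verbs_adjs and its fetch parameter are hypothetical, the package name amcatr is an assumption, and the project and articleset ids in the usage comment are placeholders. The filter vector is the one given under Arguments, and page_size = 100 is only an illustrative value.

```r
# Hypothetical helper (not part of amcat-r): request only already-preprocessed
# verb and adjective tokens from one articleset.
# `fetch` defaults to amcat.gettokens; the `amcatr` package name is an assumption.
# The default is evaluated lazily, so it is only looked up if no `fetch` is given.
get_cached_verbs_adjs <- function(conn, project, articleset,
                                  fetch = amcatr::amcat.gettokens) {
  fetch(
    conn,
    project = project,
    articleset = articleset,
    module = "elastic",                   # default module from the Usage section
    filters = c(pos1 = "V", pos1 = "A"),  # verbs and adjectives, as under Arguments
    page_size = 100,                      # illustrative value; the default is 1
    only_cached = TRUE                    # only tokens that were already preprocessed
  )
}

# Usage, assuming `conn` was returned by amcat.connect and that
# project 1 and articleset 2 exist on your server (placeholder ids):
# tokens <- get_cached_verbs_adjs(conn, project = 1, articleset = 2)
# head(tokens)
```

Passing the fetch function as a parameter is a design choice for this sketch only: it lets the argument handling be checked with a stub, without a live AmCAT server. For ad hoc parsing, the Arguments section indicates that sentence is supplied in place of articleset.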


amcat/amcat-r documentation built on Dec. 26, 2021, 3:12 a.m.