treetagger: Use TreeTagger for linguistic annotation.

Description

The argument param is a list that passes the arguments lang and tokenize. lang is a character vector that is expected to be "de", "fr", "it", or "en". tokenize is a logical value: if TRUE, the tokenizer included in the TreeTagger scripts is used; if FALSE, the input is expected to be tokenized already.

Usage

treetagger(filename, sourceDir = NULL, targetDir = NULL, verbose = FALSE,
  param = list(lang = "de", tokenize = TRUE))

Arguments

filename

file to process, or a character vector containing the text to be tagged (if sourceDir is NULL)

sourceDir

directory with files to be processed

targetDir

output directory; if NULL, the processed input is returned

verbose

logical, defaults to FALSE

param

a list that needs to include lang, the language to be used (defaults to "de"), and tokenize, a logical value indicating whether the input needs to be tokenized before tagging
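
A call with these arguments might look like the following sketch. It assumes that the ctk package is installed and exports treetagger(), and that TreeTagger itself is set up on the machine; the names tagger_param and input_text are only illustrative.

```r
# Tagger settings: English input that still needs to be tokenized.
tagger_param <- list(lang = "en", tokenize = TRUE)

# The input is passed directly as a character vector, so sourceDir stays NULL.
input_text <- "TreeTagger annotates text with part-of-speech tags and lemmas."

# Assumption: ctk is installed, exports treetagger(), and TreeTagger is available.
if (requireNamespace("ctk", quietly = TRUE)) {
  tagged <- ctk::treetagger(
    filename = input_text,
    sourceDir = NULL,
    targetDir = NULL,  # NULL: return the result instead of writing a file
    verbose = FALSE,
    param = tagger_param
  )
  print(head(tagged))
}
```

The call is guarded by requireNamespace() so that the snippet does nothing when the package is missing.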

Details

Depending on whether targetDir is defined (i.e. not NULL), the output is either written to a file or returned as a character vector. If sourceDir is NULL, filename will serve as the input character string. It will be written to a temporary file for further processing.
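
The input and output handling described here can be pictured roughly as follows. This is a minimal sketch of the documented behaviour and not the package's actual implementation; the helpers prepare_input() and deliver_output(), and the output file name "tagged.txt", are hypothetical.

```r
# Hypothetical helper (not part of ctk): decide where the tagger reads from.
prepare_input <- function(filename, sourceDir = NULL) {
  if (is.null(sourceDir)) {
    # No source directory: treat `filename` as the text itself
    # and write it to a temporary file for further processing.
    tmp <- tempfile(fileext = ".txt")
    writeLines(filename, con = tmp)
    tmp
  } else {
    # Otherwise `filename` names a file inside sourceDir.
    file.path(sourceDir, filename)
  }
}

# Hypothetical helper (not part of ctk): return the result or write it to disk.
deliver_output <- function(tagged, targetDir = NULL, name = "tagged.txt") {
  if (is.null(targetDir)) {
    return(tagged)  # no target directory: hand back the character vector
  }
  out <- file.path(targetDir, name)  # the file name is an assumption
  writeLines(tagged, con = out)
  invisible(out)
}
```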


PolMine/ctk documentation built on May 8, 2019, 3:20 a.m.