tokenize: Obtain token table from text

View source: R/parse.R

tokenize    R Documentation

Obtain token table from text

Description

utils::getParseData() is used to obtain a flat parse table from text.

Usage

tokenize(text)

Arguments

text

The text to parse.

Details

Apart from the columns provided by utils::getParseData(), the following columns are added:

  • A column "short" with the first five characters of "text".

  • A column "pos_id" (positional id), which can be used for sorting because "id" cannot be used for this in general. Note that the nth value of this column equals n as long as no tokens are inserted.

  • A column "child" that contains the nests (nested parse tables).
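
The first two added columns can be approximated with base R alone. The following is a minimal sketch, not the package's actual implementation: it assumes that "short" is simply the first five characters of "text" and that "pos_id" is a running index over the rows in the order returned by utils::getParseData(). The variable names code, parsed and pd are illustrative only, and the "child" column is not reproduced here.

```r
# Parse a small snippet; keep.source = TRUE is needed so that parse data
# is attached to the result.
code <- "x <- 1 + 2"
parsed <- parse(text = code, keep.source = TRUE)

# Flat parse table from base R. includeText = TRUE fills the "text" column
# for non-terminal nodes as well as terminal ones.
pd <- utils::getParseData(parsed, includeText = TRUE)

# Assumed re-creation of two of the added columns (illustration only):
pd$short <- substr(pd$text, 1, 5)   # first five characters of "text"
pd$pos_id <- seq_len(nrow(pd))      # nth row gets positional id n

print(pd[, c("pos_id", "id", "token", "short")])
```

Because "pos_id" is assigned after the table is built, it reflects row position rather than the parser's own "id" values, which is why it is the safer column to sort by.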

Value

A flat parse table.


styler documentation built on Aug. 29, 2023, 5:10 p.m.