getTokenStream-method: Get Token Stream Based on Corpus Positions.

Description

Turn regions of a corpus defined by corpus positions into the original text.

Usage

getTokenStream(.Object, ...)

## S4 method for signature 'numeric'
getTokenStream(.Object, corpus, pAttribute,
  encoding = NULL, collapse = NULL, beautify = TRUE, cpos = FALSE,
  cutoff = NULL)

## S4 method for signature 'matrix'
getTokenStream(.Object, ...)

## S4 method for signature 'character'
getTokenStream(.Object, left = NULL, right = NULL,
  ...)

## S4 method for signature 'partition'
getTokenStream(.Object, pAttribute, collapse = NULL,
  cpos = FALSE, ...)

## S4 method for signature 'Regions'
getTokenStream(.Object, pAttribute = "word", ...)

Arguments

.Object

an object of class numeric, matrix, character, partition or Regions (see the method signatures under Usage)

...

further arguments

corpus

the CWB corpus

pAttribute

the positional attribute (p-attribute) to decode

encoding

encoding to use

collapse

a character string of length 1

beautify

logical, whether to adjust whitespace before and after punctuation

cpos

logical, whether to return corpus positions (cpos) as names of the tokens

cutoff

maximum number of tokens to be reconstructed

left

left corpus position

right

right corpus position
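
Examples

The following is a minimal base R sketch of what the arguments above describe. It does not call getTokenStream and needs no CWB corpus. The token vector, the corpus positions, and the helper names expand_regions and sketch_token_stream are hypothetical and exist only for illustration. How the package itself handles collapse, beautify, cpos and cutoff is an assumption here and may differ in detail.

```r
# Hypothetical stand-ins: in real use the tokens would be decoded from a
# CWB corpus; they are hard-coded here so the sketch runs without one.
tokens <- c("Hello", ",", "world", "!", "This", "is", "a", "test", ".")
cpos <- 100:108  # assumed corpus positions of the tokens above

# Regions given as a two-column matrix of left and right corpus positions.
regions <- matrix(c(100L, 103L, 104L, 108L), ncol = 2, byrow = TRUE)

# Expand every region (row) into its full sequence of corpus positions.
expand_regions <- function(m) {
  unlist(lapply(seq_len(nrow(m)), function(i) m[i, 1]:m[i, 2]))
}

# Illustrative re-implementation of the documented arguments
# (an assumption, not the package's own code).
sketch_token_stream <- function(tokens, cpos, collapse = NULL,
                                beautify = TRUE, cpos_names = FALSE,
                                cutoff = NULL) {
  # cutoff: keep at most this many tokens
  if (!is.null(cutoff) && length(tokens) > cutoff) {
    tokens <- tokens[seq_len(cutoff)]
    cpos <- cpos[seq_len(cutoff)]
  }
  # cpos: use the corpus positions as names of the tokens
  if (cpos_names) names(tokens) <- cpos
  # without collapse, return the token vector as it is
  if (is.null(collapse)) return(tokens)
  # collapse: join the tokens into a single string
  txt <- paste(tokens, collapse = collapse)
  # beautify: drop whitespace in front of closing punctuation
  if (beautify) txt <- gsub("\\s+([.,;:!?)])", "\\1", txt, perl = TRUE)
  txt
}

all_cpos <- expand_regions(regions)
as_text <- sketch_token_stream(tokens, cpos, collapse = " ")
as_named <- sketch_token_stream(tokens, cpos, cpos_names = TRUE)
truncated <- sketch_token_stream(tokens, cpos, collapse = " ", cutoff = 3)
```

In this sketch, as_text holds the punctuation-adjusted string, as_named holds the token vector with corpus positions as names, and truncated is limited to the first three tokens.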


nrauscher/corpus documentation built on May 23, 2019, 9:34 p.m.