subset.tCorpus: S3 subset for tCorpus class
In corpustools: Managing, Querying and Analyzing Tokenized Text

Description

S3 subset method for the tCorpus class. Keeps only the tokens and/or documents that match the given expressions.

Usage

## S3 method for class 'tCorpus'
subset(x, subset = NULL, subset_meta = NULL, window = NULL, ...)

Arguments

`x`	a tCorpus object
`subset`	a logical expression indicating which rows to keep in the tokens data.
`subset_meta`	a logical expression indicating which rows to keep in the document meta data.
`window`	If not NULL, an integer specifying the window around each selected token that is also returned in the subset. For instance, if the subset contains token 10 in a document and window is 5, the subset will contain tokens 5 to 15. This does not apply to subset_meta.
`...`	not used

Examples

## create tcorpus of 5 Bush and 5 Obama docs
tc = create_tcorpus(sotu_texts[c(1:5,801:805),], doc_col='id')

## subset to keep only tokens where token_id <= 20 (i.e. the first 20 tokens)
tcs1 = subset(tc, token_id <= 20)
tcs1

## subset to keep only documents where president is Barack Obama
tcs2 = subset(tc, subset_meta = president == 'Barack Obama')
tcs2
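
The examples above do not exercise the window argument. The following is a minimal sketch of how it might be used, based only on the argument description: the name tcs3 is introduced here for illustration, and the sketch assumes that the selected documents are long enough to contain a token 10 and that the token data is accessible as tcs3$tokens.

```r
library(corpustools)

## create tcorpus of 5 Bush and 5 Obama docs (same call as in the Examples)
tc = create_tcorpus(sotu_texts[c(1:5,801:805),], doc_col='id')

## select token 10 in each document, plus a window of 5 tokens on either side
## (per the argument description, this should keep tokens 5 to 15)
tcs3 = subset(tc, token_id == 10, window = 5)

## inspect which token positions were kept (assumes the tokens data is in tcs3$tokens)
range(tcs3$tokens$token_id)
```

Because window only widens the token selection, it can be used to keep the context around matching tokens rather than just the tokens themselves.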

corpustools documentation built on Aug. 8, 2025, 6:08 p.m.