tokenize_sentences_ru: Tokenize character vector to a list of lists of sentences

View source: R/sentence_functions.R

tokenize_sentences_ru    R Documentation

Tokenize character vector to a list of lists of sentences

Description

IMPORTANT: This function must be run first. It creates the internal Python objects 'sentences' and 'sentences_lower', which are accessed by 'get_sentence_filtered_doc_indices()' (and thereby also by 'create_sentence_filtered_df()' and 'explore_sentence_filtered_df()').

Usage

tokenize_sentences_ru(df, to_lower = TRUE)

Arguments

df

R data frame or corporaexplorerobject

to_lower

Logical. Should the final list of lists be converted to lower case? Defaults to TRUE.

Value

The internal Python objects 'sentences' and 'sentences_lower'.
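
Examples

The following is a minimal sketch, not taken from the package itself. It assumes the input data frame stores its documents in a column named Text (an assumption; this page does not state the required column names) and that the function is exported from the cxsentences package. The base R part only illustrates what a "list of lists of sentences" looks like: one element per document, each holding that document's sentences. It uses a naive split on sentence-final punctuation, which is not the tokenizer the package uses. The variable names mirror the Python objects described above purely for readability.

```r
# Hypothetical input; the column name `Text` is an assumption.
df <- data.frame(
  Text = c("Первое предложение. Второе предложение!",
           "Только одно предложение."),
  stringsAsFactors = FALSE
)

# Illustration of the data shape only (NOT the package's tokenizer):
# split each document on whitespace that follows ., ! or ?
sentences <- lapply(
  strsplit(df$Text, "(?<=[.!?])\\s+", perl = TRUE),
  as.list
)

# Lower-cased counterpart, analogous to what to_lower = TRUE asks for.
# Note: tolower() on non-ASCII text depends on the session locale.
sentences_lower <- lapply(sentences, function(doc) lapply(doc, tolower))

# The call documented on this page, guarded so the script still runs
# when the package is not installed. It also needs a working Python setup.
if (requireNamespace("cxsentences", quietly = TRUE)) {
  cxsentences::tokenize_sentences_ru(df, to_lower = TRUE)
}
```

After the real call succeeds, 'get_sentence_filtered_doc_indices()' and the functions built on it can be used, since they read the objects this function creates.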


kgjerde/cxsentences documentation built on June 13, 2025, 2:38 p.m.