txt_count_words: Count the number of spaces occurring in text


View source: R/utils.R

Description

The C++ doc2vec functionality in this package assumes that words are separated by a space or tab character and that each document contains fewer than 1000 words.
This function estimates how many words there are in each element of a character vector by counting the occurrences of the space or tab character. The result is a count of separators, so consecutive separators are each counted.
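The counting described above can be sketched in base R. This is only an illustration, not the package's own code: `count_pattern` is a hypothetical helper, and its handling of `NA` input (returning `NA`) is an assumption of the sketch.

```r
# Hypothetical helper for illustration; not the package's own code.
count_pattern <- function(x, pattern = "[ \t]", ...) {
  # gregexpr() returns one vector of match positions per element of x
  matches <- gregexpr(pattern, x, ...)
  vapply(matches, function(m) {
    if (is.na(m[1])) return(NA_integer_)  # missing input stays missing
    if (m[1] == -1L) return(0L)           # -1 signals "no match"
    length(m)                             # one position per occurrence
  }, integer(1))
}

x <- c("Count me in.007", "this is a set  of words",
       "more\texamples tabs-and-spaces.only", NA)
count_pattern(x)
# 2 6 2 NA
```

The second element yields 6 rather than 5 because the double space between "set" and "of" counts as two separators.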

Usage

txt_count_words(x, pattern = "[ \t]", ...)

Arguments

x

a character vector with text

pattern

a pattern, interpreted by gregexpr, to count in each element of x. Defaults to a space or tab character.

...

other arguments, passed on to gregexpr

Value

an integer vector of the same length as x giving the number of times the pattern occurs in each element of x

Examples

x <- c("Count me in.007", "this is a set  of words",
       "more\texamples tabs-and-spaces.only", NA)
txt_count_words(x)
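
One possible use of such counts is to flag documents that approach the 1000-word limit mentioned in the description. The sketch below uses base R only, with a made-up corpus (`docs`), and assumes words are separated by single spaces or tabs, so that the word count is the separator count plus one.

```r
# Hypothetical corpus: one short document and one with 1200 words.
docs <- c("a short document",
          paste(rep("word", 1200), collapse = " "))

# Count separators; regmatches() extracts every match of the pattern.
n_sep <- lengths(regmatches(docs, gregexpr("[ \t]", docs)))

# Assumption: single separators between words, so words = separators + 1.
n_words  <- n_sep + 1L
too_long <- n_words >= 1000L

# Keep only the documents that stay under the limit.
docs_ok <- docs[!too_long]
```

Whether to drop, truncate, or split the flagged documents is a design choice that depends on the corpus.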

doc2vec documentation built on March 28, 2021, 1:09 a.m.