nsyllable: Count syllables in a text

Description

Returns the number of syllables in each element of a character vector. For English words, the syllable count is exact: it is looked up in the default syllable dictionary data_int_syllables, which is drawn from the CMU pronunciation dictionary. For any word not in the dictionary, the syllable count is estimated by counting vowel clusters.
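
The lookup-then-estimate behaviour described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `toy_dictionary`, `count_vowel_clusters()`, and `lookup_or_estimate()` are hypothetical names, the dictionary holds only two hand-written entries, and treating `y` as a vowel is an assumption.

```r
# Hypothetical toy dictionary; names are lower-case tokens, values are counts.
toy_dictionary <- c(cat = 1L, syllable = 3L)

# Estimate syllables by counting runs of consecutive vowels in each word.
# Assumption: "y" is treated as a vowel.
count_vowel_clusters <- function(words) {
  matches <- gregexpr("[aeiouy]+", tolower(words))
  # gregexpr() returns -1 when there is no match, so only positive
  # match positions are counted.
  vapply(matches, function(m) sum(m > 0L), integer(1))
}

# Look each word up in the dictionary; fall back to the estimate if absent.
lookup_or_estimate <- function(words, dictionary = toy_dictionary) {
  counts <- unname(dictionary[tolower(words)])
  missing <- is.na(counts)
  counts[missing] <- count_vowel_clusters(words[missing])
  as.integer(counts)
}

lookup_or_estimate(c("cat", "syllable", "Brexit"))
```

Here "cat" and "syllable" are resolved by lookup, while "Brexit" is not in the toy dictionary and falls through to the vowel-cluster estimate.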

Usage

nsyllable(x, language = "en", syllable_dictionary = NULL, use.names = FALSE)

Arguments

x

character vector whose syllables will be counted. All syllables in each element are counted without regard to token boundaries, so it is recommended that each element of x be an individual term.
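
One way to follow this recommendation is to split a text into terms before counting. The sketch below uses base R `strsplit()` on whitespace, which is a simplifying assumption (a real tokenizer would also handle punctuation). The variable names are illustrative, and the call to `nsyllable()` is guarded so the snippet still runs when the package is not installed.

```r
# A short text to be split into individual terms.
text <- "The quick brown fox"

# Split on runs of whitespace; strsplit() returns a list, so take element 1.
terms <- strsplit(text, "\\s+")[[1]]

# Count syllables per term, only if the nsyllable package is available.
if (requireNamespace("nsyllable", quietly = TRUE)) {
  print(nsyllable::nsyllable(terms, use.names = TRUE))
}
```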

language

specify the language for syllable counts by ISO 639-1 code. The default is English, using the data object data_syllables_en, an English pronunciation dictionary from CMU.

syllable_dictionary

optional named integer vector of syllable counts where the names are lower-case tokens. When set to NULL (the default), the dictionary for the language setting is used. If a syllable dictionary is supplied, it overrides the language argument.
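
A custom dictionary of this shape might be built and supplied as sketched below. The entries and their counts are hand-written for illustration only, and `custom_dictionary` is a hypothetical name. The call is guarded so the snippet still runs when the package is not installed.

```r
# Hypothetical custom dictionary: names are lower-case tokens,
# values are integer syllable counts (illustrative values).
custom_dictionary <- c(brexit = 2L, quanteda = 3L)

# Supply the dictionary in place of the language-based default.
if (requireNamespace("nsyllable", quietly = TRUE)) {
  print(nsyllable::nsyllable(c("Brexit", "quanteda"),
                             syllable_dictionary = custom_dictionary,
                             use.names = TRUE))
}
```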

use.names

logical; if TRUE, assign the tokens as the names of the syllable count vector

Value

an integer vector of the counts of the syllables in each element, named with the element if use.names = TRUE

Examples

# character
nsyllable(c("cat", "syllable", "supercalifragilisticexpialidocious",
            "Brexit", "Administration"), use.names = TRUE)

quanteda/nsyllable documentation built on Dec. 31, 2020, 2:11 a.m.