In trinker/syllable: A Small Collection of Syllable Counting Functions

desc <- suppressWarnings(readLines("DESCRIPTION"))
regex <- "(^Version:\\s+)(\\d+\\.\\d+\\.\\d+)"
loc <- grep(regex, desc)
ver <- gsub(regex, "\\2", desc[loc])
verbadge <- sprintf('<a href="https://img.shields.io/badge/Version-%s-orange.svg"><img src="https://img.shields.io/badge/Version-%s-orange.svg" alt="Version"/></a></p>', ver, ver)
pacman::p_load(syllable, knitr)

knit_hooks$set(htmlcap = function(before, options, envir) {
  if(!before) {
    paste('<p class="caption"><b><em>',options$htmlcap,"</em></b></p>",sep="")
    }
    })
knitr::opts_knit$set(self.contained = TRUE, cache = FALSE)
knitr::opts_chunk$set(fig.path = "tools/figure/")

r verbadge

syllable is a small collection of tools for counting syllables and polysyllables. The tools rely primarily on data.table hash table lookups, resulting in fast syllable counting.

Main Functions

The main functions follow the format of action_object.

Actions

The following table outlines the actions. Example Output correspond to this string: "I like chicken sandwiches.".

| Action | Description | Returns | Example Output | |--------------|----------------------------|-----------------------|-----------------------------| | count | One integer per word | A vector per string | 1, 1, 2, 3 | | sum | Sum of syllable counts | An integer per string | 7 | | tally* | Sum of syllable attributes | An integer per string | pollysyllable tallies = 1 |

* The addition of _mono, _di, _poly _short (monosyllabic + disyllabic), or _both (short & pollysyllabic) to tally allows the user specify what syllable attribute is being tallied.

Objects

The following table outlines the objects acted upon:

| Object | Description | Example | |--------------|---------------------------------|--------------------------------| | string | A character string | "I like chicken sandwiches." | | vector* | A vector of character strings | c("I like it.", "Look out!") |

* The addition of _by to vector allows the user to aggregate by one or more vectors of grouping variables.

Putting It Together

The function count_vector will provide a vector of integer counts for each word in a string. For this reason count_vector will return a list of integer vector counts.

count_vector(c("I like it.", "Look out!"))

Each of the main functions is optimized to do its task efficiently. While one could use sum(count_vector(x)) and achieve the same results as sum_vector(x) it would be less efficient.

The available syllable functions that follow the format of action_object are:

p_load(pander, xtable, dplyr)

avaible_syllable_funs() %>%
    xtable() %>%
    print(type = 'html', include.colnames = FALSE, include.rownames = FALSE,
        html.table.attributes = '')

#matrix(c(sprintf("`%s`", vect), blanks), ncol=4) %>%
#    pandoc.table(format = "markdown", caption = "Available variable functions.")

Installation

To download the development version of syllable:

Download the zip ball or tar ball, decompress and run R CMD INSTALL on it, or use the pacman package to install the development version:

if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh(
    'trinker/lexicon',
    'trinker/textclean',
    'trinker/textshape',
    'trinker/syllable'
)

Contact

You are welcome to: submit suggestions and bug-reports at: https://github.com/trinker/syllable/issues send a pull request on: https://github.com/trinker/syllable/ * compose a friendly e-mail to: tyler.rinker@gmail.com

Examples

The following examples demonstrate the functionality of a select sample of syllable functions.

Count Syllables In a String

Counts the number of syllables for each word in a string.

count_string("I like chicken and eggs for breakfast")

Count Syllables In a Vector of Strings

sents <- c("I like chicken.", "I want eggs benidict for breakfast.")
count_vector(sents)

Map(function(x, y) setNames(x, y),
   count_vector(sents),
   strsplit(gsub("[^a-z ]", "", tolower(sents)), "\\s+")
)

Sum the Syllables In a Vector of Strings by Grouping Variable(s)

dat <- data.frame(
   text = c("I like chicken.", "I want eggs benedict for breakfast.", "Really?"),
   group = c("A", "B", "A")
)
sum_vector_by(dat$text, dat$group)

Tally the Short/Poly-Syllabic Words by Group(s)

dat <- data.frame(
   text = c("I like excellent chicken.", "I want eggs benedict now.", "Really?"),
   group = c("A", "B", "A")
)
tally_both_vector_by(dat$text, dat$group)

with(presidential_debates_2012, tally_both_vector_by(dialogue, person))

Readability Word Statistics by Grouping Variable(s)

with(presidential_debates_2012, readability_word_stats_by(dialogue, list(person, time)))

Visualize Poly Syllable Distributions

if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, ggplot2, scales)

tally_both_vector(presidential_debates_2012$dialogue) %>%
    mutate(Duration = 1:length(poly)) %>%
    rowwise() %>%
    filter((short + poly) > 4) %>%
    mutate(
        short = short/(short+poly),
        poly = 1 - short,
        size = poly > .3
    ) %>%
    ggplot(aes(Duration, poly)) +
        geom_text(aes(label = Duration, size = size, color = size)) +
        coord_flip() +
        scale_size_manual(values = c(1.5, 2.5), guide=FALSE) +
        scale_color_manual(values = c("grey75", "black"), guide=FALSE) +
        scale_x_reverse() +
        scale_y_continuous(label = scales::percent) +
        ylab("Poly-syllabic") +
        xlab("Duration (sentences)") +
        theme_bw()

Visualize Poly Syllable Distributions by Group

if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, ggplot2, tidyr, scales)

with(presidential_debates_2012, tally_both_vector_by(dialogue, list(person, time))) %>%
    mutate(
        person_time = paste(person, time, sep = "-"),
        short = short/(short+poly),
        poly = 1 - short
    ) %>%
    arrange(poly) %>%
    mutate(person_time = factor(person_time, levels = person_time)) %>%
    gather(type, prop, c(short, poly)) %>%
    ggplot(aes(person_time, weight = prop, fill = type)) +
        geom_bar() +
        coord_flip() +        
        scale_y_continuous(label = scales::percent) +
        scale_fill_discrete(name="Syllable\nType") +
        xlab("Person & Time") +
        ylab("Usage") +
        theme_bw()

trinker/syllable documentation built on May 31, 2019, 10:42 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

trinker/syllable
A Small Collection of Syllable Counting Functions

In trinker/syllable: A Small Collection of Syllable Counting Functions

Main Functions

Actions

Objects

Putting It Together

Installation

Contact

Examples

Count Syllables In a String

Count Syllables In a Vector of Strings

Sum the Syllables In a Vector of Strings by Grouping Variable(s)

Tally the Short/Poly-Syllabic Words by Group(s)

Readability Word Statistics by Grouping Variable(s)

Visualize Poly Syllable Distributions

Visualize Poly Syllable Distributions by Group

R Package Documentation

Browse R Packages

We want your feedback!

trinker/syllable A Small Collection of Syllable Counting Functions

In trinker/syllable: A Small Collection of Syllable Counting Functions

Main Functions

Actions

Objects

Putting It Together

Installation

Contact

Examples

Count Syllables In a String

Count Syllables In a Vector of Strings

Sum the Syllables In a Vector of Strings by Grouping Variable(s)

Tally the Short/Poly-Syllabic Words by Group(s)

Readability Word Statistics by Grouping Variable(s)

Visualize Poly Syllable Distributions

Visualize Poly Syllable Distributions by Group

R Package Documentation

Browse R Packages

We want your feedback!

trinker/syllable
A Small Collection of Syllable Counting Functions