desc <- suppressWarnings(readLines("DESCRIPTION"))
regex <- "(^Version:\\s+)(\\d+\\.\\d+\\.\\d+)"
loc <- grep(regex, desc)
ver <- gsub(regex, "\\2", desc[loc])
verbadge <- sprintf('<a href="https://img.shields.io/badge/Version-%s-orange.svg"><img src="https://img.shields.io/badge/Version-%s-orange.svg" alt="Version"/></a></p>', ver, ver)
pacman::p_load(syllable, knitr)
knit_hooks$set(htmlcap = function(before, options, envir) {
    if(!before) {
        paste('<p class="caption"><b><em>',options$htmlcap,"</em></b></p>",sep="")
    }
})
knitr::opts_knit$set(self.contained = TRUE, cache = FALSE)
knitr::opts_chunk$set(fig.path = "tools/figure/")
syllable is a small collection of tools for counting syllables and polysyllables. The tools rely primarily on data.table hash table lookups, resulting in fast syllable counting.
The main functions follow the format of `action_object`.
The following table outlines the actions. The Example Output column corresponds to this string: `"I like chicken sandwiches."`
| Action   | Description                | Returns               | Example Output           |
|----------|----------------------------|-----------------------|--------------------------|
| `count`  | One integer per word       | A vector per string   | 1, 1, 2, 3               |
| `sum`    | Sum of syllable counts     | An integer per string | 7                        |
| `tally`* | Sum of syllable attributes | An integer per string | polysyllable tallies = 1 |
* The addition of `_mono`, `_di`, `_poly`, `_short` (monosyllabic + disyllabic), or `_both` (short & polysyllabic) to `tally` allows the user to specify which syllable attribute is being tallied.
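
The three actions can be illustrated with a toy base R sketch. The lookup table `toy_syllables` and the helper `toy_words` are hypothetical and hand-written for this one sentence. They are not the package's data or implementation, and treating "poly" as three or more syllables is an assumption drawn from the short (mono + di) grouping described above.

```r
# Hypothetical hand-made lookup table (NOT the package's dictionary).
toy_syllables <- c(i = 1L, like = 1L, chicken = 2L, sandwiches = 3L)

# Split a string into lowercase words, dropping punctuation.
toy_words <- function(x) {
    strsplit(gsub("[^a-z ]", "", tolower(x)), "\\s+")[[1]]
}

x <- "I like chicken sandwiches."

# "count": one integer per word
counts <- unname(toy_syllables[toy_words(x)])
counts
# [1] 1 1 2 3

# "sum": total syllables in the string
total <- sum(counts)
total
# [1] 7

# "tally": number of words with a given attribute
# (assumption: poly means three or more syllables)
tally <- c(
    mono = sum(counts == 1L),
    di   = sum(counts == 2L),
    poly = sum(counts >= 3L)
)
tally[["poly"]]
# [1] 1
```

The values match the Example Output column of the table: one count per word, a single sum, and a single polysyllable tally.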
The following table outlines the objects acted upon:
| Object    | Description                   | Example                        |
|-----------|-------------------------------|--------------------------------|
| `string`  | A character string            | `"I like chicken sandwiches."` |
| `vector`* | A vector of character strings | `c("I like it.", "Look out!")` |
* The addition of `_by` to `vector` allows the user to aggregate by one or more vectors of grouping variables.
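
Conceptually, aggregating by a grouping variable resembles the base R sketch below. The per-string sums in `toy_sums` are made-up illustrative values, and `tapply` merely stands in for whatever the `_by` functions do internally, which this document does not describe.

```r
# Made-up per-string syllable sums for three strings (illustrative values only).
toy_sums <- c(4L, 9L, 2L)

# One grouping value per string.
group <- c("A", "B", "A")

# Aggregate the per-string sums within each group.
by_group <- tapply(toy_sums, group, sum)
by_group[["A"]]
# [1] 6
by_group[["B"]]
# [1] 9
```

Passing a list of several grouping vectors to `tapply` would aggregate over their combinations in the same way.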
The function `count_vector` provides a vector of integer counts, one per word, for each string. For this reason `count_vector` returns a `list` of integer vectors, with one element per string.
count_vector(c("I like it.", "Look out!"))
Each of the main functions is optimized to do its task efficiently. While one could use `sum(count_vector(x))` and achieve the same results as `sum_vector(x)`, it would be less efficient.
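
Because `count_vector` is described above as returning a list, base R's `sum()` cannot be applied to such a result directly, so a per-element sum is presumably what is meant. The sketch below uses a hand-written list, `toy_counts`, as a stand-in for that kind of result; the values are illustrative and are not actual package output.

```r
# Hand-written stand-in for a list of per-word counts, one element per string.
toy_counts <- list(c(1L, 1L, 1L), c(1L, 1L))

# Sum within each element to get one total per string.
per_string <- vapply(toy_counts, sum, integer(1))
per_string
# [1] 3 2
```

A dedicated function can skip building the intermediate list, which is one plausible reason for the efficiency difference mentioned above.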
The available syllable functions that follow the format of `action_object` are:
p_load(pander, xtable, dplyr)
avaible_syllable_funs() %>%
    xtable() %>%
    print(type = 'html', include.colnames = FALSE, include.rownames = FALSE,
        html.table.attributes = '')
#matrix(c(sprintf("`%s`", vect), blanks), ncol=4) %>%
#    pandoc.table(format = "markdown", caption = "Available variable functions.")
To download the development version of syllable:
Download the zip ball or tar ball, decompress and run `R CMD INSTALL` on it, or use the pacman package to install the development version:
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh(
    'trinker/lexicon',
    'trinker/textclean',
    'trinker/textshape',
    'trinker/syllable'
)
You are welcome to:
* submit suggestions and bug-reports at: https://github.com/trinker/syllable/issues
* send a pull request on: https://github.com/trinker/syllable/
* compose a friendly e-mail to: tyler.rinker@gmail.com
The following examples demonstrate the functionality of a select sample of syllable functions.
Counts the number of syllables for each word in a string.
count_string("I like chicken and eggs for breakfast")
sents <- c("I like chicken.", "I want eggs benidict for breakfast.")
count_vector(sents)
Map(function(x, y) setNames(x, y),
    count_vector(sents),
    strsplit(gsub("[^a-z ]", "", tolower(sents)), "\\s+")
)
dat <- data.frame(
    text = c("I like chicken.", "I want eggs benedict for breakfast.", "Really?"),
    group = c("A", "B", "A")
)
sum_vector_by(dat$text, dat$group)
dat <- data.frame(
    text = c("I like excellent chicken.", "I want eggs benedict now.", "Really?"),
    group = c("A", "B", "A")
)
tally_both_vector_by(dat$text, dat$group)
with(presidential_debates_2012, tally_both_vector_by(dialogue, person))
with(presidential_debates_2012, readability_word_stats_by(dialogue, list(person, time)))
if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, ggplot2, scales)
tally_both_vector(presidential_debates_2012$dialogue) %>%
    mutate(Duration = 1:length(poly)) %>%
    rowwise() %>%
    filter((short + poly) > 4) %>%
    mutate(
        short = short/(short+poly),
        poly = 1 - short,
        size = poly > .3
    ) %>%
    ggplot(aes(Duration, poly)) +
        geom_text(aes(label = Duration, size = size, color = size)) +
        coord_flip() +
        scale_size_manual(values = c(1.5, 2.5), guide=FALSE) +
        scale_color_manual(values = c("grey75", "black"), guide=FALSE) +
        scale_x_reverse() +
        scale_y_continuous(label = scales::percent) +
        ylab("Poly-syllabic") +
        xlab("Duration (sentences)") +
        theme_bw()
if (!require("pacman")) install.packages("pacman")
pacman::p_load(dplyr, ggplot2, tidyr, scales)
with(presidential_debates_2012, tally_both_vector_by(dialogue, list(person, time))) %>%
    mutate(
        person_time = paste(person, time, sep = "-"),
        short = short/(short+poly),
        poly = 1 - short
    ) %>%
    arrange(poly) %>%
    mutate(person_time = factor(person_time, levels = person_time)) %>%
    gather(type, prop, c(short, poly)) %>%
    ggplot(aes(person_time, weight = prop, fill = type)) +
        geom_bar() +
        coord_flip() +
        scale_y_continuous(label = scales::percent) +
        scale_fill_discrete(name="Syllable\nType") +
        xlab("Person & Time") +
        ylab("Usage") +
        theme_bw()