**formality** uses the **tagger** package to conduct formality analysis. Heylighen (1999) and Heylighen & Dewaele (1999, 2002) proposed the F-measure as a measure of how contextual or formal language is. Language is considered more formal when it carries much of its information directly in the text, whereas contextual language relies on shared experience to communicate more efficiently.
The **formality** package's main function is also named `formality` and uses Heylighen & Dewaele's (1999) F-measure. The F-measure is defined formally as:
$$F = 50\left(\frac{n_f - n_c}{N} + 1\right)$$
Where:
$$f = \{\text{noun, adjective, preposition, article}\}$$
$$c = \{\text{pronoun, verb, adverb, interjection}\}$$
$$N = n_f + n_c$$
This yields an F-measure between $0$ and $100$%, with completely contextualized language on the zero end and completely formal language on the $100$ end.
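To make the arithmetic concrete, here is a minimal base R sketch of the formula as written above. The function name `f_measure` and its arguments are hypothetical and are not part of the **formality** API; the package itself derives the counts from part-of-speech tags, which this sketch does not attempt.

```r
# Minimal sketch of the F-measure formula above (not the package's implementation).
# n_f: count of formal-category words (nouns, adjectives, prepositions, articles)
# n_c: count of contextual-category words (pronouns, verbs, adverbs, interjections)
f_measure <- function(n_f, n_c) {
  N <- n_f + n_c  # N as defined in the formula above
  if (any(N == 0)) stop("at least one counted word is required")
  50 * ((n_f - n_c) / N + 1)
}

f_measure(75, 25)  # 75: mostly formal
f_measure(50, 50)  # 50: evenly balanced
f_measure(0, 40)   # 0: completely contextual
```

The function is vectorized, so vectors of counts (one element per speaker, for example) can be passed in directly.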
For more details about formality and the F-measure, see Heylighen (1999) and Heylighen & Dewaele (1999, 2002).
To download the development version of formality:
Download the zip ball or tar ball, decompress it, and run `R CMD INSTALL` on it, or use the **pacman** package to install the development version:
```r
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh(c(
    "trinker/termco",
    "trinker/tagger",
    "trinker/formality"
))
```
You are welcome to:

* submit suggestions and bug reports at: https://github.com/trinker/formality/issues
* send a pull request on: https://github.com/trinker/formality/
* compose a friendly e-mail to: tyler.rinker@gmail.com
The following examples demonstrate some of the functionality of formality.
```r
library(formality)
data(presidential_debates_2012)
```
`formality` takes the text as `text.var` and any number of grouping variables as `grouping.var`. Here we use the `presidential_debates_2012` data set and look at the formality of the people involved. Note that for smaller text Heylighen & Dewaele (2002) state:
> "At present, a sample would probably need to contain a few hundred words for the measure to be minimally reliable. For single sentences, the F-value should only be computed for purposes of illustration" (p. 24).
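One way to check sample sizes before computing formality is to count words per group. The following is a rough base R sketch, assuming whitespace-separated tokens are an adequate word count; the `toy` data frame and the `min_words` cutoff are illustrative assumptions and are not part of the package.

```r
# Toy data standing in for a real corpus (hypothetical; not shipped with the package)
toy <- data.frame(
  person   = c("A", "A", "B"),
  dialogue = c("We agree on the budget.", "The plan is sound.", "No."),
  stringsAsFactors = FALSE
)

# Count whitespace-separated tokens in each row
n_words <- lengths(strsplit(trimws(toy$dialogue), "\\s+"))

# Total words per speaker
words_by_person <- tapply(n_words, toy$person, sum)

# Flag speakers below an illustrative floor of a few hundred words
min_words <- 300
too_short <- names(words_by_person)[words_by_person < min_words]
too_short
```

Groups flagged this way can still be scored, but per the quotation above their F-values are best treated as illustrative only.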
```r
form1 <- with(presidential_debates_2012, formality(dialogue, person))
form1
```
This takes about 20 seconds because of the part-of-speech tagging involved. The output can be reused as `text.var`, which cuts the run time to a fraction of the first run.
```r
with(presidential_debates_2012, formality(form1, list(time, person)))
```
The generic `plot` function provides three views of the data:
**Note:** a red dot in the center of a bar warns that the sample contains fewer than 300 words.
```r
plot(form1)
```

The `plot` function uses **gridExtra** to stitch the plots together and draws the result immediately. However, the three subplots are also returned as a list, as seen below.

```r
names(plot(form1, plot=FALSE))
```
Each of these is a ggplot2 object that can be further manipulated with various scales, facets, and annotations. I demonstrate some of this functionality in the plots below.
```r
library(ggplot2)

plot(form1, plot=FALSE)[[1]] +
    scale_size(range = c(8, 45)) +
    scale_x_continuous(limits = c(52, 63))

plot(form1, plot=FALSE)[[2]] +
    scale_fill_grey()

plot(form1, plot=FALSE)[[2]] +
    scale_fill_brewer(palette = "Pastel1") +
    facet_grid(~type)

plot(form1, plot=FALSE)[[3]] +
    scale_fill_gradient(high = "red", low = "white") +
    ggtitle("Participant's Use of Parts of Speech")

plot(form1, plot=FALSE)[[3]] +
    scale_fill_gradient2(midpoint = .12, high = "red", low = "blue")
```