knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "80%"
)

The speech package

Nicolás Schmidt, Diego Luján, Juan Andrés Moraes


Description

Converts the floor speeches of Uruguayan legislators, extracted from the parliamentary minutes, into a tidy data.frame in which each observation is one intervention by a single legislator.

Installation

# Install speech from CRAN
install.packages("speech")

# The development version from GitHub:
if (!require("remotes")) install.packages("remotes")
remotes::install_github("Nicolas-Schmidt/speech")

Data generation process

1 - Floor speeches

2 - Data extraction

3 - First construction of the data set: speech::speech_build()

4 - Final data set: speech::speech_build(., compiler = TRUE)
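The difference between steps 3 and 4 can be pictured with a toy example. The sketch below uses base R and made-up data. It assumes that `compiler = TRUE` merges all interventions by the same legislator into a single row. The column names `legislator` and `speech`, the sample text, and the collapsing rule are illustrative assumptions, not the package's actual implementation.

```r
# Toy data: one row per intervention (hypothetical columns and values).
interventions <- data.frame(
  legislator = c("PEREZ", "GOMEZ", "PEREZ"),
  speech     = c("Pido la palabra.", "Voto por la afirmativa.", "Gracias."),
  stringsAsFactors = FALSE
)

# Collapse every intervention of a legislator into a single row,
# joining the individual speeches with a space.
compiled <- aggregate(
  speech ~ legislator,
  data = interventions,
  FUN  = function(x) paste(x, collapse = " ")
)

compiled
```

In this toy case the three interventions become two rows, one per legislator.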

Example

You can see more complex examples in the following link.

library(speech)
url <- "https://parlamento.gub.uy/documentosyleyes/documentos/diarios-de-sesion/6084/IMG"
text <- speech::speech_build(file = url)
text


speech_check(text, initial = c("A", "M"))


text <- speech::speech_build(file = url, compiler = TRUE)
text


text$word <- speech_word_count(text$speech)

dplyr::glimpse(text)
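For a rough idea of what a per-row word count involves, here is a minimal base R sketch. `count_words` is a hypothetical helper written for this illustration. It is not `speech_word_count()`, which may tokenize the text differently.

```r
# Hypothetical helper: count whitespace-separated words in each string.
count_words <- function(x) {
  # Drop leading and trailing whitespace so it does not create empty tokens
  x <- trimws(x)
  # strsplit() returns one character vector of tokens per input string
  tokens <- strsplit(x, "[[:space:]]+")
  # lengths() gives the number of tokens for each input string
  lengths(tokens)
}

count_words(c("Voto por la afirmativa.", "Gracias"))
```

The result has one count per input string, so it can be assigned directly to a new column of the data set.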

Possible application

library(magrittr)

# Strip punctuation, split on spaces, and keep only words longer than `min` characters
minchar <- function(string, min = 3){
    string <- stringr::str_remove_all(string, "[[:punct:]]")
    string <- unlist(strsplit(string, " "))
    string[nchar(string) > min]
}

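# Recent quanteda releases build a dfm from a tokens object and no longer
# accept `remove` in dfm(), so the honorifics are dropped with dfm_remove()
text$speech %>%
    minchar(., min = 4) %>%
    quanteda::tokens() %>%
    quanteda::dfm() %>%
    quanteda::dfm_remove(c("señor", "señora")) %>%
    quanteda.textplots::textplot_wordcloud(color = rev(RColorBrewer::brewer.pal(10, "RdBu")))

library(ggplot2)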

text$speech %>% 
    minchar(., min = 4) %>%  
    tibble::enframe() %>% 
    tidytext::unnest_tokens(word, value) %>%
    dplyr::count(word, sort = TRUE) %>%
    dplyr::mutate(word = stats::reorder(word, n)) %>%
    dplyr::filter(!stringr::str_detect(word, "^señor")) %>% 
    .[1:40,] %>% 
    ggplot(aes(word, n)) +
        geom_col(col = "black", fill = "#00A08A", width = .7) +
        labs(x = "", y = "") +
        coord_flip() +
        theme_minimal()
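To make the word filter used in both plots concrete, here is a small worked example. `minchar_base` is a hypothetical base R equivalent of `minchar()`, with `gsub()` standing in for `stringr::str_remove_all()` so the sketch runs without extra packages. The input sentence is made up.

```r
# Hypothetical base R equivalent of minchar(): strip punctuation,
# split on spaces, and keep words with strictly more than `min` characters.
minchar_base <- function(string, min = 3) {
  string <- gsub("[[:punct:]]", "", string)
  words  <- unlist(strsplit(string, " "))
  words[nchar(words) > min]
}

kept <- minchar_base("Pido la palabra para votar afirmativamente.", min = 4)
kept
```

Because the comparison is strict, `min = 4` keeps only words of five or more characters: here "palabra", "votar", and "afirmativamente", while "Pido" and "para" are dropped.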

Detecting roll-call votes in parliamentary speeches

urls <- speech_url(chamber = "D", days = c("2002-06-12", "2004-04-14"))
rollcall <- speech_rollcall(file = urls)

rollcall

summary(rollcall)

Citation

To cite the speech package in publications, please use:

citation(package = 'speech')

Maintainer

Nicolas Schmidt (nschmidt@cienciassociales.edu.uy)



Nicolas-Schmidt/speech documentation built on July 4, 2023, 4:32 p.m.