In dcardosos/speechbr: Access the Speechs and Speaker's Informations of House of Representatives of Brazil

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)

speechbr

Overview

The goal of {speechbr} is to democratize access to the speeches of the deputies, that is, to their ideas and thoughts.

The data is obtained from the Discursos e Notas Taquigráficas page of the Câmara dos Deputados.

Observation

The released version on CRAN is limited to speeches before 2022. To access speeches after 2021-12-31, use the development version.

Installation

You can install the released version of {speechbr} from CRAN with:

install.packages("speechbr")

You can install the development version of {speechbr} from GitHub with:

# install.packages("devtools")
devtools::install_github("dcardosos/speechbr")

Example

An example that builds a dataset by searching for the term "tecnologia" between 2021-09-01 and 2021-10-01:

library(speechbr)

tab <- speechbr::speech_data(
  keyword = "tecnologia",
  start_date = "2021-09-01", 
  end_date = "2021-10-01")

dplyr::glimpse(tab)

The other parameters are party (political party), speaker (speaker's name) and uf (state acronym). Their default values are empty strings ("").
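
As a minimal sketch of how these optional filters might be combined with the search above, the call below restricts the same query to a single state. The value "SP" is only an illustrative assumption, and party and speaker are written out with their documented empty defaults to make the full set of arguments visible. The call needs network access.

```r
library(speechbr)

# Same search as above, restricted to one state.
# "SP" is an illustrative value (assumption); party and speaker are
# left at their documented default, the empty string.
tab_sp <- speechbr::speech_data(
  keyword = "tecnologia",
  start_date = "2021-09-01",
  end_date = "2021-10-01",
  uf = "SP",
  party = "",
  speaker = ""
)

dplyr::glimpse(tab_sp)
```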

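A simple application of the dataset, a word cloud:

# install.packages("wordcloud2")
# install.packages("tidytext")
# install.packages("abjutils")

# The pipe (%>%) comes from magrittr; attach it if it is not already available
library(magrittr)

# Portuguese stop words (tidytext's default Snowball list)
stop_words <- tidytext::get_stopwords("pt")

# Extra words to drop, written without accents to match rm_accent() below
others_words <- c("nao", "ter", "termos", "r", "fls", "sr", "ja", "sao",
                  "porque", "aqui", "ha", "ser", "ano", "presidente", "tambem")

tab %>%
  tibble::rowid_to_column("id") %>%
  dplyr::select(id, discurso) %>%
  tidytext::unnest_tokens(word, discurso) %>%          # one row per word
  dplyr::filter(!grepl("[0-9]", word)) %>%             # drop tokens containing digits
  dplyr::mutate(word = abjutils::rm_accent(word)) %>%  # strip accents
  dplyr::anti_join(stop_words, by = "word") %>%
  dplyr::count(word, sort = TRUE) %>%                  # word frequencies
  dplyr::filter(n > 5, !word %in% others_words) %>%
  wordcloud2::wordcloud2()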
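
To see what the tokenizing and counting steps of the pipeline above produce before anything is drawn, here is a small offline sketch. The `toy` table and its two sentences are made up for illustration; only the column name `discurso` is taken from the example above.

```r
library(dplyr)
library(tidytext)

# Made-up speeches standing in for the `discurso` column of `tab`
toy <- tibble::tibble(
  discurso = c(
    "A tecnologia muda o Brasil",
    "Tecnologia e inovacao para o Brasil"
  )
)

word_counts <- toy %>%
  unnest_tokens(word, discurso) %>%   # lowercases text and splits it into words
  count(word, sort = TRUE)            # two columns: word and n

word_counts
```

The result has one row per distinct word and a frequency column, which is the two-column shape that the word cloud step receives.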