General Conference is a semi-annual event where members of The Church of Jesus Christ of Latter-day Saints gather to listen to church prophets, apostles, and other leaders.
This package scrapes General Conference talks and also ships the full set of scraped talks as a dataset, ready for analysis in R.
Install the package from GitHub:
# install.packages('devtools')
devtools::install_github("bryanwhiting/generalconference")
Load the package:
library(generalconference)
Load the General Conference corpus, which is a tibble with nested data for each conference, session, talk, and paragraph.
data("genconf")
head(genconf)
#> # A tibble: 6 × 4
#> year month date sessions
#> <dbl> <dbl> <date> <list>
#> 1 2021 4 2021-04-01 <tibble [5 × 4]>
#> 2 2020 10 2020-10-01 <tibble [5 × 4]>
#> 3 2020 4 2020-04-01 <tibble [5 × 4]>
#> 4 2019 10 2019-10-01 <tibble [5 × 4]>
#> 5 2019 4 2019-04-01 <tibble [5 × 4]>
#> 6 2018 10 2018-10-01 <tibble [5 × 4]>
Unnest it to analyze individual talks, which can be unnested further to the paragraph level.
library(dplyr)
library(tidyr)
genconf %>%
  tidyr::unnest(sessions) %>%
  tidyr::unnest(talks) %>%
  head()
#> # A tibble: 6 × 14
#> year month date session_name session_id session_url talk_urls
#> <dbl> <dbl> <date> <chr> <int> <chr> <chr>
#> 1 2021 4 2021-04-01 Saturday Morn… 1 /study/general-… /study/gene…
#> 2 2021 4 2021-04-01 Saturday Morn… 1 /study/general-… /study/gene…
#> 3 2021 4 2021-04-01 Saturday Morn… 1 /study/general-… /study/gene…
#> 4 2021 4 2021-04-01 Saturday Morn… 1 /study/general-… /study/gene…
#> 5 2021 4 2021-04-01 Saturday Morn… 1 /study/general-… /study/gene…
#> 6 2021 4 2021-04-01 Saturday Morn… 1 /study/general-… /study/gene…
#> # … with 7 more variables: talk_session_id <int>, url <chr>, title1 <chr>,
#> # author1 <chr>, author2 <chr>, kicker1 <chr>, paragraphs <list>
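To see how the nesting unpacks without loading the full corpus, here is a minimal sketch on a made-up toy tibble. The column names `sessions`, `talks`, `paragraphs`, `title1`, `author1`, and `paragraph` follow the output above; the values and the object names `toy`, `talk_level`, `paragraph_level`, and `talks_per_speaker` are illustrative assumptions, not part of the package.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Toy data (invented values) mirroring the nested layout:
# conference -> sessions -> talks -> paragraphs
toy <- tibble(
  year = 2021,
  month = 4,
  sessions = list(tibble(
    session_name = c("Saturday Morning", "Saturday Afternoon"),
    talks = list(
      tibble(
        title1 = c("Talk A", "Talk B"),
        author1 = c("Speaker One", "Speaker Two"),
        paragraphs = list(
          tibble(paragraph = c("First paragraph.", "Second paragraph.")),
          tibble(paragraph = "Only paragraph.")
        )
      ),
      tibble(
        title1 = "Talk C",
        author1 = "Speaker One",
        paragraphs = list(tibble(paragraph = c("One.", "Two.", "Three.")))
      )
    )
  ))
)

# One row per talk: unnest sessions, then talks
talk_level <- toy %>%
  unnest(sessions) %>%
  unnest(talks)

# One row per paragraph: unnest one level further
paragraph_level <- talk_level %>%
  unnest(paragraphs)

# A simple talk-level summary: number of talks per speaker
talks_per_speaker <- count(talk_level, author1, name = "n_talks")
```

Each `unnest()` call expands one list column, so the row count grows from conferences to sessions to talks to paragraphs. The same pattern should carry over to `genconf`, assuming its list columns are named as shown in the output above.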
Analyze individual paragraphs that contain the word “faith”:
library(gt)
genconf %>%
  # unpack/unnest the dataframe, which is a tibble of lists
  tidyr::unnest(sessions) %>%
  tidyr::unnest(talks) %>%
  tidyr::unnest(paragraphs) %>%
  # extract just the date, title, author and paragraph
  # date, title, and author will be repeated fields, with paragraph unique
  select(date, title1, author1, paragraph) %>%
  # filter to just the paragraphs that mention the word "faith"
  filter(stringr::str_detect(paragraph, "faith")) %>%
  # take top 5 records
  head(5) %>%
  # convert into a gt() table with row groups for date/title/author
  # (use row groups since these data are replicated by paragraph)
  group_by(date, title1, author1) %>%
  gt() %>%
  tab_options(
    row_group.background.color = 'lightgray'
  ) %>%
  tab_header(
    title = 'Paragraphs on Faith',
    subtitle = 'Grouped by talk'
  )
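Note that `str_detect(paragraph, "faith")` is a case-sensitive substring match: it skips "Faith" at the start of a sentence and also matches words such as "faithful". If you want the whole word regardless of case, one option is a word-boundary regex. The sketch below uses invented sample paragraphs (not taken from the corpus) to show the difference; `toy_paragraphs`, `substring_match`, and `whole_word_match` are hypothetical names.

```r
library(tibble)
library(dplyr)
library(stringr)

# Invented sample paragraphs, for illustration only
toy_paragraphs <- tibble(
  paragraph = c(
    "Faith is a principle of action.",
    "She remained faithful.",
    "We walk by faith.",
    "Hope endures."
  )
)

# Case-sensitive substring match, as in the pipeline above
substring_match <- str_detect(toy_paragraphs$paragraph, "faith")

# Whole-word, case-insensitive match using word boundaries
whole_word_match <- str_detect(
  toy_paragraphs$paragraph,
  regex("\\bfaith\\b", ignore_case = TRUE)
)

# Keep only the paragraphs that contain the whole word
faith_only <- filter(toy_paragraphs, whole_word_match)
```

Here `substring_match` is `FALSE, TRUE, TRUE, FALSE` while `whole_word_match` is `TRUE, FALSE, TRUE, FALSE`. Swapping the `regex()` pattern into the `filter()` step above should give the stricter behavior on the real corpus.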
See the documentation for the scrapers if you need them, though you shouldn’t, since all the data is already available in the package.