The spRingsteen package provides a number of data frames describing the songs,
albums, tours, and setlists of Bruce Springsteen's career. The data (collected from Brucebase) is provided
in a tidy form that is easily analyzed in R. The scripts used to scrape the data in their entirety, alongside a SQLite representation of the data, may be viewed at a second repository, springsteen_db.
You can install the released version of spRingsteen from CRAN with:
install.packages("spRingsteen")
Alternatively, you can install the development version of spRingsteen from GitHub like so:
remotes::install_github("obrienjoey/spRingsteen")
While the CRAN version of spRingsteen is updated every few months, the GitHub (development) version is updated daily. The update_data
function bridges this gap by refreshing the installed package with the most recent data available on GitHub:
library(spRingsteen)
update_data()
Note: you must restart the R session for the updated data to become available.
The package includes datasets covering the career of Bruce Springsteen. For example,
the touring history of Springsteen and his numerous bands is stored in concerts:
library(spRingsteen)
library(dplyr)

concerts

# how many concerts have occurred in each country?
concerts %>%
  count(country, sort = TRUE)
It also has information on the setlists performed at these shows, which is
stored in setlists.
setlists

# what song has been played most by Springsteen?
setlists %>%
  count(song, sort = TRUE)

# which song has most frequently opened a show?
setlists %>%
  filter(song_number == 1) %>%
  count(song, sort = TRUE) %>%
  slice(1)
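As a companion to the opener query, the same idea can be sketched for show closers. The snippet below uses an invented toy data frame (toy_setlists, with made-up gigs and song names) that borrows only the gig_key, song, and song_number columns seen in this README. It assumes the highest song_number within a gig_key marks the last song of the show, which is an assumption rather than something the package documentation confirms.

```r
library(dplyr)

# Toy stand-in for setlists: invented rows, not real setlist data.
toy_setlists <- data.frame(
  gig_key     = c("g1", "g1", "g1", "g2", "g2", "g3", "g3"),
  song_number = c(1, 2, 3, 1, 2, 1, 2),
  song        = c("Song A", "Song B", "Song C",
                  "Song A", "Song C",
                  "Song B", "Song A"),
  stringsAsFactors = FALSE
)

# Keep the last song of each gig (assumed: largest song_number),
# then count how often each song closed a show.
closers <- toy_setlists %>%
  group_by(gig_key) %>%
  filter(song_number == max(song_number)) %>%
  ungroup() %>%
  count(song, sort = TRUE)

closers
```

If the assumption holds for the real data, replacing toy_setlists with setlists should give the corresponding ranking of closing songs.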
Further details of the songs themselves are available in songs, including
the album on which each appears and, in some cases, the full lyrics. This allows for
some text mining or sentiment analysis using a package like tidytext.
library(tidytext)

# what word appears most frequently in the Born in the U.S.A. album?
songs %>%
  filter(album == "Born In The U.S.A.") %>%
  select(title, lyrics) %>%
  unnest_tokens(word, lyrics) %>%
  count(word, sort = TRUE) %>%
  anti_join(stop_words, by = 'word')
Lastly, the tours table contains the tour associated with each concert.
tours %>% count(tour, sort = TRUE)
Of course, the real advantage of this package lies in combining the different data frames to infer useful information:
# what was the most played song on each tour?
setlists %>%
  left_join(tours, by = 'gig_key') %>%
  count(song, tour) %>%
  group_by(tour) %>%
  filter(n == max(n)) %>%
  arrange(desc(tour))
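One detail of the per-tour query is worth illustrating: filter(n == max(n)) keeps every song tied for the top count, so a tour can contribute more than one row. The sketch below shows this on invented toy data frames (toy_setlists and toy_tours, with made-up gigs, songs, and tours) that reuse only the gig_key, song, and tour column names from the query. It also shows dplyr's slice_max() with with_ties = FALSE as one possible way to keep a single row per tour.

```r
library(dplyr)

# Toy stand-ins for setlists and tours: invented rows, not real data.
toy_setlists <- data.frame(
  gig_key = c("g1", "g1", "g2", "g2", "g3", "g4", "g4"),
  song    = c("Song A", "Song B", "Song A", "Song B",
              "Song C", "Song C", "Song D"),
  stringsAsFactors = FALSE
)
toy_tours <- data.frame(
  gig_key = c("g1", "g2", "g3", "g4"),
  tour    = c("Tour X", "Tour X", "Tour Y", "Tour Y"),
  stringsAsFactors = FALSE
)

# Play counts per song and tour.
plays <- toy_setlists %>%
  left_join(toy_tours, by = "gig_key") %>%
  count(song, tour)

# Same filter as the README query: ties are all kept.
with_ties <- plays %>%
  group_by(tour) %>%
  filter(n == max(n)) %>%
  ungroup()

# Alternative: exactly one row per tour, ties broken arbitrarily.
one_per_tour <- plays %>%
  group_by(tour) %>%
  slice_max(n, n = 1, with_ties = FALSE) %>%
  ungroup()

with_ties
one_per_tour
```

In the toy data, Song A and Song B are tied on Tour X, so with_ties has one more row than one_per_tour. Which form is preferable depends on whether ties are informative for the question being asked.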