Analyze and have fun with the text from the best series of all time
You can install the released version of schrute from CRAN with:
install.packages("schrute")
The schrute package has one and only one purpose: share the complete script transcription for The Office (US) television show. Users are encouraged to use the tidy text data for exploration, learning and fun.
Check out the data like so:
library(schrute)
library(tibble)
tibble::glimpse(schrute::theoffice)
#> Rows: 55,130
#> Columns: 12
#> $ index <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16…
#> $ season <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ episode <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
#> $ episode_name <chr> "Pilot", "Pilot", "Pilot", "Pilot", "Pilot", "Pilot",…
#> $ director <chr> "Ken Kwapis", "Ken Kwapis", "Ken Kwapis", "Ken Kwapis…
#> $ writer <chr> "Ricky Gervais;Stephen Merchant;Greg Daniels", "Ricky…
#> $ character <chr> "Michael", "Jim", "Michael", "Jim", "Michael", "Micha…
#> $ text <chr> "All right Jim. Your quarterlies look very good. How …
#> $ text_w_direction <chr> "All right Jim. Your quarterlies look very good. How …
#> $ imdb_rating <dbl> 7.6, 7.6, 7.6, 7.6, 7.6, 7.6, 7.6, 7.6, 7.6, 7.6, 7.6…
#> $ total_votes <int> 3706, 3706, 3706, 3706, 3706, 3706, 3706, 3706, 3706,…
#> $ air_date <chr> "2005-03-24", "2005-03-24", "2005-03-24", "2005-03-24…
Or view the short vignette with:
vignette("theoffice")
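Once the data is loaded, a first exploration might look like the sketch below. It uses only base R and the columns shown in the glimpse output above (character, season, episode, imdb_rating). The object names line_counts, episodes and season_ratings are illustrative and not part of the package, and the code assumes each episode carries a single rating repeated on every row.

```r
library(schrute)

# Count spoken lines per character, most frequent speakers first.
line_counts <- sort(table(schrute::theoffice$character), decreasing = TRUE)
head(line_counts, 10)

# imdb_rating is repeated on every line of an episode, so reduce the data
# to one row per episode before averaging (assumption: one rating per episode).
episodes <- unique(schrute::theoffice[, c("season", "episode", "imdb_rating")])

# Mean IMDb rating per season.
season_ratings <- aggregate(imdb_rating ~ season, data = episodes, FUN = mean)
season_ratings
```

Collapsing to one row per episode first keeps episodes with many lines of dialogue from weighing more heavily in the season average.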
Julia Silge and David Robinson, creators of the tidytext package, both used the {schrute} package for a #TidyTuesday analysis. Watch their videos and learn from the masters.
This dataset is also available in Python and Julia.