st_transcripts: Import transcripts
In rtrek: Data Analysis Relating to Star Trek

st_transcripts

R Documentation

Import transcripts

Description

Download a curated data frame, built from episode and movie transcripts, that contains metadata and variables for analyzing scenes, character presence, dialog, sentiment, etc.

Usage

st_transcripts(type = c("clean", "raw"))

Arguments

type

character, "clean" (the default) for a curated nested data frame or "raw" for unprocessed text. See Details.

Details

The data frame contains metadata associated with each transcript, one row per episode or movie. It also contains a list column. By default (type = "clean"), each element of this column is a nested data frame of preprocessed text split into several variables, including the speaking character, the line spoken, scene descriptions, etc. For the raw text version (type = "raw"), each element of the list column is a vector of unprocessed plain text.
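As a minimal sketch of working with this structure, the list column can be located by type rather than by name, since its name is not stated here. The helper list_columns and the toy data frame below (including its column names character and line) are hypothetical stand-ins and not part of rtrek. The commented lines at the end assume rtrek is installed and a network connection is available.

```r
# Hypothetical helper (not part of rtrek): names of all list columns in a data frame.
list_columns <- function(df) {
  names(df)[vapply(df, is.list, logical(1))]
}

# Toy stand-in with the same general shape: one row per transcript,
# a metadata column, and one list column of nested data frames.
toy <- data.frame(title = c("Episode A", "Episode B"), stringsAsFactors = FALSE)
toy$text <- list(
  data.frame(character = c("X", "Y"), line = c("Hello.", "Hi."),
             stringsAsFactors = FALSE),
  data.frame(character = "X", line = "Goodbye.", stringsAsFactors = FALSE)
)

list_columns(toy)   # name(s) of the list column(s)
toy$text[[1]]       # nested data frame for the first row

# With the real data (assumes rtrek is installed and a download is possible):
# x <- rtrek::st_transcripts()
# first_transcript <- x[[list_columns(x)[1]]][[1]]
```

Selecting by type keeps the sketch independent of the actual column names, which should be checked with names() on the returned tibble.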

Metadata includes the format (episode or movie), series, season, overall episode number, title, production order and original airdate, where available and applicable. The two columns url and url2 show where the source material can be browsed online, though not in a format useful for data analysis. The source in url is used where possible because it contains more complete, higher quality data. When necessary, the derived data is based on text from the alternate source in url2.

The dataset is carefully curated, but imperfect. There are text-parsing edge cases that are difficult to handle generally, and the quality varies substantially across series. Entries assembled from original transcripts are more informative, but original transcripts are not universally available. Other episodes are based on transcripts derived from closed captioning, in which case more fields will contain NA values.
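One way to gauge how much a given transcript is affected is to compute the share of NA values per variable in its nested data frame. The following is a sketch only: na_share is a hypothetical helper, and the toy data frame and its column names are illustrative assumptions rather than the actual rtrek variables.

```r
# Hypothetical helper (not part of rtrek): fraction of NA values in each column.
na_share <- function(df) {
  colMeans(is.na(df))
}

# Toy nested data frame resembling a caption-derived transcript,
# where the speaking character is often unknown.
episode <- data.frame(
  character = c("X", NA, NA),
  line = c("First line.", "Second line.", "Third line."),
  stringsAsFactors = FALSE
)

na_share(episode)
```

Applying such a helper to each element of the list column (for example with lapply) would give a rough per-transcript quality indicator.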

This function downloads and returns a sizable tibble data frame. Each version (clean or raw) is about 13-15 MB compressed. The returned tibble contains only 726 rows (716 episodes and 10 movies), but each row holds nested data, so the full object is much larger than the row count suggests.
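Because of the download size, it may be convenient to keep a local copy between sessions. The sketch below uses base R's saveRDS and readRDS; the helper cached and the file name are hypothetical, and whether rtrek performs any caching of its own is not stated here.

```r
# Hypothetical helper (not part of rtrek): read an object from `path` if the
# file exists; otherwise call `fetch()`, save the result to `path`, and return it.
cached <- function(path, fetch) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  x <- fetch()
  saveRDS(x, path)
  x
}

# Demonstration with a cheap stand-in for the download.
demo_path <- tempfile(fileext = ".rds")
first <- cached(demo_path, function() data.frame(a = 1:3))   # calls fetch, writes file
second <- cached(demo_path, function() stop("not called"))   # reads the saved file

# With the real data (assumes rtrek is installed and a download is possible):
# x <- cached("st_transcripts_clean.rds",
#             function() rtrek::st_transcripts(type = "clean"))
```

Delete the saved file to force a fresh download, for instance after updating the package.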

Value

a tibble data frame

Examples

## Not run: 
stTranscripts <- st_transcripts()
## End(Not run)

rtrek documentation built on June 20, 2025, 1:08 a.m.
