knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%", warning=FALSE, message=FALSE )
The goal of gpt2samples is to help users explore the sample texts generated by OpenAI's new GPT-2 transformer-based language model.
An original implementation of a smaller version of GPT-2 can be found here, and the original sample text files can be found here.
This package contains the following data, stored as tibbles:
|tibble |description |
|:--------------------|:-------------------------------------------------------------------------------------------------------------------------------------|
|conditional |Conditionally generated samples, with context prompts from WebText test corpus, default settings (temperature 1 and no truncation). |
|conditional_t07 |Conditionally generated samples, with context prompts from WebText test corpus, with temperature 0.7. |
|conditional_topk40 |Conditionally generated samples, with context prompts from WebText test corpus, with truncation and top_k 40. |
|unconditional |Unconditionally generated samples, default settings. |
|unconditional_t07 |Unconditionally generated samples, with temperature 0.7. |
|unconditional_topk40 |Unconditionally generated samples, with truncation and top_k 40. |
Additionally, all the generated samples (conditional and unconditional) can be explored by calling all_samples().
You can install the released version of gpt2samples from GitHub with:
    # install.packages("gpt2samples")
    # install.packages("devtools")
    devtools::install_github("kanishkamisra/gpt2samples")
This is a basic example to explore the data using dplyr verbs:

    library(dplyr)
    library(gpt2samples)

    conditional %>%
      filter(id == 100)

    unconditional_t07 %>%
      filter(id == 250)

    all_samples() %>%
      filter(file == "conditional") %>%
      tail()

    all_samples() %>%
      group_by(file) %>%
      summarise(total_lines = n())
Further exploration can use Julia Silge and David Robinson's tidytext package, among others, to analyze the text generated by GPT-2.
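As a minimal sketch of what such an analysis might look like, the snippet below counts word frequencies per file with tidytext. It uses a small hand-made tibble rather than the package data, and the `text` column name is an assumption (only `id` and `file` appear in the examples in this README), so adjust it to match the actual tibbles.

```r
library(dplyr)
library(tidytext)

# Toy stand-in for the package data (hypothetical rows).
# The `text` column name is an assumption, not confirmed by the package.
toy_samples <- tibble(
  id   = c(1L, 1L, 2L),
  file = c("conditional", "conditional", "unconditional"),
  text = c("The model writes text.", "The text is long.", "Another sample of text.")
)

# Split each line into lowercase word tokens, then count words per file.
word_counts <- toy_samples %>%
  unnest_tokens(word, text) %>%
  count(file, word, sort = TRUE)

word_counts
```

If the real tibbles follow the same shape, swapping `toy_samples` for `all_samples()` should give per-file word counts for the generated text.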
Please note that the 'gpt2samples' project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.