knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(readthat)

readthat

CRAN status Lifecycle: experimental Travis build status Codecov test coverage

Quickly read text/source from local files and web pages.

Installation

You can install the development released version of readthat from Github with:

remotes::install_github("mkearney/readthat")

Examples

Let's say we want to read-in the source of the following websites:

## a vector of URLs
urls <- c(
  "https://mikewk.com",
  "https://cnn.com",
  "https://www.cnn.com/us"
)

Use readthat::read() to read the text/source of a single file/URL

## read single web/file (returns text vector)
x <- read(urls[1])

## preview output
substr(x, 1, 60)

## use apply functions to read multiple pages
xx <- sapply(urls, read)

## preview output
lapply(xx, substr, 1, 60)

Comparisons

Benchmark comparison for reading a text file:

## save a text file
writeLines(read(urls[1]), x <- tempfile())

## coompare read times
bm_file <- bench::mark(
  readr = readr::read_lines(x),
  readthat = read(x),
  readLines = readLines(x),
  check = FALSE
)

## view results
bm_file
p1 <- ggplot2::autoplot(bm_file)
p1 + ggplot2::ggsave("man/figures/README-bm_file.png",
    width = 9, height = 5, units = "in")

Benchmark comparison for reading a web page:

x <- "https://www.espn.com/nfl/scoreboard"
bm_html <- bench::mark(
  httr = httr::content(httr::GET(x), as = "text", encoding = "UTF-8"),
  xml2 = xml2::read_html(x),
  readthat = read(x),
  readLines = readLines(x, warn = FALSE),
  readr = readr::read_lines(x),
  check = FALSE,
  iterations = 25,
  filter_gc = TRUE
)
bm_html
p2 <- ggplot2::autoplot(bm_html)
p2 + ggplot2::ggsave("man/figures/README-bm_html.png",
    width = 9, height = 5, units = "in")



mkearney/readthat documentation built on Nov. 4, 2019, 7 p.m.