read_ldnws: Read the Livedoor News Corpus

View source: R/ldnws-reader.R

read_ldnws R Documentation

Read the Livedoor News Corpus

Description

Download and read the Livedoor News Corpus. The result of this function is memoised with memoise::memoise internally.

Usage

read_ldnws(
  url = "https://www.rondhuit.com/download/ldcc-20140209.tar.gz",
  exdir = tempdir(),
  keep = ldnws_categories(),
  collapse = "\n\n",
  include_title = TRUE
)

Arguments

url

String. URL of the corpus archive to download. If NULL, the function skips downloading the file.

exdir

String. Path to the directory where the text files are temporarily extracted (untarred).

keep

Character vector. Categories to parse and keep in the resulting data frame.

collapse

String with which base::paste collapses lines.

include_title

Logical. Whether to include the title in the text body field. Defaults to TRUE.
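
As a rough illustration of the collapse argument: assuming the lines of each article are joined with base::paste as described above, the default separator would behave like the sketch below. The lines vector is made up for illustration and is not taken from the corpus.

```r
# Hypothetical article lines (illustrative only, not real corpus data).
lines <- c("first line", "second line")

# Join the lines with the default separator used by read_ldnws().
joined <- paste(lines, collapse = "\n\n")

# cat() prints the two lines separated by one blank line.
cat(joined)
```

Passing a different string, such as collapse = " ", would presumably put each article's text on a single line instead.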

Details

This function downloads the Livedoor News Corpus and parses it into a tibble. For details about the Livedoor News Corpus, please see this page.

Value

A tibble.
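
Examples

The following is a minimal usage sketch, not an official example. It assumes the ldccr package is installed, that network access is available for the download, and that ldnws_categories() returns a character vector of category names (as its use as the default for keep suggests). The object names corpus and two_categories are illustrative.

```r
# Assumption: ldccr is installed, for example from GitHub with
# remotes::install_github("paithiov909/ldccr")
library(ldccr)

# Download the archive from the default URL, extract it into tempdir(),
# and parse every category. This needs network access and may take a while.
corpus <- read_ldnws()

# Inspect the structure of the returned tibble.
str(corpus)

# Keep only the first two categories and leave the title out of the text body.
two_categories <- read_ldnws(
  keep = ldnws_categories()[1:2],
  include_title = FALSE
)
```

Because the result is memoised, repeating a call with the same arguments in one session should return the cached tibble rather than parsing the files again.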


paithiov909/ldccr documentation built on Oct. 14, 2024, 3:44 a.m.