URISource: Uniform Resource Identifier Source

View source: R/source.R

URISource R Documentation

Description

Create a uniform resource identifier source.

Usage

URISource(x, encoding = "", mode = "text")

Arguments

x

A character vector of uniform resource identifiers (URIs).

encoding

A character string describing the current encoding. It is passed to iconv to convert the input to UTF-8.

mode

A character string specifying whether and how URIs should be read in. Available modes are:

""

Nothing is read. In this case getElem and pGetElem deliver only the URIs.

"binary"

URIs are read in binary raw mode (via readBin).

"text"

URIs are read as text (via readLines).
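The difference between the modes can be seen by stepping through a source by hand. The following is a minimal sketch, not part of the original help page: it assumes that open, stepNext, and getElem from tm's Source interface behave as described in the Source help page, and that getElem returns a list with a uri element and a content element. The variable names are illustrative only.

```r
library(tm)

# Sample text shipped with tm, addressed as a file:// URI (as in the Examples)
path <- system.file("texts", "loremipsum.txt", package = "tm")
uri <- sprintf("file://%s", path)

# mode = "": nothing is read, so only the URI is of interest
src_none <- stepNext(open(URISource(uri, mode = "")))
elem_none <- getElem(src_none)
elem_none$uri

# mode = "text": the content is read line by line (via readLines)
src_text <- stepNext(open(URISource(uri, mode = "text")))
elem_text <- getElem(src_text)
head(elem_text$content, 2)
```

With mode = "" the caller is responsible for fetching the document behind each URI, which can be useful when a custom reader does its own I/O.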

Details

A uniform resource identifier source interprets each URI as a document.

Value

An object inheriting from URISource, SimpleSource, and Source.
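As a quick check of the class hierarchy stated above, the sketch below builds a source from one of the sample files shipped with tm and tests each class with base R's inherits. The variable names are illustrative only.

```r
library(tm)

# Build a source from a sample text shipped with tm
path <- system.file("texts", "loremipsum.txt", package = "tm")
us <- URISource(sprintf("file://%s", path))

# One logical value per class named in the Value section
sapply(c("URISource", "SimpleSource", "Source"),
       function(cl) inherits(us, cl))
```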

See Also

Source for basic information on the source infrastructure employed by package tm.

Encoding and iconv on encodings.

Examples

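library(tm)
# Locate two sample texts shipped with tm
loremipsum <- system.file("texts", "loremipsum.txt", package = "tm")
ovid <- system.file("texts", "txt", "ovid_1.txt", package = "tm")
# Each file:// URI becomes one document in the corpus
us <- URISource(sprintf("file://%s", c(loremipsum, ovid)))
inspect(VCorpus(us))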

tm documentation built on Sept. 11, 2024, 6:47 p.m.