Creating and accessing sources.
SimpleSource(encoding = "",
             length = 0,
             position = 0,
             reader = readPlain,
             ...,
             class)
getSources()
## S3 method for class 'SimpleSource'
close(con, ...)
## S3 method for class 'SimpleSource'
eoi(x)
## S3 method for class 'DataframeSource'
getMeta(x)
## S3 method for class 'DataframeSource'
getElem(x)
## S3 method for class 'DirSource'
getElem(x)
## S3 method for class 'URISource'
getElem(x)
## S3 method for class 'VectorSource'
getElem(x)
## S3 method for class 'XMLSource'
getElem(x)
## S3 method for class 'SimpleSource'
length(x)
## S3 method for class 'SimpleSource'
open(con, ...)
## S3 method for class 'DataframeSource'
pGetElem(x)
## S3 method for class 'DirSource'
pGetElem(x)
## S3 method for class 'URISource'
pGetElem(x)
## S3 method for class 'VectorSource'
pGetElem(x)
## S3 method for class 'SimpleSource'
reader(x)
## S3 method for class 'SimpleSource'
stepNext(x)
x | A Source.
con | A Source.
encoding | a character giving the encoding of the elements delivered by the source.
length | a non-negative integer denoting the number of elements delivered by the source. If the length is unknown in advance set it to 0.
position | a numeric indicating the current position in the source.
reader | a reader function (generator).
... | For SimpleSource tag-value pairs for storing additional information; not used otherwise.
class | a character vector giving additional classes to be used for the created source.
Sources abstract input locations, like a directory, a connection, or simply an R vector, in order to acquire content in a uniform way. In packages which employ the infrastructure provided by package tm, such sources are represented via the virtual S3 class Source: such packages then provide S3 source classes extending the virtual base class (such as DirSource provided by package tm itself).
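As a minimal sketch of the class hierarchy described above, assuming package tm is installed, the following creates a source over an in-memory character vector and inspects its classes. The sample strings are made up for illustration.

```r
library(tm)

# A source over an R character vector; each element is treated as one document.
src <- VectorSource(c("first document", "second document"))

# The object carries its own class plus the SimpleSource and Source classes.
print(class(src))
print(inherits(src, "Source"))

# Names of the source classes shipped with the installed version of tm.
print(getSources())
```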
All extension classes must provide implementations for the functions close, eoi, getElem, length, open, reader, and stepNext. For parallel element access the (optional) function pGetElem must be provided as well. If document-level metadata is available, the (optional) function getMeta must be implemented.
The functions open and close open and close the source, respectively. eoi indicates end of input. getElem fetches the element at the current position, whereas pGetElem retrieves all elements in parallel at once. The function length gives the number of elements. reader returns a default reader for processing elements. stepNext increases the position in the source to acquire the next element.
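The access functions above can be combined into a sequential traversal. The loop below is a sketch under the assumption that the source starts at position 0 (the SimpleSource default), so stepNext is called before each getElem; the variable names and sample strings are illustrative only.

```r
library(tm)

src <- VectorSource(c("alpha", "beta", "gamma"))

src <- open(src)                 # open() returns the opened source
contents <- character(0)
while (!eoi(src)) {              # stop once the end of input is reached
  src <- stepNext(src)           # advance to the next element (position starts at 0)
  elem <- getElem(src)           # named list with components content and uri
  contents <- c(contents, elem$content)
}
src <- close(src)                # close() returns the closed source

print(contents)
```

Because sources are plain R values, stepNext, open, and close return a modified copy, which is why the result is assigned back to src at each step.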
The function SimpleSource provides a simple reference implementation and can be used when creating custom sources.
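A custom source built on SimpleSource might look like the sketch below. LineSource and its content field are hypothetical names chosen for this example and are not part of tm; the sketch relies on the methods tm provides for class SimpleSource (length, eoi, stepNext, open, close, reader) and supplies only getElem.

```r
library(tm)

# Hypothetical constructor: every line of a multi-line string is one element.
LineSource <- function(text) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  # Extra tag-value pairs (here: content) are stored in the source object.
  SimpleSource(length = length(lines), content = lines, class = "LineSource")
}

# getElem method for the hypothetical class: element at the current position.
getElem.LineSource <- function(x) {
  list(content = x$content[x$position], uri = NULL)
}

src <- LineSource("one\ntwo\nthree")
print(length(src))

src <- stepNext(src)             # move to the first element
print(getElem(src)$content)
```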
For SimpleSource, an object inheriting from class, SimpleSource, and Source.
For getSources, a character vector with sources provided by package tm.
open and close return the opened and closed source, respectively.
For eoi, a logical indicating if the end of input of the source is reached.
For getElem, a named list with the components content, holding the document, and uri, giving a uniform resource identifier (e.g., a file path or URL; NULL if not applicable or unavailable). For pGetElem, a list of such named lists.
For length, an integer for the number of elements.
For reader, a function for the default reader.
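The return values listed above can be inspected directly. This is a small sketch assuming package tm is installed; the two-element vector is an arbitrary example.

```r
library(tm)

src <- VectorSource(c("a", "b"))

# getElem: one named list for the element at the current position.
elem <- getElem(stepNext(src))
print(names(elem))
print(is.null(elem$uri))         # a vector source has no URI to report

# pGetElem: a list with one such named list per element.
all_elems <- pGetElem(src)
print(length(all_elems))

# length and reader: element count and the default reader function.
print(length(src))
print(is.function(reader(src)))
```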
DataframeSource, DirSource, URISource, VectorSource, and XMLSource.