DataframeSource | R Documentation |
Create a data frame source.
DataframeSource(x)
x | A data frame giving the texts and metadata. |
A data frame source interprets each row of the data frame x as a document. The first column must be named "doc_id" and contain a unique string identifier for each document. The second column must be named "text" and contain a UTF-8 encoded string representing the document's content. Optional additional columns are used as document level metadata.
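The layout requirements above can be checked before a data frame is handed to DataframeSource. The following is a minimal sketch in base R only; the helper check_source_layout and the example column author are hypothetical names introduced for illustration and are not part of package tm. It reports layout problems and shows one way to move "doc_id" and "text" to the front so that the remaining columns are treated as metadata.

```r
# Hypothetical helper (not part of tm): checks the layout described above
# and returns a character vector of problems (empty if none were found).
check_source_layout <- function(x) {
  problems <- character(0)
  if (!is.data.frame(x)) {
    return("x is not a data frame")
  }
  if (ncol(x) < 2L || !identical(names(x)[1:2], c("doc_id", "text"))) {
    problems <- c(problems,
                  'first two columns must be named "doc_id" and "text"')
  }
  if ("doc_id" %in% names(x) && anyDuplicated(x[["doc_id"]]) > 0L) {
    problems <- c(problems, "doc_id values are not unique")
  }
  if ("text" %in% names(x) && is.character(x[["text"]]) &&
      !all(validUTF8(x[["text"]]))) {
    problems <- c(problems, "text contains strings that are not valid UTF-8")
  }
  problems
}

# Example input with the columns in the wrong order:
# a metadata column first, then text, then doc_id.
raw <- data.frame(author = c("a", "b"),
                  text = c("First text.", "Second text."),
                  doc_id = c("doc_1", "doc_2"),
                  stringsAsFactors = FALSE)
bad <- check_source_layout(raw)

# Move doc_id and text to the front; the remaining columns follow
# and would be used as document level metadata.
fixed <- raw[, c("doc_id", "text", setdiff(names(raw), c("doc_id", "text"))),
             drop = FALSE]
ok <- check_source_layout(fixed)
```

Here bad holds a single message about the column order, while ok is empty, so fixed has the layout that DataframeSource expects.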
An object inheriting from DataframeSource, SimpleSource, and Source.
Source for basic information on the source infrastructure employed by package tm, and meta for types of metadata.
readtext for reading in a text in multiple formats suitable to be processed by DataframeSource.
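library(tm)

# One row per document: doc_id and text first, then metadata columns.
docs <- data.frame(doc_id = c("doc_1", "doc_2"),
                   text = c("This is a text.", "This is another one."),
                   dmeta1 = 1:2, dmeta2 = letters[1:2],
                   stringsAsFactors = FALSE)
# Create the source; the outer parentheses also print it.
(ds <- DataframeSource(docs))
# Build a corpus from the source and look at its content and metadata.
x <- Corpus(ds)
inspect(x)
meta(x)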