Converts most common quanteda and tm corpus objects into a sento_corpus object. Appropriate available metadata is integrated as features; for a quanteda corpus, this can come from docvars(x), for a tm corpus, only meta(x, type = "indexed") metadata is considered.
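To see which metadata would be available to the conversion, it can help to inspect both sources before calling as.sento_corpus. The sketch below is illustrative only: the toy data frame, its column names and its values are assumptions, not part of the package data, and it only looks at the metadata without performing the conversion.

```r
library(quanteda)
library(tm)

# Toy data; column names and values are hypothetical.
df <- data.frame(
  doc_id = c("d1", "d2"),
  text = c("Stocks rallied today.", "Bond yields fell."),
  date = c("2020-01-01", "2020-01-02"),
  economy = c(1, 0),
  stringsAsFactors = FALSE
)

# quanteda: document-level variables, as returned by docvars(x)
qcorp <- quanteda::corpus(df, text_field = "text", docid_field = "doc_id")
qmeta <- quanteda::docvars(qcorp)
print(names(qmeta))

# tm: indexed metadata, as returned by meta(x, type = "indexed")
tcorp <- tm::VCorpus(tm::DataframeSource(df))
tmeta <- meta(tcorp, type = "indexed")
print(names(tmeta))
```

Columns that show up in neither of these two views are not expected to reach the resulting sento_corpus object as features.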
as.sento_corpus(x, dates = NULL, do.clean = FALSE)
x | a quanteda
dates | an optional sequence of dates as
do.clean | see
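The examples further down pass one date per document through the dates argument. As a minimal sketch of preparing such a vector with base R, assuming (this is not confirmed by the text above) that ISO "yyyy-mm-dd" character strings are acceptable; n_docs and the start date are hypothetical placeholders:

```r
# Hypothetical number of documents in the corpus to convert.
n_docs <- 5

# One consecutive calendar day per document, starting at an arbitrary date.
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = n_docs)

# Character representation in "yyyy-mm-dd" form (assumed format).
dates_chr <- format(dates, "%Y-%m-%d")
print(dates_chr)
```

In practice n_docs would be the length of the corpus being converted, so that every document receives exactly one date.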
A sento_corpus object, as returned by the sento_corpus function.
Samuel Borms
corpus, SimpleCorpus, VCorpus, sento_corpus
data("usnews", package = "sentometrics")
txt <- system.file("texts", "txt", package = "tm")
reuters <- system.file("texts", "crude", package = "tm")
# reshuffle usnews data.frame for use in quanteda and tm
dates <- usnews$date
usnews$wrong <- "notNumeric"
colnames(usnews)[c(1, 3)] <- c("doc_id", "text")
# conversion from a quanteda corpus
qcorp <- quanteda::corpus(usnews,
                         text_field = "text", docid_field = "doc_id")
corp1 <- as.sento_corpus(qcorp)
corp2 <- as.sento_corpus(qcorp, sample(dates)) # overwrites "date" column
# conversion from a tm SimpleCorpus corpus (DataframeSource)
tmSCdf <- tm::SimpleCorpus(tm::DataframeSource(usnews))
corp3 <- as.sento_corpus(tmSCdf)
# conversion from a tm SimpleCorpus corpus (DirSource)
tmSCdir <- tm::SimpleCorpus(tm::DirSource(txt))
corp4 <- as.sento_corpus(tmSCdir, dates[1:length(tmSCdir)])
# conversion from a tm VCorpus corpus (DataframeSource)
tmVCdf <- tm::VCorpus(tm::DataframeSource(usnews))
corp5 <- as.sento_corpus(tmVCdf)
# conversion from a tm VCorpus corpus (DirSource)
tmVCdir <- tm::VCorpus(tm::DirSource(reuters),
                       list(reader = tm::readReut21578XMLasPlain))
corp6 <- as.sento_corpus(tmVCdir, dates[1:length(tmVCdir)])