Convert a quanteda dfm or corpus object to a format usable by other packages. The general function convert provides easy conversion from a dfm to the document-term representations used in all other text analysis packages for which conversions are defined. For corpus objects, convert provides an easy way to turn a corpus and its document variables into a data.frame or JSON.
convert(x, to, ...)
## S3 method for class 'dfm'
convert(
  x,
  to = c("lda", "tm", "stm", "austin", "topicmodels", "lsa", "matrix", "data.frame",
    "tripletlist"),
  docvars = NULL,
  omit_empty = TRUE,
  docid_field = "doc_id",
  ...
)
## S3 method for class 'corpus'
convert(x, to = c("data.frame", "json"), pretty = FALSE, ...)
x | a dfm or corpus to be converted
to | target conversion format, one of: "lda", "tm", "stm", "austin", "topicmodels", "lsa", "matrix", "data.frame", or "tripletlist" for a dfm; "data.frame" or "json" for a corpus (as listed in the usage above)
... | unused directly
docvars | optional data.frame of document variables used as the …
omit_empty | logical; if …
docid_field | character; the name of the column containing document names used when …
pretty | adds indentation whitespace to JSON output. Can be TRUE/FALSE or a number specifying the number of spaces to indent. See …
A converted object determined by the value of to (see above). See conversion target package documentation for more detailed descriptions of the return formats.
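As a rough illustration of how the return type follows the value of to, the sketch below converts a small dfm to the "matrix" and "data.frame" targets. The toy texts and the column name "document" passed to docid_field are made up for this example, and the comments describe the behaviour I would expect from the usage above rather than guaranteed output.

```r
library(quanteda)

# Toy corpus; the document names and texts are illustrative only
corp <- corpus(c(d1 = "a b b c", d2 = "b c c d"))
dfmat <- dfm(tokens(corp))

# "matrix": a dense base-R matrix, one row per document, one column per feature
mat <- convert(dfmat, to = "matrix")
dim(mat)

# "data.frame": document names go into the column named by docid_field
# ("document" is an arbitrary name chosen for this sketch)
df <- convert(dfmat, to = "data.frame", docid_field = "document")
names(df)
```

The dense formats are convenient for small examples like this one; for a large dfm the sparse targets are likely the safer choice.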
## convert a dfm
corp <- corpus_subset(data_corpus_inaugural, Year > 1970)
dfmat1 <- dfm(corp)
# austin's wfm format
identical(dim(dfmat1), dim(convert(dfmat1, to = "austin")))
# stm package format
stmmat <- convert(dfmat1, to = "stm")
str(stmmat)
# triplet
tripletmat <- convert(dfmat1, to = "tripletlist")
str(tripletmat)
## Not run:
# tm's DocumentTermMatrix format
tmdfm <- convert(dfmat1, to = "tm")
str(tmdfm)
# topicmodels package format
str(convert(dfmat1, to = "topicmodels"))
# lda package format
str(convert(dfmat1, to = "lda"))
## End(Not run)
## convert a corpus into a data.frame
corp <- corpus(c(d1 = "Text one.", d2 = "Text two."),
               docvars = data.frame(dvar1 = 1:2, dvar2 = c("one", "two"),
                                    stringsAsFactors = FALSE))
convert(corp, to = "data.frame")
convert(corp, to = "json")