Converts the documents read using read_docs into two vectors: one for the document word instances (containing vocabulary ids) and one for the corresponding document ids.
vectorize_docs(docs)
docs    a list of documents, which is created using
A list of document and word instances
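To make the two-vector output concrete, here is a minimal sketch in plain R of what such a conversion could look like. It is not the package's implementation. It assumes that each document is a 2-row integer matrix (row 1 holds vocabulary ids, row 2 holds their counts) and that document ids start at 1; neither is confirmed by this page. The function name vectorize_docs_sketch and the element names wid and did are hypothetical.

```r
# Hypothetical stand-in for illustration only; not the package's code.
# Assumption: each document is a 2-row integer matrix,
#   row 1 = vocabulary ids, row 2 = occurrence counts.
vectorize_docs_sketch <- function(docs) {
  # Repeat each vocabulary id by its count: one entry per word instance.
  wid_per_doc <- lapply(docs, function(doc) rep(doc[1, ], times = doc[2, ]))
  # Number of word instances in each document.
  n_per_doc <- lengths(wid_per_doc)
  list(
    wid = unlist(wid_per_doc, use.names = FALSE),   # vocabulary id per word instance
    did = rep(seq_along(docs), times = n_per_doc)   # document id per word instance (assumed 1-based)
  )
}

# Toy corpus (made up for this sketch); matrices are filled column by column.
toy_docs <- list(
  matrix(c(0L, 2L,    # vocabulary id 0 occurs twice
           3L, 1L),   # vocabulary id 3 occurs once
         nrow = 2),
  matrix(c(1L, 1L,    # vocabulary id 1 occurs once
           3L, 2L),   # vocabulary id 3 occurs twice
         nrow = 2)
)

ds <- vectorize_docs_sketch(toy_docs)
print(ds$wid)
print(ds$did)
```

Under these assumptions the two vectors have equal length, with one entry per word instance, so position i of each vector describes the same token.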
This method is very time consuming for large datasets. Therefore, for Gibbs sampling, use functions such as lda_fgs_blei_corpus, which take docs as input and do the job of this function in the C++ programming language.
Other lda data preprocessing methods: calc_doc_lengths, read_docs
documents <- read_docs('bop.ldac')
ds <- vectorize_docs(documents)