bestDocs | R Documentation |
Find the documents in a corpus that contain the most high-frequency phrases, and return a corpus containing only those documents.
bestDocs(co, num = 3L, n = 10L, pd = NULL)
co | A corpus of documents |
num | Integer giving the number of documents to return |
n | Integer giving the number of high-frequency phrases to use |
pd | phraseDoc object for the corpus in co |
A corpus with the num documents that contain the most high-frequency phrases, in order of the number of high-frequency phrases.
The returned corpus has the meta field oldIdx set to the index of each document in the original corpus, and the meta field hfPhrases set to the number of high-frequency phrases each document contains.
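# Build a small corpus from four example sentences
v1 <- c("Here is some text to test phrase mining", "phrase mining is fun",
        "Some text is better than no text", "No text, no phrase mining")
co <- tm::VCorpus(tm::VectorSource(v1))
# Create the phraseDoc object for the corpus (minimum phrase frequency of 2)
pd <- phraseDoc(co, min.freq = 2)
# Return the 2 documents that contain the most of the 2 high-frequency phrases
bestDocs(co, num = 2, n = 2, pd = pd)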
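The meta fields oldIdx and hfPhrases described above can be inspected with tm's meta() accessor. The following is a minimal sketch, not part of the original page. It assumes that bestDocs and phraseDoc come from the phm package, and it does not assume whether the fields are stored per document or in the corpus-level indexed metadata, so it prints both.

```r
library(tm)
library(phm)  # assumption: the package that provides phraseDoc() and bestDocs()

v1 <- c("Here is some text to test phrase mining", "phrase mining is fun",
        "Some text is better than no text", "No text, no phrase mining")
co <- VCorpus(VectorSource(v1))
pd <- phraseDoc(co, min.freq = 2)

# Keep the 2 documents with the most of the 2 high-frequency phrases
best <- bestDocs(co, num = 2L, n = 2L, pd = pd)

# Document-level metadata of each returned document
for (i in seq_along(best)) {
  print(meta(best[[i]]))
}

# Indexed (per-document data frame) metadata held by the corpus
print(meta(best, type = "indexed"))
```

Whichever of the two outputs lists oldIdx can then be used to map each returned document back to its position in co.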