| top_docs | R Documentation | 
Extracts a data frame of the documents that score highest in each topic. Documents are represented as numeric indices. Scoring is based on the document-topic matrix, but care is needed when one document has more of its words assigned to a given topic, yet a smaller proportion of that topic, than some other, shorter document. By default, all documents are normalized to length 1 before ranking.
top_docs(m, n, ...)
| m | 
 | 
| n | number of top documents to extract | 
| weighting | a function to transform the document-topic matrix. By
default  | 
Note also that a topic may reach its maximum proportion in a document even if
that document has a yet larger proportion of another topic. To adjust the
scoring, pass a function to transform the document-topic matrix in the
weighting parameter. If you wish to use raw weights rather than
proportions to rank documents, set weighting=identity. Raw weights
give longer documents an unfair advantage, whereas proportions often give
shorter documents an advantage (because short documents tend to be dominated
by single topics in LDA).
TODO: alternative scoring methods.
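The trade-off between proportions and raw weights described above can be seen on a toy matrix. This is a minimal sketch in base R, not the package's implementation: the matrix values are made up, and row_normalize and rank_docs are hypothetical helpers that only mimic the shape of the returned data frame (columns topic, doc, weight).

```r
# Toy document-topic matrix of raw word counts: 3 documents x 2 topics.
# (Illustrative values, not output from a real model.)
dtm <- matrix(c(40, 60,    # doc 1: long document, 40% topic 1
                 9,  1,    # doc 2: short document, 90% topic 1
                10, 90),   # doc 3: long document, 10% topic 1
              nrow = 3, byrow = TRUE)

# Hypothetical stand-in for a weighting function: scale each row to sum to 1.
row_normalize <- function(x) x / rowSums(x)

# Hypothetical helper: for each topic, the indices of the n highest-scoring
# documents after applying `weighting` to the matrix.
rank_docs <- function(x, n, weighting = row_normalize) {
    w <- weighting(x)
    do.call(rbind, lapply(seq_len(ncol(w)), function(k) {
        top <- order(w[, k], decreasing = TRUE)[seq_len(n)]
        data.frame(topic = k, doc = top, weight = w[top, k])
    }))
}

by_prop <- rank_docs(dtm, 1)                        # rank by proportion
by_raw  <- rank_docs(dtm, 1, weighting = identity)  # rank by raw count
```

With these values, the short document 2 ranks first for topic 1 by proportion, while the longer document 1 ranks first by raw count; document 3 ranks first for topic 2 under both weightings.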
a data frame with three columns: topic; doc, the
numerical index of the document in doc_ids(m); and
weight, the weight used in ranking (topic proportion, raw score,
...)
doc_topics, dt_smooth_normalize
## Not run: 
library(dplyr)  # provides %>%, filter, select, mutate
# obtain citations for the 3 documents with the highest proportions of topic 4
top_docs(m, 3) %>%
    filter(topic == 4) %>%
    select(-topic) %>%
    mutate(citation=cite_articles(metadata(m)[doc, ]))
## End(Not run)