A subset of the Cora dataset of scientific documents.
A collection of 2410 scientific documents in LDA format with links and titles from the Cora search engine.
The data comprise a corpus of 2410 documents conforming to the LDA format.
cora.titles is a character vector of titles, one for each
document in the corpus.
cora.cites is a list representing the citations between the
documents in the collection (see related for format).
Automating the construction of internet portals with machine learning. McCallum et al. Information Retrieval. 2000.
lda.collapsed.gibbs.sampler for the format of the corpus.
rtm.collapsed.gibbs.sampler for the format of the citation links.
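
The layout described above can be sketched with toy data in base R. This is an illustration only. It assumes the conventions documented for lda.collapsed.gibbs.sampler and rtm.collapsed.gibbs.sampler: each document is a two-row integer matrix (row 1 holds 0-indexed vocabulary indices, row 2 holds counts), and each entry of the citation list is an integer vector of 0-indexed document indices. The objects vocab, documents, titles and cites are hypothetical stand-ins with invented values, not the Cora data.

```r
# Toy stand-ins; every value here is invented for illustration.
vocab <- c("learning", "network", "model")

# Assumed LDA format: one two-row integer matrix per document.
documents <- list(
  rbind(c(0L, 2L),   # row 1: 0-indexed positions in vocab
        c(1L, 3L)),  # row 2: how often each word occurs
  rbind(c(1L),
        c(2L))
)

# One title per document, in the same order as the documents.
titles <- c("Toy paper A", "Toy paper B")

# Assumed link format: entry i lists the 0-indexed documents cited by document i.
cites <- list(1L, integer(0))

# Decode the first document: shift indices by one because R vectors are 1-indexed.
doc <- documents[[1]]
words <- vocab[doc[1, ] + 1]
word_counts <- setNames(doc[2, ], words)

# Look up the titles of the documents cited by the first document.
cited_titles <- titles[cites[[1]] + 1]

print(word_counts)
print(cited_titles)
```

If the real data follow the same conventions, the same two index shifts should apply to the Cora corpus, cora.titles and cora.cites once they are loaded.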