newsgroups: Topic modeling results from the "20 Newsgroups" data set.

newsgroups R Documentation

Topic modeling results from the “20 Newsgroups” data set.

Description

These are topic modeling results from the “20 Newsgroups” data, with k = 10 topics. The data were originally downloaded from http://qwone.com/~jason/20Newsgroups and prepared by running the code in an R Markdown file in this GitHub repository: https://github.com/stephenslab/fastTopics-experiments. See the “inst” directory of this package for the scripts used to generate these results.

Format

newsgroups is a list with the following elements:

topics

Original labeling of the documents: each document is from one of 20 “newsgroups”.

L

Estimated topic proportions matrix; rows are documents and columns are topics.

F

Matrix containing posterior mean estimates of log-fold changes (base-2 logarithm). These were computed using de_analysis with lfc.stat = "vsnull". Rows are words and columns are topics.

Examples

data(newsgroups)
table(newsgroups$topics)
dim(newsgroups$L)
dim(newsgroups$F)
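
As a further illustration, the sketch below cross-tabulates the original newsgroup labels against the topic with the largest estimated proportion in each document. The helper dominant_topic_table and the small toy list are hypothetical and are not part of fastTopics; the toy list only mimics the documented structure (a topics factor and a documents-by-topics matrix L), so the code runs without the package installed.

```r
# Hypothetical helper (not part of fastTopics): for each document, find
# the topic with the largest proportion in L, then tabulate that topic
# against the original document label.
dominant_topic_table <- function (x) {
  k <- ncol(x$L)
  # which.max returns the first column index attaining the row maximum.
  top <- apply(x$L, 1, which.max)
  # Use a factor so that every topic gets a column, even if it is never
  # the dominant topic.
  top <- factor(top, levels = seq_len(k))
  table(label = x$topics, topic = top)
}

# Toy stand-in with the same layout as the documented list elements:
# 4 documents, 2 labels, 2 topics. These values are made up.
toy <- list(topics = factor(c("a", "a", "b", "b")),
            L = rbind(c(0.9, 0.1),
                      c(0.8, 0.2),
                      c(0.3, 0.7),
                      c(0.4, 0.6)))
tab <- dominant_topic_table(toy)
print(tab)
```

Assuming the list elements are laid out as described under Format, calling dominant_topic_table(newsgroups) after data(newsgroups) should give a 20 x 10 table showing how the newsgroups distribute over the 10 topics. Taking the single largest proportion is a simplification; documents with mixed membership are assigned to only one topic.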


stephenslab/fastTopics documentation built on March 29, 2025, 3:24 p.m.