primary_topics: Document counts and mean topic shares of the three primary...

View source: R/topic-analysis.R

Description

primary_topics summarizes, for each topic, the number of documents in which that topic is one of the three primary topics, together with the corresponding mean topic share (gamma).

Usage

primary_topics(topicsByDocDate, minGamma = 0)

Arguments

topicsByDocDate

a dataframe as returned by topics_by_doc_date

minGamma

the minimum share a topic must have in a document to be considered when summarizing primary topic information; topics with a smaller share in an individual document are ignored when summarizing the document counts and mean topic shares. (In an stm topic model, the likelihood that a document is generated from a topic is expressed by the value gamma.) The default is 0, which ensures that three topics are included for each document.
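
The effect of minGamma on a single document can be sketched as follows. This is a minimal illustration in base R and not the package's implementation; the gamma values, the topic names and the helper keep_primary are hypothetical.

```r
# Hypothetical topic shares (gamma) of one document across five topics
gamma <- c(t1 = 0.55, t2 = 0.25, t3 = 0.12, t4 = 0.05, t5 = 0.03)

# The three primary topics are those with the largest shares
top3 <- sort(gamma, decreasing = TRUE)[1:3]

# Illustrative helper: keep only primary topics whose share reaches the threshold
keep_primary <- function(top, min_gamma = 0) top[top >= min_gamma]

primary_default <- names(keep_primary(top3))                  # threshold 0
primary_strict  <- names(keep_primary(top3, min_gamma = 0.2)) # threshold 0.2
```

With a threshold of 0 all three primary topics are retained, while a threshold of 0.2 drops the third topic, so this document would not contribute to that topic's n_docs_3 count.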

Value

a dataframe with 7 columns where:

topic_id

a topic ID as provided as an input in topicsByDocDate

n_docs_1

number of documents where topic_id has the largest probability

n_docs_2

number of documents where topic_id has the second largest probability

n_docs_3

number of documents where topic_id has the third largest probability

mean_gamma_1

mean probability (gamma) of topic_id across the documents counted in n_docs_1

mean_gamma_2

mean probability (gamma) of topic_id across the documents counted in n_docs_2

mean_gamma_3

mean probability (gamma) of topic_id across the documents counted in n_docs_3
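
To make the shape of the returned dataframe concrete, the summary described above can be approximated in base R. This is a sketch under stated assumptions, not the code in R/topic-analysis.R: the input column names document, topic_id and gamma are assumed for illustration, and summarize_primary is a hypothetical helper. How the package reports topics that never occur at a given rank is not documented here; the sketch yields a count of 0 and a mean of NaN in that case.

```r
# Hypothetical input; the column names document, topic_id and gamma are
# assumptions for illustration, not the documented input format.
topic_shares <- data.frame(
  document = rep(c("d1", "d2"), each = 4),
  topic_id = rep(1:4, times = 2),
  gamma    = c(0.50, 0.30, 0.15, 0.05,
               0.10, 0.20, 0.30, 0.40)
)

# Illustrative helper (not part of topicsplorrr)
summarize_primary <- function(shares, min_gamma = 0) {
  # Rank topics within each document, largest gamma first
  shares$rank <- ave(-shares$gamma, shares$document,
                     FUN = function(x) rank(x, ties.method = "first"))
  # Keep the three primary topics that reach the minimum share
  top <- shares[shares$rank <= 3 & shares$gamma >= min_gamma, ]

  topic_ids <- sort(unique(shares$topic_id))
  result <- data.frame(topic_id = topic_ids)
  # Document counts per topic at rank 1, 2 and 3
  for (k in 1:3) {
    at_rank <- top[top$rank == k, ]
    result[[paste0("n_docs_", k)]] <- vapply(
      topic_ids, function(id) sum(at_rank$topic_id == id), integer(1))
  }
  # Mean gamma per topic at rank 1, 2 and 3
  for (k in 1:3) {
    at_rank <- top[top$rank == k, ]
    # mean() of an empty selection yields NaN for topics never at rank k
    result[[paste0("mean_gamma_", k)]] <- vapply(
      topic_ids, function(id) mean(at_rank$gamma[at_rank$topic_id == id]),
      numeric(1))
  }
  result
}

summary_default <- summarize_primary(topic_shares)
summary_strict  <- summarize_primary(topic_shares, min_gamma = 0.2)
```

Both results have one row per topic and the seven columns listed above. In summary_strict, the third-ranked topic of document d1 (gamma 0.15) falls below the threshold and is no longer counted in n_docs_3.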


sdaume/topicsplorrr documentation built on Dec. 22, 2021, 11:11 p.m.