BTM: Biterm Topic Models for Short Text

Biterm Topic Models find topics in collections of short texts. The approach is a word co-occurrence based topic model that learns topics by modeling word-word co-occurrence patterns, which are called biterms. This is in contrast to traditional topic models such as Latent Dirichlet Allocation and Probabilistic Latent Semantic Analysis, which are word-document co-occurrence topic models. A biterm consists of two words co-occurring in the same short text window. This context window can be, for example, a Twitter message, a short answer on a survey, a sentence of a text, or a document identifier. The techniques are explained in detail in the paper 'A Biterm Topic Model For Short Text' by Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Xueqi Cheng (2013) <https://github.com/xiaohuiyan/xiaohuiyan.github.io/blob/master/paper/BTM-WWW13.pdf>.
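
To make the notion of a biterm concrete, the following is a minimal base R sketch that lists every unordered word pair in one short text, treating the whole text as a single context window. The helper name extract_biterms and the example tokens are hypothetical and are not part of the BTM package, which builds biterms internally.

```r
# Hypothetical helper (not part of the BTM package): list all unordered
# word pairs (biterms) found in a single short text.
extract_biterms <- function(tokens) {
  # A text with fewer than two tokens contains no biterms.
  if (length(tokens) < 2) {
    return(data.frame(term1 = character(0), term2 = character(0),
                      stringsAsFactors = FALSE))
  }
  # combn() returns one column per pair; transpose to one row per pair.
  pairs <- t(combn(tokens, 2))
  data.frame(term1 = pairs[, 1], term2 = pairs[, 2],
             stringsAsFactors = FALSE)
}

# Example short text, already tokenised and with stopwords removed.
tokens  <- c("visit", "apple", "store")
biterms <- extract_biterms(tokens)
print(biterms)
```

A text of n tokens yields n * (n - 1) / 2 biterms under this whole-text window, so the three tokens above give three biterms.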

Package details

Author: Jan Wijffels [aut, cre, cph] (R wrapper), BNOSAC [cph] (R wrapper), Xiaohui Yan [ctb, cph] (BTM C++ library)
Maintainer: Jan Wijffels <jwijffels@bnosac.be>
License: Apache License 2.0
Version: 0.3.7
URL: https://github.com/bnosac/BTM
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("BTM")
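
Once the package is installed, a model can be fitted along the following lines. This is a minimal sketch rather than a recommended workflow: the toy data frame, the number of topics, and the iteration count are illustrative assumptions. The input is assumed to be a tokenised data frame whose first column is the context identifier and whose second column is the token.

```r
library(BTM)

# Toy tokenised data (assumption for illustration): one row per token,
# first column = context identifier, second column = token.
x <- data.frame(
  doc_id = c("d1", "d1", "d1", "d2", "d2", "d2", "d3", "d3", "d3"),
  token  = c("apple", "store", "phone", "apple", "fruit", "juice",
             "phone", "store", "screen"),
  stringsAsFactors = FALSE
)

set.seed(123)
# k (number of topics) and iter (Gibbs iterations) are arbitrary small values.
model <- BTM(x, k = 2, iter = 50)

# Most probable tokens per topic.
topic_terms <- terms(model, top_n = 3)
print(topic_terms)

# Topic scores for each context identifier.
scores <- predict(model, newdata = x)
print(scores)
```

On real data the number of topics and iterations would normally be much larger, and the tokens would usually be filtered first (for example to nouns or lemmas).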

BTM documentation built on Feb. 16, 2023, 10:14 p.m.