tosca

Tools for Statistical Content Analysis created at TU Dortmund University.

About

tosca is a framework for statistical methods in content analysis. We offer a pipeline for preprocessing and modeling text corpora, using an interface to the implementation of Latent Dirichlet Allocation in the lda package. Plot routines for corpora both before and after modeling support the descriptive analysis of text corpora and topic models. The package also provides an implementation of Chang's intruder words and intruder topics, as well as reasoned sampling of text IDs to obtain effective sets of texts for human labeling/coding, with a view to accurately estimating precision and recall.
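
As a rough illustration of the preprocessing and modeling steps described above, the following sketch builds a tiny corpus, cleans it, and fits a small LDA model. The toy texts and metadata are made up for illustration. The function and argument names (textmeta, cleanTexts, makeWordlist, LDAprep, LDAgen, num.words) are assumptions based on the package documentation, not on this README, so check the help pages (e.g. ?LDAgen) before relying on them.

```r
library(tosca)

# Toy corpus (made up for illustration): three short texts plus metadata.
texts <- list(
  A = "Give a man a fish, and you feed him for a day.",
  B = "So long, and thanks for all the fish.",
  C = "Fisher enjoys a real mastery in evaluating complicated integrals."
)
meta <- data.frame(
  id = c("A", "B", "C"),
  title = c("Fishing", "Don't panic!", "Sir Ronald"),
  date = c("1885-01-02", "2010-12-24", "1980-09-12"),
  stringsAsFactors = FALSE
)
corpus <- textmeta(meta = meta, text = texts)

# Preprocessing with the package defaults (see ?cleanTexts).
corpusClean <- cleanTexts(object = corpus)

# Build the vocabulary and convert the texts to the format used by the lda package.
wordlist <- makeWordlist(corpusClean$text)
ldaPrep <- LDAprep(text = corpusClean$text, vocab = wordlist$words)

# Fit a small LDA model; K and num.words are kept tiny to suit the toy corpus.
LDA <- LDAgen(documents = ldaPrep, K = 3L, vocab = wordlist$words, num.words = 3)
```

A real analysis would use a much larger corpus and a larger number of topics K; the values here only keep the sketch fast.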

Installation
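
A minimal sketch of the usual installation routes, assuming the released version is on CRAN and the development version is in the Docma-TU/tosca repository on GitHub. The use of the remotes package for the development version is an assumption, not something this README prescribes.

```r
# Install the released version from CRAN.
install.packages("tosca")

# Alternatively, install the development version from GitHub
# (assumes the remotes package is already installed).
# remotes::install_github("Docma-TU/tosca")

# Load the package.
library(tosca)
```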

For examples of how to use tosca, see the Vignette.

Citation

For a BibTeX entry, please use citation(package = "tosca").

Contribution

This R package is licensed under the GPLv3. For feature requests, issues, and bug reports, please use the issue tracker.


Docma-TU/tosca documentation built on June 2, 2025, 3:11 a.m.