Man pages for Docma-TU/tmT
Text Mining Tools For News Corpora Analysis

cleanTexts                 Removes punctuation, numbers and stopwords, changes letters...
clusterTopics              Cluster Analysis
deleteAndRenameDuplicates  Deletes and Renames Articles with the same ID
duplist                    Creating List of Duplicates
filterCount                Subcorpus With Count Filter
filterDate                 Subcorpus With Date Filter
filterWord                 Subcorpus With Word Filter
intruderTopics             Function to validate the fit of the LDA model
intruderWords              Function to validate the fit of the LDA model
LDAgen                     Function to fit LDA model
LDAprep                    Create Lda-ready Dataset
makeWordlist               Counts Words in Text Corpora
mergeLDA                   Preparation of Different LDAs For Clustering
mergeTextmeta              Merge Textmeta Objects
plotArea                   Plotting topics over time as stacked areas below plotted...
plotFreq                   Plotting Counts of specified Wordgroups over Time (relative...
plotHeat                   Plotting Topics over Time relative to Corpus
plotScot                   Plots Counts of Documents or Words over Time (relative to...
plotTopic                  Plotting Counts of Topics over Time (Relative to Corpus)
plotTopicWord              Plotting Counts of Topics-Words-Combination over Time...
plotWordpt                 Plots Counts of Topics-Words-Combination over Time (Relative...
plotWordSub                Plotting Counts/Proportion of Words/Docs in LDA-generated...
readHBWiWo                 Read the HB WiWo Corpus
readJFArchiv               Read the Corpus as CSV
readNexis                  Read preprocessed files from Lexis Nexis
readNexisOnline            Read preprocessed files from Nexis Online
readSPIEGEL                Read the SPIEGEL Corpus
readSZ                     Read the SZ corpus
readTextmeta               Read Corpora as CSV
readWiki                   Read Pages from Wikipedia
readWikinews               Read files from Wikinews
removeXML                  Removes XML/HTML Tags and Umlauts
showMeta                   Export Readable Meta-Data of Articles.
showTexts                  Exports Readable Text Lists
topicsInText               Coloring the words of a text corresponding to topic...
topTexts                   Get The IDs Of The Most Representative Texts
wikinews                   The wikinews dataset
Docma-TU/tmT documentation built on May 17, 2018, 10:16 a.m.