tm: Text Mining Package

A framework for text mining applications within R.

Author
Ingo Feinerer [aut, cre], Kurt Hornik [aut], Artifex Software, Inc. [ctb, cph] (pdf_info.ps taken from GPL Ghostscript)
Date of publication
2016-11-02 15:41:49
Maintainer
Ingo Feinerer <feinerer@logic.at>
License
GPL-3
Version
0.7
URLs

View on R-Forge

Man pages

acq
50 Exemplary News Articles from the Reuters-21578 Data Set of...
combine
Combine Corpora, Documents, Term-Document Matrices, and Term...
content_transformer
Content Transformers
Corpus
Corpora
crude
20 Exemplary News Articles from the Reuters-21578 Data Set of...
DataframeSource
Data Frame Source
DirSource
Directory Source
Docs
Access Document IDs and Terms
findAssocs
Find Associations in a Term-Document Matrix
findFreqTerms
Find Frequent Terms
foreign
Read Document-Term Matrices
getTokenizers
Tokenizers
getTransformations
Transformations
inspect
Inspect Objects
matrix
Term-Document Matrix
meta
Metadata Management
PCorpus
Permanent Corpora
PlainTextDocument
Plain Text Documents
plot
Visualize a Term-Document Matrix
readDOC
Read In a MS Word Document
Reader
Readers
readPDF
Read In a PDF Document
readPlain
Read In a Text Document
readRCV1
Read In a Reuters Corpus Volume 1 Document
readReut21578XML
Read In a Reuters-21578 XML Document
readTabular
Read In a Text Document
readTagged
Read In a POS-Tagged Word Text Document
readXML
Read In an XML Document
removeNumbers
Remove Numbers from a Text Document
removePunctuation
Remove Punctuation Marks from a Text Document
removeSparseTerms
Remove Sparse Terms from a Term-Document Matrix
removeWords
Remove Words from a Text Document
SimpleCorpus
Simple Corpora
Source
Sources
stemCompletion
Complete Stems
stemDocument
Stem Words
stopwords
Stopwords
stripWhitespace
Strip Whitespace from a Text Document
termFreq
Term Frequency Vector
TextDocument
Text Documents
tm_filter
Filter and Index Functions on Corpora
tm_map
Transformations on Corpora
tm_reduce
Combine Transformations
tm_term_score
Compute Score for Matching Terms
tokenizer
Tokenizers
URISource
Uniform Resource Identifier Source
VCorpus
Volatile Corpora
VectorSource
Vector Source
weightBin
Weight Binary
WeightFunction
Weighting Function
weightSMART
SMART Weightings
weightTf
Weight by Term Frequency
weightTfIdf
Weight by Term Frequency - Inverse Document Frequency
writeCorpus
Write a Corpus to Disk
XMLSource
XML Source
XMLTextDocument
XML Text Documents
Zipf_n_Heaps
Explore Corpus Term Frequency Characteristics
ZipSource
ZIP File Source
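
The man pages listed above fit together into a typical preprocessing pipeline. The following is a minimal sketch, not an excerpt from the package documentation: the `texts` vector is hypothetical toy input, and the order of the cleaning steps is one reasonable choice among several. It builds a volatile corpus from a character vector, applies a few of the listed transformations, and then looks up frequent terms in the resulting document-term matrix.

```r
library(tm)

# Hypothetical toy input for illustration; not shipped with the package.
texts <- c(
  "Crude oil prices rose sharply on Monday.",
  "The company agreed to acquire its rival.",
  "Oil output fell as prices dropped."
)

# Build an in-memory (volatile) corpus from a character vector.
corpus <- VCorpus(VectorSource(texts))

# Apply transformations. tolower is a plain R function, so it is wrapped
# with content_transformer() to keep the documents as text documents.
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Documents become rows and terms become columns.
dtm <- DocumentTermMatrix(corpus)

# Terms that occur at least twice across the whole corpus.
frequent <- findFreqTerms(dtm, lowfreq = 2)
print(frequent)
```

Lowercasing comes before `removeWords` here because the English stopword list is lowercase, so a capitalized "The" would otherwise be left in place. `inspect(dtm)` can be used to view the matrix itself.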

Files in this package

tm/DESCRIPTION
tm/NAMESPACE
tm/R
tm/R/RcppExports.R
tm/R/complete.R
tm/R/corpus.R
tm/R/doc.R
tm/R/filter.R
tm/R/foreign.R
tm/R/matrix.R
tm/R/meta.R
tm/R/pdftools.R
tm/R/plot.R
tm/R/reader.R
tm/R/score.R
tm/R/source.R
tm/R/stopwords.R
tm/R/tokenizer.R
tm/R/transform.R
tm/R/utils.R
tm/R/weight.R
tm/build
tm/build/vignette.rds
tm/data
tm/data/acq.rda
tm/data/crude.rda
tm/inst
tm/inst/CITATION
tm/inst/NEWS.Rd
tm/inst/doc
tm/inst/doc/extensions.R
tm/inst/doc/extensions.Rnw
tm/inst/doc/extensions.pdf
tm/inst/doc/tm.R
tm/inst/doc/tm.Rnw
tm/inst/doc/tm.pdf
tm/inst/ghostscript
tm/inst/ghostscript/pdf_info.ps
tm/inst/stopwords
tm/inst/stopwords/SMART.dat
tm/inst/stopwords/catalan.dat
tm/inst/stopwords/danish.dat
tm/inst/stopwords/dutch.dat
tm/inst/stopwords/english.dat
tm/inst/stopwords/finnish.dat
tm/inst/stopwords/french.dat
tm/inst/stopwords/german.dat
tm/inst/stopwords/hungarian.dat
tm/inst/stopwords/italian.dat
tm/inst/stopwords/norwegian.dat
tm/inst/stopwords/portuguese.dat
tm/inst/stopwords/romanian.dat
tm/inst/stopwords/russian.dat
tm/inst/stopwords/spanish.dat
tm/inst/stopwords/swedish.dat
tm/inst/texts
tm/inst/texts/acq
tm/inst/texts/acq/reut-00001.xml
tm/inst/texts/acq/reut-00002.xml
tm/inst/texts/acq/reut-00003.xml
tm/inst/texts/acq/reut-00004.xml
tm/inst/texts/acq/reut-00005.xml
tm/inst/texts/acq/reut-00006.xml
tm/inst/texts/acq/reut-00007.xml
tm/inst/texts/acq/reut-00008.xml
tm/inst/texts/acq/reut-00009.xml
tm/inst/texts/acq/reut-00010.xml
tm/inst/texts/acq/reut-00011.xml
tm/inst/texts/acq/reut-00012.xml
tm/inst/texts/acq/reut-00013.xml
tm/inst/texts/acq/reut-00014.xml
tm/inst/texts/acq/reut-00015.xml
tm/inst/texts/acq/reut-00016.xml
tm/inst/texts/acq/reut-00017.xml
tm/inst/texts/acq/reut-00018.xml
tm/inst/texts/acq/reut-00020.xml
tm/inst/texts/acq/reut-00021.xml
tm/inst/texts/acq/reut-00022.xml
tm/inst/texts/acq/reut-00023.xml
tm/inst/texts/acq/reut-00024.xml
tm/inst/texts/acq/reut-00025.xml
tm/inst/texts/acq/reut-00026.xml
tm/inst/texts/acq/reut-00027.xml
tm/inst/texts/acq/reut-00028.xml
tm/inst/texts/acq/reut-00029.xml
tm/inst/texts/acq/reut-00030.xml
tm/inst/texts/acq/reut-00031.xml
tm/inst/texts/acq/reut-00032.xml
tm/inst/texts/acq/reut-00034.xml
tm/inst/texts/acq/reut-00035.xml
tm/inst/texts/acq/reut-00036.xml
tm/inst/texts/acq/reut-00039.xml
tm/inst/texts/acq/reut-00040.xml
tm/inst/texts/acq/reut-00042.xml
tm/inst/texts/acq/reut-00043.xml
tm/inst/texts/acq/reut-00045.xml
tm/inst/texts/acq/reut-00046.xml
tm/inst/texts/acq/reut-00047.xml
tm/inst/texts/acq/reut-00048.xml
tm/inst/texts/acq/reut-00049.xml
tm/inst/texts/acq/reut-00050.xml
tm/inst/texts/acq/reut-00051.xml
tm/inst/texts/acq/reut-00052.xml
tm/inst/texts/acq/reut-00053.xml
tm/inst/texts/acq/reut-00054.xml
tm/inst/texts/acq/reut-00055.xml
tm/inst/texts/acq/reut-00056.xml
tm/inst/texts/crude
tm/inst/texts/crude/reut-00001.xml
tm/inst/texts/crude/reut-00002.xml
tm/inst/texts/crude/reut-00004.xml
tm/inst/texts/crude/reut-00005.xml
tm/inst/texts/crude/reut-00006.xml
tm/inst/texts/crude/reut-00007.xml
tm/inst/texts/crude/reut-00008.xml
tm/inst/texts/crude/reut-00009.xml
tm/inst/texts/crude/reut-00010.xml
tm/inst/texts/crude/reut-00011.xml
tm/inst/texts/crude/reut-00012.xml
tm/inst/texts/crude/reut-00013.xml
tm/inst/texts/crude/reut-00014.xml
tm/inst/texts/crude/reut-00015.xml
tm/inst/texts/crude/reut-00016.xml
tm/inst/texts/crude/reut-00018.xml
tm/inst/texts/crude/reut-00019.xml
tm/inst/texts/crude/reut-00021.xml
tm/inst/texts/crude/reut-00022.xml
tm/inst/texts/crude/reut-00023.xml
tm/inst/texts/custom.xml
tm/inst/texts/loremipsum.txt
tm/inst/texts/rcv1_2330.xml
tm/inst/texts/reuters-21578.xml
tm/inst/texts/txt
tm/inst/texts/txt/ovid_1.txt
tm/inst/texts/txt/ovid_2.txt
tm/inst/texts/txt/ovid_3.txt
tm/inst/texts/txt/ovid_4.txt
tm/inst/texts/txt/ovid_5.txt
tm/man
tm/man/Corpus.Rd
tm/man/DataframeSource.Rd
tm/man/DirSource.Rd
tm/man/Docs.Rd
tm/man/PCorpus.Rd
tm/man/PlainTextDocument.Rd
tm/man/Reader.Rd
tm/man/SimpleCorpus.Rd
tm/man/Source.Rd
tm/man/TextDocument.Rd
tm/man/URISource.Rd
tm/man/VCorpus.Rd
tm/man/VectorSource.Rd
tm/man/WeightFunction.Rd
tm/man/XMLSource.Rd
tm/man/XMLTextDocument.Rd
tm/man/ZipSource.Rd
tm/man/Zipf_n_Heaps.Rd
tm/man/acq.Rd
tm/man/combine.Rd
tm/man/content_transformer.Rd
tm/man/crude.Rd
tm/man/findAssocs.Rd
tm/man/findFreqTerms.Rd
tm/man/foreign.Rd
tm/man/getTokenizers.Rd
tm/man/getTransformations.Rd
tm/man/inspect.Rd
tm/man/matrix.Rd
tm/man/meta.Rd
tm/man/plot.Rd
tm/man/readDOC.Rd
tm/man/readPDF.Rd
tm/man/readPlain.Rd
tm/man/readRCV1.Rd
tm/man/readReut21578XML.Rd
tm/man/readTabular.Rd
tm/man/readTagged.Rd
tm/man/readXML.Rd
tm/man/removeNumbers.Rd
tm/man/removePunctuation.Rd
tm/man/removeSparseTerms.Rd
tm/man/removeWords.Rd
tm/man/stemCompletion.Rd
tm/man/stemDocument.Rd
tm/man/stopwords.Rd
tm/man/stripWhitespace.Rd
tm/man/termFreq.Rd
tm/man/tm_filter.Rd
tm/man/tm_map.Rd
tm/man/tm_reduce.Rd
tm/man/tm_term_score.Rd
tm/man/tokenizer.Rd
tm/man/weightBin.Rd
tm/man/weightSMART.Rd
tm/man/weightTf.Rd
tm/man/weightTfIdf.Rd
tm/man/writeCorpus.Rd
tm/src
tm/src/RcppExports.cpp
tm/src/copy.c
tm/src/tdm.cpp
tm/vignettes
tm/vignettes/extensions.Rnw
tm/vignettes/references.bib
tm/vignettes/tm.Rnw