clean.text: Clean text and get it ready for textreg.

Description

Converts each multiline document to a single line, strips extra whitespace and punctuation, and replaces every digit with an 'X'. Non-alphabetic characters are converted to spaces.
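To make the steps above concrete, here is a minimal base R sketch that applies them to a plain character string. The helper name `clean_string_sketch` is hypothetical and is not part of textreg; the order of operations and the regular expressions are assumptions based only on the description, so the actual `clean.text` may differ in details.

```r
# Hypothetical helper (not part of textreg): approximates the cleaning
# steps described above on a single character string.
clean_string_sketch <- function(x) {
  x <- gsub("[\r\n]+", " ", x)    # multiline -> single line
  x <- gsub("[0-9]", "X", x)      # each digit -> 'X'
  x <- gsub("[^A-Za-z]", " ", x)  # remaining non-alphabetic characters -> space
  x <- gsub(" +", " ", x)         # collapse runs of spaces
  trimws(x)                       # drop leading and trailing whitespace
}

clean_string_sketch("   big    space\n     dawg-ness?")
# returns "big space dawg ness"
```

Digits are replaced before the non-alphabetic pass so that they survive as 'X' rather than being turned into spaces.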

Usage

clean.text(bigcorp)

Arguments

bigcorp

A tm Corpus object.

Examples

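library(tm)
library(textreg)  # provides clean.text
# Two messy documents: stray punctuation, digits, and extra whitespace
txt <- c("thhis s! and bonkus  4:33pm and Jan 3, 2015. ",
         "   big    space\n     dawg-ness?")
# Wrap the character vector in a tm corpus, clean it, and inspect the first document
a <- clean.text(VCorpus(VectorSource(txt)))
a[[1]]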

