textTinyR: Text Processing for Small or Big Data Files
Version 1.0.8

Processes large text files efficiently in batches. To this end, it offers functions for splitting, parsing, and tokenizing text and for creating a vocabulary. It also includes functions for building either a document-term matrix or a term-document matrix and for extracting information from them (term associations, most frequent terms). Finally, it includes functions for calculating token statistics (collocations, look-up tables, string dissimilarities) and for working with sparse matrices. The source code is written in 'C++11' and exported to R through the 'Rcpp', 'RcppArmadillo' and 'BH' packages.
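
To make the terms in the description concrete, here is a minimal base-R sketch of tokenizing a few documents, building a vocabulary, and forming a document-term matrix. It deliberately does not call any textTinyR function; the toy corpus and every variable name are assumptions made for illustration, and the package itself does this work in C++ with sparse matrices and batch processing.

```r
# Toy corpus (hypothetical data, for illustration only)
docs <- c("the cat sat", "the dog sat", "the cat ran")

# Tokenize: lower-case each document and split on whitespace
tokens <- strsplit(tolower(docs), "\\s+")

# Vocabulary: the sorted set of unique terms across all documents
vocab <- sort(unique(unlist(tokens)))

# Document-term matrix: one row per document, one column per term,
# each cell holding the count of that term in that document
dtm <- t(vapply(tokens,
                function(tk) as.integer(table(factor(tk, levels = vocab))),
                integer(length(vocab))))
dimnames(dtm) <- list(paste0("doc", seq_along(docs)), vocab)

# The term-document matrix is simply the transpose
tdm <- t(dtm)

# Most frequent terms: column sums of the document-term matrix
term_freq <- sort(colSums(dtm), decreasing = TRUE)
print(dtm)
print(term_freq)
```

A dense matrix like this is only practical for small inputs; for large vocabularies most cells are zero, which is why the package works with sparse matrices instead.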

Package details

Author: Lampros Mouselimis <[email protected]>
Date of publication: 2017-10-31 21:11:25 UTC
Maintainer: Lampros Mouselimis <[email protected]>
License: GPL-3
Version: 1.0.8
URL: https://github.com/mlampros/textTinyR
Package repository: CRAN
Installation: Install the latest version of this package by entering the following in R:
install.packages("textTinyR")

textTinyR documentation built on Nov. 17, 2017, 7:37 a.m.