inst/tag/shiny/help/data/transform.md

Basic Transformations

This sub-tab offers basic text preprocessing transformations. For stopword removal and more general word-list exclusion, see the 'Filter' sub-tab of the 'Data' tab.

Make Lowercase

Makes the text uniformly lowercase.

Remove Punctuation

Removes all punctuation from the text.
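
As a rough illustration of what punctuation removal does to a string, here is a minimal Python sketch. It is not the code TAG runs; the function name `remove_punctuation` is hypothetical, and it assumes "punctuation" means the ASCII punctuation characters only.

```python
import string


def remove_punctuation(text: str) -> str:
    # Assumption: only the ASCII characters in string.punctuation are dropped;
    # typographic quotes or other Unicode punctuation would be left in place.
    return text.translate(str.maketrans("", "", string.punctuation))


print(remove_punctuation("Hello, world! It's 9:30."))  # Hello world Its 930
```

Note that apostrophes and the colon inside a time are removed as well, so contractions and times are fused into single tokens.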

Remove Numbers

Removes all numbers from the text. Only the digits 0-9 count as numbers; Roman numerals, for example, are left in place.
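
The digit-only behaviour can be sketched in Python as follows. This is an illustrative example under the assumption that exactly the characters 0-9 are deleted; the function name `remove_numbers` is hypothetical and is not taken from TAG.

```python
import re


def remove_numbers(text: str) -> str:
    # Delete the ASCII digits 0-9 only; letters that form Roman numerals
    # (I, V, X, ...) are ordinary characters and are left untouched.
    return re.sub(r"[0-9]", "", text)


print(remove_numbers("Chapter 12 of Volume IV"))  # Chapter  of Volume IV
```

Deleting "12" leaves two adjacent spaces behind, which is one reason whitespace cleanup is commonly applied after this step.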

Remove Extra Whitespace

Runs of consecutive whitespace characters are collapsed into a single blank space.
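
A minimal Python sketch of this kind of whitespace collapsing is shown here. It is only an approximation of the behaviour described: the function name `remove_extra_whitespace` is hypothetical, and it assumes that tabs and newlines count as whitespace, so a lone tab or newline also becomes a blank space.

```python
import re


def remove_extra_whitespace(text: str) -> str:
    # Replace every run of one or more whitespace characters
    # (spaces, tabs, newlines) with a single blank space.
    return re.sub(r"\s+", " ", text)


print(remove_extra_whitespace("too   many\t\tgaps\n\nhere"))  # too many gaps here
```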

Stem

Stems the text. 'Stemming' is the process of transforming a derived word to its base 'stem'. For example, stemming the words 'stems', 'stemmed', 'stemming', and so on, would reduce each of them to the word 'stem'.
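
To make the idea concrete, the following Python sketch reduces a few inflected forms of 'stem' to the same base. It is a deliberately crude toy, not the stemming algorithm TAG uses: the function `toy_stem` and its three-suffix rule set are hypothetical, and real stemmers (the Porter stemmer, for example) apply far more rules.

```python
def toy_stem(word: str) -> str:
    # Toy rule set (an assumption for illustration): strip one common suffix,
    # keeping at least three characters of the word.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            # Undo a doubled final consonant left behind, e.g. "stemm" -> "stem".
            if word[-1] == word[-2] and word[-1] not in "aeiou":
                word = word[:-1]
            break
    return word


for w in ["stem", "stems", "stemmed", "stemming"]:
    print(w, "->", toy_stem(w))
```

All four inputs map to 'stem'. A toy like this will mangle many other words, which is why production text pipelines rely on an established stemming library instead.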



XSEDEScienceGateways/TAG documentation built on May 9, 2019, 11:05 p.m.