white: Strip White Spaces

View source: R/white.R

white R Documentation

Strip White Spaces

Description

The function white() collapses every run of consecutive whitespace characters into a single space. By default it identifies whitespace with the pattern \s+, where \s is a shortcut for [[:space:]], i.e. tab, newline, vertical tab, form feed, carriage return, space and possibly other locale-dependent characters. The pattern argument overrides the regular expression used to identify whitespace.
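The effect of the default pattern can be approximated in base R with gsub(). This is a minimal sketch, not the package's implementation, and the helper name collapse_ws is hypothetical.

```r
# Hypothetical helper, for illustration only: not part of the package.
# Replaces every run of characters matched by `pattern` with one space.
collapse_ws <- function(x, pattern = "\\s+") {
  gsub(pattern, " ", x)
}

x <- "one \t two\n\nthree"
collapse_ws(x)

# \s and [[:space:]] select the same characters for this input
identical(collapse_ws(x), collapse_ws(x, pattern = "[[:space:]]+"))
```

Note that the negated class [^[:space:]] matches the opposite set, i.e. every character that is not whitespace.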

Usage

white(corpus, ..., pattern = "\\s+")

## S3 method for class 'list'
white(corpus, ..., pattern = NULL)

## S3 method for class 'character'
white(corpus, ..., pattern = NULL)

## S3 method for class 'VCorpus'
white(corpus, ..., pattern = NULL)

## Default S3 method:
white(corpus, ..., pattern = NULL)

Arguments

corpus

A compatible object storing documents: currently a list (including a corpus-list of tokenized documents), a character vector, or a VCorpus.

...

Other parameters.

pattern

(chr) A regular expression used to detect whitespace. If NULL (default), \s+ is used.
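The NULL fallback and the effect of a custom pattern could look roughly like the sketch below. It is written in base R under the assumption that matches are replaced by a single space. The helper collapse_with is hypothetical and is not the package's code.

```r
# Hypothetical helper, for illustration only: not the package's code.
# Falls back to "\\s+" when `pattern` is NULL, as described for `pattern`.
collapse_with <- function(x, pattern = NULL) {
  if (is.null(pattern)) pattern <- "\\s+"
  gsub(pattern, " ", x)
}

x <- "a  b\n\nc"
collapse_with(x)                  # default: newlines are collapsed as well
collapse_with(x, pattern = " +")  # custom: only runs of plain spaces
```

A narrower pattern such as " +" is useful when line breaks carry meaning and should survive the cleaning step.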

Value

An object of the same class as the input, with the documents written with trimmed whitespace.

Examples

data(liu_corpus)

white(c(' one  two   three    '))
white(liu_corpus)

UBESP-DCTV/costumer documentation built on Feb. 1, 2023, 4:52 a.m.