is_punctuation: Check whether 'char' is a punctuation character.

View source: R/tokenization.R

is_punctuation    R Documentation

Description

An R implementation of _is_punctuation from BERT's tokenization.py.

Usage

is_punctuation(char)

Arguments

char

A character scalar comprising a single Unicode character.

Details

All ASCII characters that are neither letters nor digits are treated as punctuation. Characters such as "^", "$", and "`" are not in the Unicode Punctuation class, but they are treated as punctuation anyway, for consistency.
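
The rule above can be sketched in base R as follows. This is an illustrative sketch, not the RBERT source: the function name is_punctuation_sketch is hypothetical, the ASCII code point ranges are the ones used by BERT's Python _is_punctuation (which exclude the space and control characters), and the Unicode check assumes that R's PCRE build supports Unicode properties.

```r
# Hypothetical re-implementation for illustration only; not the RBERT source.
is_punctuation_sketch <- function(char) {
  # Expect a character scalar holding exactly one character.
  stopifnot(is.character(char), length(char) == 1, nchar(char) == 1)

  # Code point of the single character.
  cp <- utf8ToInt(char)

  # ASCII code points that are neither letters, digits, space, nor
  # control characters (ranges taken from BERT's tokenization.py).
  ascii_punct <- c(33:47, 58:64, 91:96, 123:126)
  if (cp %in% ascii_punct) {
    return(TRUE)
  }

  # Otherwise fall back to the Unicode Punctuation class (categories P*).
  # Assumption: PCRE was built with Unicode property support.
  grepl("^\\p{P}$", char, perl = TRUE)
}

is_punctuation_sketch("^")  # TRUE (ASCII range 91-96)
is_punctuation_sketch("$")  # TRUE (ASCII range 33-47)
is_punctuation_sketch("a")  # FALSE
```

Checking the ASCII ranges first is what makes symbols such as "^" and "$" count as punctuation even though a Unicode class test alone would reject them.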

Value

TRUE if char is a punctuation character, FALSE otherwise.


jonathanbratt/RBERT documentation built on Jan. 26, 2023, 4:15 p.m.