space_cjk: Add Spaces Around CJK Ideographs

View source: R/space.R

space_cjk    R Documentation

Add Spaces Around CJK Ideographs

Description

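Adds spaces around Chinese, Japanese, and Korean (CJK) ideographs. This is a convenient preprocessing step for tokenization, because a tokenizer that splits on whitespace can then treat each ideograph as a separate token.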

Usage

space_cjk(text)

Arguments

text

A character vector to clean.

Value

A character vector of the same length as the input text, with spaces added around each ideograph.

Examples

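# Build one string from nine consecutive ideographs (U+3400 to U+3408).
to_space <- intToUtf8(13312:13320)
to_space
# Add spaces around each ideograph.
space_cjk(to_space)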
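
To illustrate why the added spaces help with tokenization, here is a minimal base R sketch of the same idea. It is not the piecemaker implementation: space_han_sketch is a hypothetical helper, and it assumes that matching the Unicode Han script (\p{Han}) is an acceptable stand-in for "CJK ideograph". Kana and Hangul are not in the Han script, so this sketch leaves them untouched.

```r
# Hypothetical helper for illustration only; not the piecemaker source.
space_han_sketch <- function(text) {
  # \p{Han} matches any character in the Unicode Han script.
  # perl = TRUE is needed for Unicode property classes in base R.
  gsub("(\\p{Han})", " \\1 ", text, perl = TRUE)
}

# Same input as the example above: U+3400 to U+3408.
ideographs <- intToUtf8(13312:13320)
spaced <- space_han_sketch(ideographs)

# After spacing, splitting on runs of spaces gives one token per ideograph.
tokens <- strsplit(trimws(spaced), " +")[[1]]
length(tokens)

# Text without Han characters passes through unchanged.
space_han_sketch("hello")
```

Because every match is wrapped in a space on both sides, adjacent ideographs end up separated by two spaces, which is why the split uses " +" rather than a single space.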

piecemaker documentation built on June 7, 2023, 5:55 p.m.