sanitizeZenkaku: Sanitizing strings contaminated with fullwidth (zenkaku)...


Description

Sanitizes strings unintentionally contaminated with fullwidth (zenkaku) characters by converting those characters from fullwidth (zenkaku) to halfwidth (hankaku) forms.

Usage

sanitizeZenkaku(s)

Arguments

s

A character vector. UTF-8 encoding is preferable.

Details

Occasionally a character vector is unintentionally contaminated with fullwidth (zenkaku) characters. sanitizeZenkaku removes this contamination from the given character vector so that logical and factor vectors derived from it work properly. Japanese fullwidth (zenkaku) letters, numbers, and symbols are replaced with their halfwidth (ASCII) forms, while a fullwidth space is simply removed.
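The behaviour described above can be illustrated with a minimal base-R sketch. It is not the Nippon implementation: the helper name sketch_sanitize is hypothetical, and the sketch assumes that the fullwidth forms in question are the Unicode range U+FF01 to U+FF5E (which sits at a fixed offset of 0xFEE0 from ASCII) and that the fullwidth space is U+3000.

```r
# Hypothetical helper illustrating the idea; NOT the code used by sanitizeZenkaku.
sketch_sanitize <- function(s) {
  vapply(s, function(x) {
    if (is.na(x)) return(NA_character_)
    cp <- utf8ToInt(enc2utf8(x))          # code points of one string
    cp <- cp[cp != 0x3000L]               # drop the fullwidth (ideographic) space
    fw <- cp >= 0xFF01L & cp <= 0xFF5EL   # fullwidth letters, digits, symbols
    cp[fw] <- cp[fw] - 0xFEE0L            # shift to the matching ASCII code point
    intToUtf8(cp)
  }, character(1), USE.NAMES = FALSE)
}

# Why contamination matters: a fullwidth "1" creates a spurious factor level.
x <- c("1", intToUtf8(0xFF11), "1")
nlevels(factor(x))                   # 2 levels, although one value was intended
nlevels(factor(sketch_sanitize(x)))  # 1 level after conversion
```

The fixed offset works only because the fullwidth block mirrors ASCII in order; characters outside that block (for example kana) are left untouched by this sketch.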

Value

A character vector in which all letters, numbers, and symbols are in their halfwidth form.

Author(s)

Susumu Tanimura aruminat@gmail.com

See Also

zen2han

Examples

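# Build a test string: fullwidth digits U+FF11..U+FF13 followed by a fullwidth space (U+3000)
(n <- intToUtf8(c(65296 + 1:3, 12288)))
# Convert the digits to halfwidth and drop the fullwidth space
sanitizeZenkaku(n)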

Nippon documentation built on May 2, 2019, 1:03 p.m.