utf8_normalize: Text Normalization

In utf8: Unicode Text Processing

View source: R/utf8.R

utf8_normalize

R Documentation

Text Normalization

Description

Transform text to normalized form, optionally mapping to lowercase and applying compatibility maps.

Usage

utf8_normalize(
  x,
  ...,
  map_case = FALSE,
  map_compat = FALSE,
  map_quote = FALSE,
  remove_ignorable = FALSE
)

Arguments

`x`	character object.
`...`	These dots are for future extensions and must be empty.
`map_case`	a logical value indicating whether to apply Unicode case mapping to the text. For most languages, this transformation changes uppercase characters to their lowercase equivalents.
`map_compat`	a logical value indicating whether to apply the Unicode compatibility mappings to the characters, that is, the mappings required for the NFKC and NFKD normal forms.
`map_quote`	a logical value indicating whether to replace curly single quotes and Unicode apostrophe characters with ASCII apostrophe (U+0027).
`remove_ignorable`	a logical value indicating whether to remove Unicode "default ignorable" characters like zero-width spaces and soft hyphens.
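The following is a minimal sketch of each optional map switched on by itself. The sample strings are illustrative choices rather than examples from the package, and the calls assume the utf8 package is installed.

```r
library(utf8)

# map_case: apply case mapping; ASCII uppercase letters become lowercase
utf8_normalize("Hello World", map_case = TRUE)

# map_compat: apply compatibility mappings; the "fi" ligature (U+FB01)
# is replaced by the two letters "f" and "i"
utf8_normalize("\ufb01le", map_compat = TRUE)

# map_quote: the right single quotation mark (U+2019) is replaced by
# the ASCII apostrophe (U+0027)
utf8_normalize("don\u2019t", map_quote = TRUE)

# remove_ignorable: the soft hyphen (U+00AD) is removed
utf8_normalize("co\u00adoperate", remove_ignorable = TRUE)
```

All four arguments default to FALSE, so none of these maps is applied unless requested.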

Details

utf8_normalize() converts the elements of a character object to Unicode normalized composed form (NFC) while applying the character maps specified by the map_case, map_compat, map_quote, and remove_ignorable arguments.
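As a small illustration of the composed form, the sketch below normalizes a decomposed sequence and counts characters before and after. The variable names are hypothetical, and the utf8 package is assumed to be installed.

```r
library(utf8)

# "A" (U+0041) followed by a combining ring above (U+030A)
decomposed <- "\u0041\u030a"
composed <- utf8_normalize(decomposed)

nchar(decomposed)            # number of characters before normalization
nchar(composed)              # number of characters after composition
composed == "\u00c5"         # compare with the precomposed character U+00C5

# With the default arguments no compatibility mapping is applied,
# so a ligature such as U+FB01 is left as it is
utf8_normalize("\ufb01") == "\ufb01"
```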

Value

The result is a character object with the same attributes as x but with Encoding set to "UTF-8".
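A brief sketch of what "same attributes" means in practice, using names as the attribute. The element names are hypothetical, and the utf8 package is assumed to be installed.

```r
library(utf8)

# A named character vector with two non-ASCII elements
x <- c(first = "\u0041\u030a", second = "\u212b")
y <- utf8_normalize(x)

names(y)       # names are carried over from x
Encoding(y)    # declared encoding of each element of the result
```

Base R's Encoding() reports "unknown" for strings made up only of ASCII characters, so the "UTF-8" mark is only visible on elements that contain non-ASCII text.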

Examples


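library(utf8)
# A-with-ring three ways: precomposed, "A" + combining ring, and the angstrom sign
angstrom <- c("\u00c5", "\u0041\u030a", "\u212b")
utf8_normalize(angstrom) == "\u00c5"  # compare each normalized element with U+00C5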

utf8 documentation built on June 8, 2025, 9:31 p.m.
