puny_encode: Encode Unicode domains to ASCII punycode

View source: R/punycoder.R

puny_encode    R Documentation

Encode Unicode domains to ASCII punycode

Description

Converts Unicode domain names to their ASCII punycode representation, as specified in RFC 3492. This is useful when processing internationalized domain names (IDNs), for example in web scraping and URL analysis.
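To make the conversion concrete, the following is a minimal base R sketch of the RFC 3492 encoding procedure applied label by label. It is not punycoder's implementation: the names puny_label_sketch and puny_domain_sketch are hypothetical, and the sketch assumes the input is already normalized, skipping the IDNA mapping, length limits and overflow checks a real encoder needs.

```r
# Illustrative RFC 3492 encoder for one label (hypothetical helper, not package code)
puny_label_sketch <- function(label) {
  base <- 36L; tmin <- 1L; tmax <- 26L; skew <- 38L; damp <- 700L
  digits <- c(letters, 0:9)                  # digit values 0..35 map to a-z, 0-9

  # Bias adaptation, RFC 3492 section 6.1
  adapt <- function(delta, numpoints, first) {
    delta <- if (first) delta %/% damp else delta %/% 2L
    delta <- delta + delta %/% numpoints
    k <- 0L
    while (delta > ((base - tmin) * tmax) %/% 2L) {
      delta <- delta %/% (base - tmin)
      k <- k + base
    }
    k + ((base - tmin + 1L) * delta) %/% (delta + skew)
  }

  cps <- utf8ToInt(enc2utf8(label))
  basic <- cps[cps < 128L]
  if (length(basic) == length(cps)) return(label)   # pure ASCII: leave unchanged

  # Basic code points are copied first, followed by a delimiter if any exist
  out <- if (length(basic) > 0L) c(intToUtf8(basic), "-") else character(0)
  n <- 128L; delta <- 0L; bias <- 72L
  h <- b <- length(basic)

  while (h < length(cps)) {
    m <- min(cps[cps >= n])                  # next code point still to be encoded
    delta <- delta + (m - n) * (h + 1L)
    n <- m
    for (cp in cps) {
      if (cp < n) delta <- delta + 1L
      if (cp == n) {
        # Emit delta as a generalized variable-length integer
        q <- delta
        k <- base
        repeat {
          thr <- if (k <= bias) tmin else if (k >= bias + tmax) tmax else k - bias
          if (q < thr) break
          out <- c(out, digits[thr + (q - thr) %% (base - thr) + 1L])
          q <- (q - thr) %/% (base - thr)
          k <- k + base
        }
        out <- c(out, digits[q + 1L])
        bias <- adapt(delta, h + 1L, h == b)
        delta <- 0L
        h <- h + 1L
      }
    }
    delta <- delta + 1L
    n <- n + 1L
  }
  paste0(c("xn--", out), collapse = "")     # ACE prefix used for IDN labels
}

# Encode each dot-separated label of a domain independently
puny_domain_sketch <- function(domain) {
  labels <- strsplit(domain, ".", fixed = TRUE)[[1]]
  paste(vapply(labels, puny_label_sketch, character(1)), collapse = ".")
}

puny_domain_sketch("caf\u00E9.com")  # "xn--caf-dma.com"
```

ASCII-only labels such as "com" pass through untouched, which is why only the first label changes in the call above.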

Usage

puny_encode(x, strict = getOption("punycoder.strict", TRUE))

Arguments

x

Character vector of Unicode domain names to encode

strict

Logical; whether to apply strict validation. Defaults to getOption("punycoder.strict", TRUE), that is, TRUE unless the punycoder.strict option has been set.
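The default above relies on base R's option mechanism, so it can be changed for a whole session without passing strict on every call. The sketch below uses only options() and getOption() from base R to show how the default resolves. It does not call puny_encode itself, and the variable names are illustrative.

```r
# Unset the option, keeping the previous setting so it can be restored later
old <- options(punycoder.strict = NULL)

# With the option unset, getOption() falls back to its second argument
unset_value <- getOption("punycoder.strict", TRUE)    # TRUE

# A session-wide override changes what the default resolves to
options(punycoder.strict = FALSE)
session_value <- getOption("punycoder.strict", TRUE)  # FALSE

# Restore whatever was set before
options(old)
```

Because R evaluates default arguments at call time, a change made with options() affects subsequent calls that omit strict. An explicit strict argument takes precedence over the option.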

Value

A character vector of the same length as x, with each element containing the ASCII punycode-encoded domain name. Elements corresponding to NA inputs are NA_character_. In non-strict mode, domains that fail encoding are also returned as NA_character_.
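Since both missing inputs and non-strict encoding failures come back as NA_character_, the result alone cannot tell them apart. One way to separate the two cases is to compare against the input, as in the sketch below. The encoded vector is a hand-written stand-in for a non-strict puny_encode() result, and the third input is only assumed to have failed for the sake of illustration.

```r
input <- c("caf\u00E9.com", NA, "some-domain-that-failed")

# Hypothetical result, written by hand rather than produced by the package
encoded <- c("xn--caf-dma.com", NA_character_, NA_character_)

# NA in the output but not in the input indicates an encoding failure
failed <- is.na(encoded) & !is.na(input)
missing_input <- is.na(input)

input[failed]  # the inputs that could not be encoded
```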

See Also

puny_decode for the reverse operation, url_encode for full URL encoding.

Examples


# Basic encoding
puny_encode("caf\u00E9.com")
puny_encode("\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444")

# Vectorized encoding
domains <- c(
  "caf\u00E9.com",
  "\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444",
  "\u5317\u4EAC.\u4E2D\u56FD"
)
puny_encode(domains)


punycoder documentation built on June 16, 2026, 9:07 a.m.