| puny_encode | R Documentation |
Converts Unicode domain names to their ASCII punycode representation as specified in RFC 3492. Use it when processing internationalized domain names (IDNs), for example in web scraping and URL analysis.
puny_encode(x, strict = getOption("punycoder.strict", TRUE))
x |
Character vector of Unicode domain names to encode |
strict |
Logical; whether to apply strict validation. Defaults to 'getOption("punycoder.strict", TRUE)'. |
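Because the default for strict is read from a session option when the function is called, it can be changed globally with options(). The base-R sketch below shows how such a default resolves. resolve_strict() is a hypothetical stand-in that only mirrors the default argument from the usage line; it is not part of the package.

```r
# Hypothetical stand-in mirroring the default in the usage line;
# it is not part of the package and only returns the resolved flag.
resolve_strict <- function(strict = getOption("punycoder.strict", TRUE)) {
  strict
}

old <- options(punycoder.strict = NULL)    # start with the option unset
default_unset <- resolve_strict()          # falls back to TRUE

options(punycoder.strict = FALSE)          # set a session-wide default
default_set <- resolve_strict()            # now resolves to FALSE

explicit <- resolve_strict(strict = TRUE)  # an explicit argument overrides the option

options(old)                               # restore the previous setting
```

An explicit strict argument always takes precedence over the option, so individual calls can opt in or out regardless of the session default.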
A character vector the same length as x, with each element
containing the ASCII punycode-encoded domain name. Elements corresponding
to NA inputs are NA_character_. In non-strict mode, domains
that fail encoding are also returned as NA_character_.
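Since NA inputs and non-strict encoding failures both come back as NA_character_, comparing the result against the input is one way to tell them apart. The sketch below uses a hand-written stand-in result vector rather than a real call, and it assumes for illustration that the third input fails to encode.

```r
# Illustrative inputs. `encoded` is a stand-in for what
# puny_encode(domains, strict = FALSE) might return; treating the third
# element as an encoding failure is an assumption made for this example.
domains <- c("caf\u00E9.com", NA, "bad..domain")
encoded <- c("xn--caf-dma.com", NA_character_, NA_character_)

missing_in <- is.na(domains)                 # NA because the input was NA
failed <- is.na(encoded) & !is.na(domains)   # NA because encoding failed

domains[failed]                              # inputs worth inspecting by hand
```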
See also puny_decode for the reverse operation and
url_encode for full URL encoding.
# Basic encoding
puny_encode("caf\u00E9.com")
puny_encode("\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444")
# Vectorized encoding
domains <- c(
"caf\u00E9.com",
"\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444",
"\u5317\u4EAC.\u4E2D\u56FD"
)
puny_encode(domains)