parse_url: Parse URLs with internationalized domain name handling
In punycoder: Unicode and Punycode Domain Name Processing

Description

Parses URLs into a structured list of components, with handling for internationalized domain names (IDNs). Host names can be returned as parsed or, when encode_domains = TRUE, encoded to their ASCII representation.

Usage

parse_url(url, encode_domains = FALSE)

Arguments

`url`	Character vector of URLs to parse.
`encode_domains`	Logical flag; if TRUE, parsed host names are encoded to ASCII. Defaults to FALSE.

Value

An object of class "punycoder_parsed_url" (a named list) with components:

scheme: Character vector of URL schemes (e.g., "https").
domain: Character vector of domain names.
port: Integer vector of port numbers.
path: Character vector of URL paths.
query: Character vector of query strings.
fragment: Character vector of fragment identifiers.

Each component has one element per input URL. Invalid URLs yield NA components. For valid URLs without an explicit path, path is returned as "".
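
Because the components are parallel vectors, element i of each component describes the i-th input URL. The following is a minimal sketch of reading them back, assuming the punycoder package is installed and that the returned list supports ordinary `$` access; the second URL is an illustrative input chosen to show the empty-path case.

```r
# Sketch only: runs when punycoder is available, otherwise does nothing.
if (requireNamespace("punycoder", quietly = TRUE)) {
  urls <- c(
    "https://caf\u00E9.example.com:8080/path?query=value#fragment",
    "https://example.com"  # illustrative URL with no explicit path
  )
  parsed <- punycoder::parse_url(urls)

  # Each component has one element per input URL, in input order.
  print(parsed$scheme)
  print(parsed$domain)
  print(parsed$port)

  # The documentation states that a valid URL without a path gives "".
  print(parsed$path[2])

  # Invalid URLs are documented to yield NA, so is.na() can flag them.
  print(is.na(parsed$domain))
}
```

Checking is.na() on a component such as domain is one way to separate inputs that failed to parse from those that succeeded before using the other fields.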

Examples


# Parse URL with Unicode domain
parse_url(
  "https://caf\u00E9.example.com:8080/path?query=value#fragment"
)

# Parse multiple URLs
urls <- c(
  "https://caf\u00E9.com/menu",
  "https://\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444/info"
)
parse_url(urls)
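
The encode_domains argument is not exercised above. The sketch below compares the domain component with and without it, assuming punycoder is installed; the exact strings returned depend on the package, so no output is shown.

```r
# Sketch only: runs when punycoder is available, otherwise does nothing.
if (requireNamespace("punycoder", quietly = TRUE)) {
  idn_urls <- c(
    "https://caf\u00E9.com/menu",
    "https://\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444/info"
  )

  # Default call: encode_domains = FALSE.
  as_parsed <- punycoder::parse_url(idn_urls)

  # Request ASCII-encoded host names.
  encoded <- punycoder::parse_url(idn_urls, encode_domains = TRUE)

  # Compare the two domain vectors side by side.
  print(as_parsed$domain)
  print(encoded$domain)
}
```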

punycoder documentation built on June 16, 2026, 9:07 a.m.