parse_url: Parse URLs with internationalized domain name handling

View source: R/url-utils.R

parse_url R Documentation

Description

Parses URLs into their components and returns them as a structured list, with support for internationalized domain names (IDNs). Domain components are available in both Unicode and ASCII form; see encode_domains.

Usage

parse_url(url, encode_domains = FALSE)

Arguments

url

Character vector of URLs to parse.

encode_domains

Logical flag; if TRUE, parsed host names are encoded to ASCII (Punycode) form. Defaults to FALSE.

Value

An object of class "punycoder_parsed_url" (a named list) with components:

scheme

Character vector of URL schemes (e.g., "https").

domain

Character vector of domain names.

port

Integer vector of port numbers.

path

Character vector of URL paths.

query

Character vector of query strings.

fragment

Character vector of fragment identifiers.

Each component has one element per input URL. Invalid URLs yield NA components. For valid URLs without an explicit path, path is returned as "".

See Also

url_encode, url_decode for URL transformation with IDN handling.

Examples


# Parse URL with Unicode domain
parse_url(
  "https://caf\u00E9.example.com:8080/path?query=value#fragment"
)

# Parse multiple URLs
urls <- c(
  "https://caf\u00E9.com/menu",
  "https://\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444/info"
)
parse_url(urls)
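
# Encode host names to ASCII while parsing.
# A minimal sketch based only on the encode_domains argument and the
# components documented above; `idn_urls` and `parsed` are illustrative
# names, and the printed output is not shown here.
idn_urls <- c(
  "https://caf\u00E9.com/menu",
  "https://\u043C\u043E\u0441\u043A\u0432\u0430.\u0440\u0444/info"
)
parsed <- parse_url(idn_urls, encode_domains = TRUE)

# Components are parallel vectors with one element per input URL,
# so they can be read by name
parsed$domain
parsed$path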


punycoder documentation built on June 16, 2026, 9:07 a.m.