to_html: Convert Control Sequences to HTML Equivalents

View source: R/tohtml.R

to_htmlR Documentation

Convert Control Sequences to HTML Equivalents

Description

Interprets CSI SGR sequences and OSC hyperlinks to produce strings with the state reproduced with SPAN elements, inline CSS styles, and A anchors. Optionally for colors, the SPAN elements may be assigned classes instead of inline styles, in which case it is the user's responsibility to provide a style sheet. Input that contains special HTML characters ("<", ">", "&", "'", and "\"") likely should be escaped with html_esc, and to_html will warn if it encounters the first two.

Usage

to_html(
  x,
  warn = getOption("fansi.warn", TRUE),
  term.cap = getOption("fansi.term.cap", dflt_term_cap()),
  classes = FALSE,
  carry = getOption("fansi.carry", TRUE)
)

Arguments

x

a character vector or object that can be coerced to such.

warn

TRUE (default) or FALSE, whether to warn when potentially problematic Control Sequences are encountered. These could cause the assumptions fansi makes about how strings are rendered on your display to be incorrect, for example by moving the cursor (see ?fansi). At most one warning will be issued per element in each input vector. Will also warn about some badly encoded UTF-8 strings, but a lack of UTF-8 warnings is not a guarantee of correct encoding (use validUTF8 for that).

term.cap

character a vector of the capabilities of the terminal, can be any combination of "bright" (SGR codes 90-97, 100-107), "256" (SGR codes starting with "38;5" or "48;5"), "truecolor" (SGR codes starting with "38;2" or "48;2"), and "all". "all" behaves as it does for the ctl parameter: "all" combined with any other value means all terminal capabilities except that one. fansi will warn if it encounters SGR codes that exceed the terminal capabilities specified (see term_cap_test for details). In versions prior to 1.0, fansi would also skip exceeding SGRs entirely instead of interpreting them. You may add the string "old" to any otherwise valid term.cap spec to restore the pre 1.0 behavior. "old" will not interact with "all" the way other valid values for this parameter do.

classes

FALSE (default), TRUE, or character vector of either 16, 32, or 512 class names. Character strings may only contain ASCII characters corresponding to letters, numbers, the hyphen, or the underscore. It is the user's responsibility to provide values that are legal class names.

  • FALSE: All colors rendered as inline CSS styles.

  • TRUE: Each of the 256 basic colors is mapped to a class in form "fansi-color-###" (or "fansi-bgcol-###" for background colors) where "###" is a zero padded three digit number in 0:255. Basic colors specified with SGR codes 30-37 (or 40-47) map to 000:007, and bright ones specified with 90-97 (or 100-107) map to 008:015. 8 bit colors specified with SGR codes 38;5;### or 48;5;### map directly based on the value of "###". Implicitly, this maps the 8 bit colors in 0:7 to the basic colors, and those in 8:15 to the bright ones even though these are not exactly the same when using inline styles. "truecolor"s specified with 38;2;#;#;# or 48;2;#;#;# do not map to classes and are rendered as inline styles.

  • character(16): The eight basic colors are mapped to the string values in the vector, all others are rendered as inline CSS styles. Basic colors are mapped irrespective of whether they are encoded as the basic colors or as 8-bit colors. Sixteen elements are needed because there must be eight classes for foreground colors, and eight classes for background colors. Classes should be ordered in ascending order of color number, with foreground and background classes alternating starting with foreground (see examples).

  • character(32): Like character(16), except the basic and bright colors are mapped.

  • character(512): Like character(16), except the basic, bright, and all other 8-bit colors are mapped.

carry

TRUE, FALSE (default), or a scalar string, controls whether to interpret the character vector as a "single document" (TRUE or string) or as independent elements (FALSE). In "single document" mode, active state at the end of an input element is considered active at the beginning of the next vector element, simulating what happens with a document with active state at the end of a line. If FALSE each vector element is interpreted as if there were no active state when it begins. If character, then the active state at the end of the carry string is carried into the first element of x (see "Replacement Functions" for differences there). The carried state is injected in the interstice between an imaginary zeroeth character and the first character of a vector element. See the "Position Semantics" section of substr_ctl and the "State Interactions" section of ?fansi for details. Except for strwrap_ctl where NA is treated as the string "NA", carry will cause NAs in inputs to propagate through the remaining vector elements.

Details

Only "observable" formats are translated. These include colors, background-colors, and basic styles (CSI SGR codes 1-6, 8, 9). Style 7, the "inverse" style, is implemented by explicitly switching foreground and background colors, if there are any. Styles 5-6 (blink) are rendered as "text-decoration" but likely will do nothing in the browser. Style 8 (conceal) sets the color to transparent.

Parameters in OSC sequences are not copied over as they might have different semantics in the OSC sequences than they would in HTML (e.g. the "id" parameter is intended to be non-unique in OSC).

Each element of the input vector is translated into a stand-alone valid HTML string. In particular, any open tags generated by fansi are closed at the end of an element and re-opened on the subsequent element with the same style. This allows safe combination of HTML translated strings, for example by pasteing them together. The trade-off is that there may be redundant HTML produced. To reduce redundancy you can first collapse the input vector into one string, being mindful that very large strings may exceed maximum string size when converted to HTML.

fansi-opened tags are closed and new ones open anytime the "observable" state changes. to_html never produces nested tags, even if at times that might produce more compact output. While it would be possible to match a CSI/OSC encoded state with nested tags, it would increase the complexity of the code substantially for little gain.

Value

A character vector of the same length as x with all escape sequences removed and any basic ANSI CSI SGR escape sequences applied via SPAN HTML tags.

Note

Non-ASCII strings are converted to and returned in UTF-8 encoding.

to_html always terminates as not doing so produces invalid HTML. If you wish for the last active SPAN to bleed into subsequent text you may do so with e.g. sub("</span>(?:</a>)?$", "", x) or similar. Additionally, unlike other functions, the default is carry = TRUE for compatibility with semantics of prior versions of fansi.

See Also

Other HTML functions: html_esc(), in_html(), make_styles()

Examples

to_html("hello\033[31;42;1mworld\033[m")
to_html("hello\033[31;42;1mworld\033[m", classes=TRUE)

## Input contains HTML special chars
x <- "<hello \033[42m'there' \033[34m &amp;\033[m \"moon\"!"
writeLines(x)
## Not run: 
in_html(
  c(
    to_html(html_esc(x)),  # Good
    to_html(x)             # Bad (warning)!
) )

## End(Not run)
## Generate some class names for basic colors
classes <- expand.grid(
  "myclass",
  c("fg", "bg"),
  c("black", "red", "green", "yellow", "blue", "magenta", "cyan", "white")
)
classes  # order is important!
classes <- do.call(paste, c(classes, sep="-"))
## We only provide 16 classes, so Only basic colors are
## mapped to classes; others styled inline.
to_html(
  "\033[94mhello\033[m \033[31;42;1mworld\033[m",
  classes=classes
)
## Create a whole web page with a style sheet for 256 colors and
## the colors shown in a table.
class.256 <- do.call(paste, c(expand.grid(c("fg", "bg"), 0:255), sep="-"))
sgr.256 <- sgr_256()     # A demo of all 256 colors
writeLines(sgr.256[1:8]) # SGR formatting

## Convert to HTML using classes instead of inline styles:
html.256 <- to_html(sgr.256, classes=class.256)
writeLines(html.256[1])  # No inline colors

## Generate different style sheets.  See `?make_styles` for details.
default <- make_styles(class.256)
mix <- matrix(c(.6,.2,.2, .2,.6,.2, .2,.2,.6), 3)
desaturated <- make_styles(class.256, mix)
writeLines(default[1:4])
writeLines(desaturated[1:4])

## Embed in HTML page and diplay; only CSS changing
## Not run: 
in_html(html.256)                  # no CSS
in_html(html.256, css=default)     # default CSS
in_html(html.256, css=desaturated) # desaturated CSS

## End(Not run)

fansi documentation built on Oct. 9, 2023, 1:07 a.m.