## Copyright (C) Brodie Gaslam
##
## This file is part of "fansi - ANSI Control Sequence Aware String Functions"
##
## This program is free software: you can redistribute it and/or modify
## it under the terms of the GNU General Public License as published by
## the Free Software Foundation, either version 2 or 3 of the License.
##
## This program is distributed in the hope that it will be useful,
## but WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
## GNU General Public License for more details.
##
## Go to <https://www.r-project.org/Licenses> for copies of the licenses.
#' Details About Manipulation of Strings Containing Control Sequences
#'
#' Counterparts to R string manipulation functions that account for
#' the effects of some ANSI X3.64 (a.k.a. ECMA-48, ISO-6429) control sequences.
#'
#' @section Control Characters and Sequences:
#'
#' Control characters and sequences are non-printing inline characters or
#' sequences initiated by them that can be used to modify terminal display and
#' behavior, for example by changing text color or cursor position.
#'
#' We will refer to X3.64/ECMA-48/ISO-6429 control characters and sequences as
#' "_Control Sequences_" hereafter.
#'
#' There are four types of _Control Sequences_ that `fansi` can treat
#' specially:
#'
#' * "C0" control characters, such as tabs and carriage returns (we include
#' delete in this set, even though technically it is not part of it).
#' * Sequences starting in "ESC[", also known as Control Sequence
#' Introducer (CSI) sequences, of which the Select Graphic Rendition (SGR)
#' sequences used to format terminal output are a subset.
#' * Sequences starting in "ESC]", also known as Operating System
#' Commands (OSC), of which the subset beginning with "8" is used to encode
#' URI based hyperlinks.
#' * Sequences starting in "ESC" and followed by something other than "[" or
#' "]".
#'
#' _Control Sequences_ starting with ESC are assumed to be two characters
#' long (including the ESC) unless they are of the CSI or OSC variety, in which
#' case their length is computed as per the [ECMA-48
#' specification](https://ecma-international.org/publications-and-standards/standards/ecma-48/),
#' with the exception that [OSC hyperlinks](#osc-hyperlinks) may be terminated
#' with BEL ("\\a") in addition to ST ("ESC\\"). `fansi` handles most common
#' _Control Sequences_ in its parsing algorithms, but it is not a conforming
#' implementation of ECMA-48. For example, there are non-CSI/OSC escape
#' sequences that may be longer than two characters, but `fansi` will
#' (incorrectly) treat them as if they were two characters long. There are many
#' more unimplemented ECMA-48 specifications.
#'
#' In theory it is possible to encode CSI sequences with a single byte
#' introducing character in the 0x80-0x9F range (0x9B for CSI) instead of the
#' traditional "ESC[". Since this is rare and it conflicts with UTF-8
#' encoding, `fansi` does not support it.
#'
#' Within _Control Sequences_, `fansi` further distinguishes CSI SGR and OSC
#' hyperlinks by recording format specification and URIs into string state, and
#' applying the same to any output strings according to the semantics of the
#' functions in use. CSI SGR and OSC hyperlinks are known together as _Special
#' Sequences_. See the following sections for details.
#'
#' Additionally, all _Control Sequences_, whether special or not,
#' do not count as characters, graphemes, or display width. You can cause
#' `fansi` to treat particular _Control Sequences_ as regular characters with
#' the `ctl` parameter.
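#'
#' As a minimal sketch of the point above (assuming `fansi` is installed; the
#' input string is made up for illustration), compare how base R and `fansi`
#' count a string that contains CSI SGR sequences:
#'
#' ```r
#' library(fansi)
#'
#' # "red" wrapped in an SGR "red foreground" sequence and an SGR reset
#' x <- "\033[31mred\033[0m"
#'
#' nchar(x)       # base R counts the escape sequence characters too
#' nchar_ctl(x)   # fansi counts only the three visible characters
#' strip_ctl(x)   # the same string with Control Sequences removed
#'
#' # Treat SGR sequences as ordinary characters: "all" combined with "sgr"
#' # is read as "all Control Sequences except SGR" (see the `ctl` parameter
#' # documentation of the function in use).
#' nchar_ctl(x, ctl = c("all", "sgr"))
#' ```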
#'
#' @section CSI SGR Control Sequences:
#'
#' **NOTE**: not all displays support CSI SGR sequences; run
#' [`term_cap_test`] to see whether your display supports them.
#'
#' CSI SGR Control Sequences are the subset of CSI sequences that can be
#' used to change text appearance (e.g. color). These sequences begin with
#' "ESC[" and end in "m". `fansi` interprets these sequences and writes new
#' ones to the output strings in such a way that the original formatting is
#' preserved. In most cases this should be transparent to the user.
#'
#' Occasionally there may be mismatches between how `fansi` and a display
#' interpret the CSI SGR sequences, which may produce display artifacts. The
#' most likely sources of artifacts are _Control Sequences_ that move
#' the cursor or change the display, or that `fansi` otherwise fails to
#' interpret, such as:
#'
#' * Unknown SGR substrings.
#' * "C0" control characters like tabs and carriage returns.
#' * Other escape sequences.
#'
#' Another possible source of problems is that different displays parse
#' and interpret control sequences differently. The common CSI SGR sequences
#' that you are likely to encounter in formatted text tend to be treated
#' consistently, but less common ones are not. `fansi` tries to hew to the
#' ECMA-48 specification **for CSI SGR control sequences**, but not all
#' terminals do.
#'
#' The most likely source of problems will be 24-bit CSI SGR sequences.
#' For example, a 24-bit color sequence such as "ESC[38;2;31;42;4m" is a
#' single foreground color to a terminal that supports it, or separate
#' foreground, background, faint, and underline specifications for one that does
#' not. `fansi` will always interpret the sequences according to ECMA-48, but
#' it will warn you if encountered sequences exceed those specified by
#' the `term.cap` parameter or the "fansi.term.cap" global option.
#'
#' `fansi` will also warn if it encounters _Control Sequences_ that it
#' cannot interpret. You can turn off warnings via the `warn` parameter, which
#' can be set globally via the "fansi.warn" option. You can work around "C0"
#' tab characters by turning them into spaces first with [`tabs_as_spaces`] or
#' with the `tabs.as.spaces` parameter available in some of the `fansi`
#' functions.
#'
#' `fansi` interprets CSI SGR sequences in cumulative "Graphic Rendition
#' Combination Mode". This means new SGR sequences add to rather than replace
#' previous ones, although in some cases the effect is the same as replacement
#' (e.g. if you have a color active and pick another one).
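#'
#' The following sketch illustrates how formatting is carried into output
#' strings (assuming `fansi` is installed; the input is hypothetical). The
#' exact sequences `fansi` writes may differ from the input ones, so the
#' example inspects the visible text rather than the raw sequences:
#'
#' ```r
#' library(fansi)
#'
#' x <- "\033[31mhello world\033[0m"
#'
#' # Extract "world"; fansi re-emits the active red foreground at the start
#' # of the substring and, by default, terminates it at the end.
#' y <- substr_ctl(x, 7, 11)
#' strip_ctl(y)     # visible text of the substring
#' has_ctl(y)       # whether the substring still carries Control Sequences
#'
#' # Tabs can be converted to spaces before further processing.
#' z <- tabs_as_spaces("a\tb")
#' ```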
#'
#' @section OSC Hyperlinks:
#'
#' Operating System Commands are interpreted by terminal emulators typically to
#' engage actions external to the display of text proper, such as setting a
#' window title or changing the active color palette.
#'
#' [Some terminals](https://iterm2.com/documentation-escape-codes.html) have
#' added support for associating URIs to text with OSCs in a similar way to
#' anchors in HTML, so `fansi` interprets them and outputs or terminates them as
#' needed. For example:
#'
#' ```
#' "\033]8;;xy.z\033\\LINK\033]8;;\033\\"
#' ```
#'
#' This might be interpreted as a link to the URI "xy.z". To make the encoding
#' pattern clearer, we replace "\033]" with "<OSC>" and "\033\\\\" with
#' "<ST>" below:
#'
#' ```
#' <OSC>8;;URI<ST>LINK TEXT<OSC>8;;<ST>
#' ```
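#'
#' The pattern can be assembled with base R string functions. In this sketch
#' `osc_link` is a hypothetical helper written for illustration, not a `fansi`
#' function:
#'
#' ```r
#' # Hypothetical helper: wrap `text` in an OSC 8 hyperlink pointing at `uri`,
#' # using ST (ESC followed by a backslash) as the terminator.
#' osc_link <- function(uri, text) {
#'   osc <- "\033]"
#'   st <- "\033\\"
#'   paste0(osc, "8;;", uri, st, text, osc, "8;;", st)
#' }
#'
#' link <- osc_link("xy.z", "LINK")
#'
#' # Matches the literal string shown earlier in this section.
#' identical(link, "\033]8;;xy.z\033\\LINK\033]8;;\033\\")
#' ```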
#'
#' @section State Interactions:
#'
#' The cumulative nature of state as specified by SGR or OSC hyperlinks means
#' that unterminated strings that are spliced will interact with each other.
#' By extension, a substring does not inherently contain all the information
#' required to recreate its state as it appeared in the source document. The
#' default `fansi` configuration terminates extracted substrings and prepends
#' original state to them so they present on a stand-alone basis as they did as
#' part of the original string.
#'
#' To allow state in substrings to affect subsequent strings set `terminate =
#' FALSE`, but you will need to manually terminate them or deal with the
#' consequences of not doing so (see "Terminal Quirks").
#'
#' By default, `fansi` assumes that each element in an input character vector is
#' independent, but this is incorrect if the input is a single document with
#' each element a line in it. In that situation state from each line should
#' bleed into subsequent ones. Setting `carry = TRUE` enables the "single
#' document" interpretation.
#'
#' To most closely approximate what `writeLines(x)` produces on your terminal,
#' where `x` is a stateful string, use `writeLines(fansi_fun(x, carry=TRUE,
#' terminate=FALSE))`. `fansi_fun` is a stand-in for any of the `fansi` string
#' manipulation functions. Note that even with a seeming "null-op" such as
#' `substr_ctl(x, 1, nchar_ctl(x), carry=TRUE, terminate=FALSE)` the output
#' control sequences may not match the input ones, but the output _should_ look
#' the same if displayed to the terminal.
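#'
#' A brief sketch of the difference `carry` makes (assuming `fansi` is
#' installed; the two-line input is made up for illustration):
#'
#' ```r
#' library(fansi)
#'
#' # Two lines of one document; the red foreground is never closed on line one.
#' x <- c("\033[31mline one", "line two")
#'
#' # Default: elements are independent, so the second one has no formatting.
#' indep <- substr_ctl(x, 1, 4)
#'
#' # carry = TRUE: state at the end of line one bleeds into line two.
#' carried <- substr_ctl(x, 1, 4, carry = TRUE)
#'
#' has_ctl(indep)     # which elements carry Control Sequences by default
#' has_ctl(carried)   # and with the "single document" interpretation
#' ```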
#'
#' `fansi` strings will be affected by any active state in strings they are
#' appended to. There are no parameters to control what happens in this case,
#' but `fansi` provides functions that can help the user get the desired
#' behavior. `state_at_end` computes the active state at the end of a string,
#' which can then be prepended onto the _input_ of `fansi` functions so that
#' they are aware of the active style at the beginning of the string.
#' Alternatively, one could use `close_state(state_at_end(...))` and prepend
#' that to the _output_ of `fansi` functions so they are unaffected by preceding
#' SGR. One could also just prepend "ESC[0m", but in some cases as
#' described in [`?normalize_state`][normalize_state] that is sub-optimal.
#'
#' If you intend to combine stateful `fansi` manipulated strings with your own,
#' it may be best to set `normalize = TRUE` for improved compatibility (see
#' [`?normalize_state`][normalize_state]).
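#'
#' As a sketch of the `close_state(state_at_end(...))` approach described
#' above (assuming `fansi` is installed; the strings are hypothetical), the
#' closing sequences can be placed between a stateful string and your own
#' text:
#'
#' ```r
#' library(fansi)
#'
#' # An unterminated string: the red foreground is still active at the end.
#' x <- "\033[31mred text"
#'
#' end.state <- state_at_end(x)      # sequences describing the active state
#' closer <- close_state(end.state)  # sequences that close that state
#'
#' # Prepend the closer so that `own` is unaffected by the preceding SGR.
#' own <- "plain text"
#' out <- paste0(x, closer, own)
#' ```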
#'
#' @section Terminal Quirks:
#'
#' Some terminals (e.g. OS X Terminal, iTerm2) will pre-paint the entirety of a
#' new line with the currently active background before writing the contents of
#' the line. If there is a non-default active background color, any unwritten
#' columns in the new line will keep the prior background color even if the new
#' line changes the background color. To avoid this be sure to use `terminate =
#' TRUE` or to manually terminate each line with e.g. "ESC[0m". The
#' problem manifests as:
#'
#' ```
#' " " = default background
#' "#" = new background
#' ">" = start new background
#' "!" = restore default background
#'
#' +-----------+
#' | abc\n |
#' |>###\n |
#' |!abc\n#####| <- trailing "#" after newline are from pre-paint
#' | abc |
#' +-----------+
#' ```
#'
#' The simplest way to avoid this problem is to split input strings by any
#' newlines they contain, and use `terminate = TRUE` (the default). A more
#' complex solution is to pad with spaces to the terminal window width before
#' emitting the newline to ensure the pre-paint is overpainted with the current
#' line's prevailing background color.
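#'
#' The simpler approach might be sketched as follows (assuming `fansi` is
#' installed; the input is made up for illustration). Each line becomes its own
#' element, and the substring call spans the whole of each element so that
#' only the state handling changes:
#'
#' ```r
#' library(fansi)
#'
#' # Green background starts on the second line and is never closed.
#' x <- "abc\n\033[42mdef\nghi"
#'
#' # One element per line.
#' lines <- unlist(strsplit(x, "\n", fixed = TRUE))
#'
#' # With terminate = TRUE (the default) each line is closed before the
#' # newline, and carry = TRUE re-opens the background on the following line.
#' out <- substr_ctl(lines, 1, nchar_ctl(lines), carry = TRUE)
#' writeLines(out)
#' ```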
#'
#' @section Encodings / UTF-8:
#'
#' `fansi` will convert any non-ASCII strings to UTF-8 before processing them,
#' and `fansi` functions that return strings will return them encoded in UTF-8.
#' In some cases this will be different to what base R does. For example,
#' `substr` re-encodes substrings to their original encoding.
#'
#' Interpretation of UTF-8 strings is intended to be consistent with base R.
#' There are three ways things may not work out exactly as desired:
#'
#' 1. `fansi`, despite its best intentions, handles a UTF-8 sequence differently
#' to the way R does.
#' 2. R incorrectly handles a UTF-8 sequence.
#' 3. Your display incorrectly handles a UTF-8 sequence.
#'
#' These issues are most likely to occur with invalid UTF-8 sequences,
#' combining character sequences, and emoji. For example, whether special
#' characters such as emoji are considered one or two wide evolves as software
#' implements newer versions of the Unicode databases.
#'
#' Internally, `fansi` computes the width of most UTF-8 character sequences
#' outside of the ASCII range using the native `R_nchar` function. This will
#' cause such characters to be processed slower than ASCII characters. Unlike R
#' (at least as of version 4.1), `fansi` can account for graphemes.
#'
#' Because `fansi` implements its own internal UTF-8 parsing it is possible
#' that you will see results different from those that R produces even on
#' strings without _Control Sequences_.
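#'
#' The encoding difference with base R described at the start of this section
#' can be checked with a small sketch (assuming `fansi` is installed; the
#' latin1 input is constructed for illustration):
#'
#' ```r
#' library(fansi)
#'
#' # A four letter word with an accented final letter, marked as latin1.
#' x <- iconv("caf\u00e9", from = "UTF-8", to = "latin1")
#' Encoding(x)
#'
#' Encoding(substr(x, 1, 4))       # base R keeps the original encoding
#' Encoding(substr_ctl(x, 1, 4))   # fansi returns strings encoded in UTF-8
#' ```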
#'
#' @section Overflow:
#'
#' The maximum length of input character vector elements allowed by `fansi` is
#' the 32 bit INT_MAX, excluding the terminating NUL. As of R 4.1 this is the
#' limit for R character vector elements generally, but is enforced at the C
#' level by `fansi` nonetheless.
#'
#' It is possible that during processing strings that are shorter than INT_MAX
#' would become longer than that. `fansi` checks for that overflow and will
#' stop with an error if that happens. A work-around for this situation is to
#' break up large strings into smaller ones. The limit is on each element of a
#' character vector, not on the vector as a whole. `fansi` will also error on
#' your system if `R_len_t`, the R type used to measure string lengths, is too
#' small to hold the processed length of the string.
#'
#' @section R < 3.2.2 support:
#'
#' Nominally you can build and run this package in R versions between 3.1.0 and
#' 3.2.1. Things should mostly work, but please be aware we do not run the test
#' suite under versions of R less than 3.2.2. One key degraded capability is
#' width computation of wide-display characters. Under R < 3.2.2 `fansi` will
#' assume every character is 1 display width. Additionally, `fansi` may not
#' always report malformed UTF-8 sequences as it usually does. One
#' exception to this is [`nchar_ctl`] as that is just a thin wrapper around
#' [`base::nchar`].
#'
#' @useDynLib fansi, .registration=TRUE, .fixes="FANSI_"
#' @docType package
#' @aliases fansi-package
#' @name fansi
NULL