strtrim_ctl: Control Sequence Aware Version of strtrim

View source: R/strtrim.R

strtrim_ctlR Documentation

Control Sequence Aware Version of strtrim

Description

A drop in replacement for base::strtrim, with the difference that all C0 control characters such as newlines, carriage returns, etc., are always treated as zero width, whereas in base it may vary with platform / R version.

Usage

strtrim_ctl(
  x,
  width,
  warn = getOption("fansi.warn", TRUE),
  ctl = "all",
  normalize = getOption("fansi.normalize", FALSE),
  carry = getOption("fansi.carry", FALSE),
  terminate = getOption("fansi.terminate", TRUE)
)

strtrim2_ctl(
  x,
  width,
  warn = getOption("fansi.warn", TRUE),
  tabs.as.spaces = getOption("fansi.tabs.as.spaces", FALSE),
  tab.stops = getOption("fansi.tab.stops", 8L),
  ctl = "all",
  normalize = getOption("fansi.normalize", FALSE),
  carry = getOption("fansi.carry", FALSE),
  terminate = getOption("fansi.terminate", TRUE)
)

Arguments

x

a character vector, or an object which can be coerced to a character vector by as.character.

width

positive integer values: recycled to the length of x.

warn

TRUE (default) or FALSE, whether to warn when potentially problematic Control Sequences are encountered. These could cause the assumptions fansi makes about how strings are rendered on your display to be incorrect, for example by moving the cursor (see ?fansi). At most one warning will be issued per element in each input vector. Will also warn about some badly encoded UTF-8 strings, but a lack of UTF-8 warnings is not a guarantee of correct encoding (use validUTF8 for that).

ctl

character, which Control Sequences should be treated specially. Special treatment is context dependent, and may include detecting them and/or computing their display/character width as zero. For the SGR subset of the ANSI CSI sequences, and OSC hyperlinks, fansi will also parse, interpret, and reapply the sequences as needed. You can modify whether a Control Sequence is treated specially with the ctl parameter.

  • "nl": newlines.

  • "c0": all other "C0" control characters (i.e. 0x01-0x1f, 0x7F), except for newlines and the actual ESC (0x1B) character.

  • "sgr": ANSI CSI SGR sequences.

  • "csi": all non-SGR ANSI CSI sequences.

  • "url": OSC hyperlinks

  • "osc": all non-OSC-hyperlink OSC sequences.

  • "esc": all other escape sequences.

  • "all": all of the above, except when used in combination with any of the above, in which case it means "all but".

normalize

TRUE or FALSE (default) whether SGR sequence should be normalized out such that there is one distinct sequence for each SGR code. normalized strings will occupy more space (e.g. "\033[31;42m" becomes "\033[31m\033[42m"), but will work better with code that assumes each SGR code will be in its own escape as crayon does.

carry

TRUE, FALSE (default), or a scalar string, controls whether to interpret the character vector as a "single document" (TRUE or string) or as independent elements (FALSE). In "single document" mode, active state at the end of an input element is considered active at the beginning of the next vector element, simulating what happens with a document with active state at the end of a line. If FALSE each vector element is interpreted as if there were no active state when it begins. If character, then the active state at the end of the carry string is carried into the first element of x (see "Replacement Functions" for differences there). The carried state is injected in the interstice between an imaginary zeroeth character and the first character of a vector element. See the "Position Semantics" section of substr_ctl and the "State Interactions" section of ?fansi for details. Except for strwrap_ctl where NA is treated as the string "NA", carry will cause NAs in inputs to propagate through the remaining vector elements.

terminate

TRUE (default) or FALSE whether substrings should have active state closed to avoid it bleeding into other strings they may be prepended onto. This does not stop state from carrying if carry = TRUE. See the "State Interactions" section of ?fansi for details.

tabs.as.spaces

FALSE (default) or TRUE, whether to convert tabs to spaces. This can only be set to TRUE if strip.spaces is FALSE.

tab.stops

integer(1:n) indicating position of tab stops to use when converting tabs to spaces. If there are more tabs in a line than defined tab stops the last tab stop is re-used. For the purposes of applying tab stops, each input line is considered a line and the character count begins from the beginning of the input line.

Details

strtrim2_ctl adds the option of converting tabs to spaces before trimming. This is the only difference between strtrim_ctl and strtrim2_ctl.

Value

Like base::strtrim, except that Control Sequences are treated as zero width.

Note

Non-ASCII strings are converted to and returned in UTF-8 encoding. Width calculations will not work properly in R < 3.2.2.

See Also

?fansi for details on how Control Sequences are interpreted, particularly if you are getting unexpected results, normalize_state for more details on what the normalize parameter does, state_at_end to compute active state at the end of strings, close_state to compute the sequence required to close active state.

Examples

strtrim_ctl("\033[42mHello world\033[m", 6)

fansi documentation built on Oct. 9, 2023, 1:07 a.m.