regex_subset_linter: Require usage of direct methods for subsetting strings via...
In r-lib/lintr: A 'Linter' for R Code

regex_subset_linter

R Documentation

Require usage of direct methods for subsetting strings via regex

Description

Using value = TRUE in grep() returns the subset of the input that matches the pattern, e.g. grep("[a-m]", letters, value = TRUE) returns the first 13 elements of letters (a through m).

Usage

regex_subset_linter()

Details

letters[grep("[a-m]", letters)] and letters[grepl("[a-m]", letters)] both return the same result as grep("[a-m]", letters, value = TRUE), but more circuitously and more verbosely.
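The equivalence described above can be checked directly in base R. This is a minimal sketch; the variable names (direct, via_index, via_logical) are illustrative and not part of lintr.

```r
# Three ways to keep the elements of `letters` that match "[a-m]".
direct      <- grep("[a-m]", letters, value = TRUE)  # the form this linter prefers
via_index   <- letters[grep("[a-m]", letters)]       # subset by matching positions
via_logical <- letters[grepl("[a-m]", letters)]      # subset by logical mask

# All three produce the same character vector of 13 letters.
stopifnot(
  identical(direct, via_index),
  identical(direct, via_logical),
  length(direct) == 13L
)
```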

The stringr package also provides an even more readable alternative, namely str_subset(), which should be preferred to versions using str_detect() and str_which().
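The stringr forms can be compared the same way. The sketch below assumes the stringr package is installed; the fruits vector and the variable names are made up for illustration.

```r
# Assumes stringr is installed; `fruits` is hypothetical example data.
library(stringr)

fruits <- c("apple", "banana", "cherry", "avocado")

preferred  <- str_subset(fruits, "^a")          # direct subsetting
via_detect <- fruits[str_detect(fruits, "^a")]  # logical mask, then subset
via_which  <- fruits[str_which(fruits, "^a")]   # positions, then subset

# The indirect forms give the same result as str_subset().
stopifnot(
  identical(preferred, via_detect),
  identical(preferred, via_which)
)
```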

Exceptions

Note that x[grep(pattern, x)] and grep(pattern, x, value = TRUE) are not completely interchangeable when x is not a character vector (most commonly, when x is a factor): the latter returns a character vector, while the former remains a factor. It may still be preferable to refactor such code, since matching the pattern on levels(x) and using the result to subset can be faster.
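The factor case can be illustrated with a small example. This is a sketch under assumptions: the data and variable names are hypothetical, and subsetting with %in% is only one possible way to use the matched levels, not necessarily the refactoring the lintr authors have in mind.

```r
# Hypothetical factor with repeated values.
x <- factor(c("apple", "banana", "apple", "cherry", "avocado"))

by_index <- x[grep("^a", x)]             # subsetting keeps the factor class
by_value <- grep("^a", x, value = TRUE)  # value = TRUE returns character

stopifnot(is.factor(by_index), is.character(by_value))

# Match the pattern once per level, then subset by membership.
# This preserves the factor, like the x[grep(pattern, x)] form.
keep      <- grep("^a", levels(x), value = TRUE)
by_levels <- x[x %in% keep]

stopifnot(identical(by_levels, by_index))
```

The pattern is matched against each distinct level rather than against every element, which is where any speed-up on long factors with few levels would come from.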

Examples

library(lintr)

# will produce lints
lint(
  text = "x[grep(pattern, x)]",
  linters = regex_subset_linter()
)

lint(
  text = "x[stringr::str_which(x, pattern)]",
  linters = regex_subset_linter()
)

# okay
lint(
  text = "grep(pattern, x, value = TRUE)",
  linters = regex_subset_linter()
)

lint(
  text = "stringr::str_subset(x, pattern)",
  linters = regex_subset_linter()
)

r-lib/lintr documentation built on June 9, 2025, 7:45 a.m.