seek | R Documentation |
These functions search through one or more text files, extract lines matching a regular expression pattern, and return a tibble containing the results.
seek(): Discovers files inside one or more directories (recursively or not), applies optional file name and text file filtering, and searches lines.
seek_in(): Searches inside a user-provided character vector of files.
seek(
  pattern,
  path = ".",
  ...,
  filter = NULL,
  negate = FALSE,
  recurse = FALSE,
  all = FALSE,
  relative_path = TRUE,
  matches = FALSE
)
seek_in(files, pattern, ..., matches = FALSE)
pattern |
A regular expression pattern used to match lines. |
path |
A character vector of one or more directories where files should be discovered (only for seek()). |
... |
Additional arguments passed to readr::read_lines(). |
filter |
Optional. A regular expression pattern used to filter file paths
before reading. If |
negate |
Logical. If |
recurse |
If |
all |
If |
relative_path |
Logical. If TRUE, file paths are made relative to the path argument. If multiple root paths are provided, relative_path is automatically ignored and absolute paths are kept to avoid ambiguity. |
matches |
Logical. If |
files |
A character vector of files to search (only for seek_in()). |
The overall process involves the following steps:
File Selection
seek(): Files are discovered using fs::dir_ls(), starting from one or more directories.
seek_in(): Files are directly supplied by the user (no discovery phase).
File Filtering
Files located inside .git/ folders are automatically excluded.
Files with known non-text extensions (e.g., .png, .exe, .rds) are excluded.
If a file's extension is unknown, a check is performed to detect embedded null bytes (binary indicator).
Optionally, an additional regex-based path filter (filter) can be applied.
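The filtering rules above can be approximated in base R. This is only a sketch, not the package's implementation: `looks_like_text()` is a hypothetical helper, the extension list is an assumed and incomplete stand-in for the real one, and for simplicity the null-byte check runs on every file that survives the extension test rather than only on files with unknown extensions.

```r
# Hypothetical helper: NOT part of the package's API.
# Returns TRUE when a file passes the (approximated) text-file checks.
looks_like_text <- function(file, n = 1024L) {
  # Skip anything stored under a .git/ directory (forward-slash paths assumed)
  if (grepl("(^|/)\\.git(/|$)", file)) return(FALSE)

  # Assumed, incomplete list of known non-text extensions
  binary_ext <- c("png", "exe", "rds")
  if (tolower(tools::file_ext(file)) %in% binary_ext) return(FALSE)

  # Read the leading bytes and look for an embedded null byte,
  # which is treated as an indicator of binary content
  bytes <- readBin(file, what = "raw", n = n)
  !any(bytes == as.raw(0))
}
```

Inspecting only the first `n` bytes keeps the check cheap on large files, at the cost of missing null bytes that appear later in the file.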
Line Reading
Files are read line-by-line using readr::read_lines().
Only lines matching the provided regular expression pattern are retained.
If a file cannot be read, it is skipped gracefully without failing the process.
Data Frame Construction
A tibble is constructed with one row per matched line.
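The read, match, and assemble steps can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: `search_lines()` is a hypothetical name, `readLines()` and `grepl()` stand in for readr::read_lines() and stringr::str_detect(), the result is a plain data frame instead of a tibble, and base R's PCRE engine (`perl = TRUE`) may differ from stringr's regex engine on some patterns.

```r
# Hypothetical helper: a base-R approximation of the steps described above.
search_lines <- function(files, pattern) {
  per_file <- lapply(files, function(f) {
    # An unreadable file yields NULL, so it is skipped instead of
    # stopping the whole search
    lines <- tryCatch(
      suppressWarnings(readLines(f, warn = FALSE)),
      error = function(e) NULL
    )
    if (is.null(lines)) return(NULL)

    # Keep only the lines that match the pattern
    hit <- grepl(pattern, lines, perl = TRUE)
    if (!any(hit)) return(NULL)

    data.frame(
      path = f,
      line_number = which(hit),
      # First matched substring on each retained line
      match = regmatches(lines[hit], regexpr(pattern, lines[hit], perl = TRUE)),
      line = lines[hit],
      stringsAsFactors = FALSE
    )
  })
  # One row per matched line; NULL when nothing matched anywhere
  do.call(rbind, per_file)
}
```

For instance, `search_lines(list.files(".", "\\.R$", full.names = TRUE), "(?i)todo")` would list case-insensitive TODO mentions in the R files of the working directory.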
These functions are particularly useful for analyzing source code, configuration files, logs, and other structured text data.
A tibble with one row per matched line, containing:
path: File path (relative or absolute).
line_number: Line number in the file.
match: The first matched substring.
matches: All matched substrings (if matches = TRUE).
line: Full content of the matching line.
fs::dir_ls(), readr::read_lines(), stringr::str_detect()
path = system.file("extdata", package = "seekr")
# Search all function definitions in R files
seek("[^\\s]+(?= (=|<-) function\\()", path, filter = "\\.R$")
# Search for usage of "TODO" comments in source code in a case insensitive way
seek("(?i)TODO", path, filter = "\\.R$")
# Search for error/warning in log files
seek("(?i)error", path, filter = "\\.log$")
# Search for config keys in YAML
seek("database:", path, filter = "\\.ya?ml$")
# Looking for "length" in all types of text files
seek("(?i)length", path)
# Search for specific CSV headers using seek_in() and reading only the first line
csv_files <- list.files(path, "\\.csv$", full.names = TRUE)
seek_in(csv_files, "(?i)specie", n_max = 1)