pathr is a collection of tools to extract, examine, and reconfigure elements of file paths. The package was born out of frustration with finding the right base R tools to grab certain parts of a path. Often these functions are located in packages that are part of the base install but not loaded by default (e.g., `tools::file_ext`). Additionally, many path manipulation functions in base R have long names that are difficult to remember and slow to type. Still other path manipulation tasks had me building my own custom tools via `strsplit` and `file.path`. pathr is designed to be a consistent set of tools that let the user solve most path-related needs by remembering 7 basic sets of parsing and manipulation tools (the first seven rows in the table of function usage found in the Function Usage section). The package is designed to be pipeable (easily used within a magrittr/pipeR pipeline), but piping is not required.
Functions typically fall into one of four task categories: (1) parsing, (2) manipulating, (3) examining, and (4) action. The main functions, their task categories, and descriptions are summarized in the table below:
| Function                              | Task       | Description                                               |
|---------------------------------------|------------|-----------------------------------------------------------|
| `parse_path`                          | parsing    | Parse path into elements (sub-directories & files)        |
| `front`/`back`                        | manipulate | Get first/last n elements of a path                       |
| `index`                               | manipulate | Get indexed elements of a path                            |
| `before`/`after`                      | manipulate | Get n elements before/after a regex occurrence            |
| `swap`/`swap_index`/`swap_regex`      | manipulate | Replace elements of a path                                |
| `file_path`                           | manipulate | Combine file paths                                        |
| `expand_path`                         | manipulate | Expand tilde prefixed file path                           |
| `normalize`                           | manipulate | Make all path separators forward slashes                  |
| `win_fix`                             | manipulate | Replace single backslash with a forward slash             |
| `file_ext`/`no_file_ext`              | manipulate | Get/remove file extensions                                |
| `tree`                                | examine    | View path structure as an ASCII style tree (experimental) |
| `indent_path`                         | examine    | View path hierarchy as an indented list                   |
| `copy_path`                           | action     | Copy path(s) to clipboard                                 |
| `open_path`                           | action     | Open path(s) (directories & file)                         |
To download the development version of pathr:
Download the zip ball or tar ball, decompress and run `R CMD INSTALL` on it, or use the pacman package to install the development version:
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/pathr")
You are welcome to:
* submit suggestions and bug-reports at: https://github.com/trinker/pathr/issues
* send a pull request on: https://github.com/trinker/pathr/
* compose a friendly e-mail to: tyler.rinker@gmail.com
if (!require("pacman")) install.packages("pacman")
pacman::p_load(pathr, magrittr)
data(files)
set.seed(11); (myfiles <- sample(files, 10))
The `parse_path` function simply splits an atomic vector of paths into a list of path elements, splitting on the slash separator. For example, my current working directory (the value of `getwd()`) becomes:
getwd() %>% parse_path()
While this isn't earth shattering, it allows the pathr manipulation functions to extract, replace, and recombine parts of the path elements into a sub-path. Here I use path to mean the original path (my current working directory in the example above). A path is simply a slash-separated mapping of the location of a file or directory within a hierarchy of sub-directories. These sub-directories (and the final file, if any) are the elements of the path. The final output from one of the manipulation functions is a sub-path of the original with at most the same number of elements as the original.
In this example I parse a multi-path vector:
myfiles %>% parse_path()
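For readers who want to see the idea without installing the package, the splitting step can be approximated in base R. This is only a sketch, not pathr's implementation; the second example path is hypothetical, and the variable names are mine.

```r
# Base-R approximation of the parsing step: split each path on "/".
# The second path is a made-up example for illustration.
paths <- c("~/Packages/qdap/R/cm_code.overlap.R",
           "~/Packages/qdap/DESCRIPTION")

# strsplit() returns a list with one character vector of elements per path
elements <- strsplit(paths, "/", fixed = TRUE)

elements[[1]]      # elements of the first path
lengths(elements)  # number of elements in each path
```

Each list entry holds the elements of one path, which is the shape the manipulation functions described below operate on.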
Once the path has been parsed, the individual elements can be extracted and/or replaced to form sub-paths. In this section I break the manipulation functions into (1) extracting, (2) replacing, (3) combining, and (4) expanding types. A few miscellaneous pathr functions that do not fit any of these four types are discussed at the end of the Manipulating section.
Extracting selects path elements by their numeric index or by their position relative to a matched regular expression. There are three sets of extracting functions: (1) `front`/`back`, (2) `index`, and (3) `before`/`after`. The first two sets select elements by their numeric position, while the last set extracts relative to a regular expression match.
The `front`/`back` set of functions works like `head`/`tail` (in fact these functions are used under the hood). The user can select the first `n` elements using `front` or the last `n` elements using `back`. These functions assume that the user wants the sub-path to include either the first or the last element of the path.
Here I replicate base R's `dirname` & `basename` functions using the default settings of `front` & `back`, as `dirname` is taking `head(x, -1)` elements (all but the last element) and `basename` is taking `tail(x, 1)` (the last element).
myfiles %>% parse_path() %>% front()
myfiles %>% dirname()
myfiles %>% parse_path() %>% back()
myfiles %>% basename()
But the `front`/`back` set is more versatile still, as demonstrated below:
myfiles %>% parse_path() %>% front(3)
myfiles %>% parse_path() %>% back(3)
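The `head`/`tail` analogy can be illustrated with base R alone. The sketch below is an approximation under my own assumptions (the helper `collapse` and the `*_like` names are hypothetical, not part of pathr): it splits a path, keeps the leading or trailing elements, and pastes them back together.

```r
# Base-R sketch of the head()/tail() idea behind front/back.
# Not pathr's code; `collapse` and the *_like names are hypothetical.
paths <- "~/Packages/qdap/R/cm_code.overlap.R"
elements <- strsplit(paths, "/", fixed = TRUE)

# Rejoin a vector of elements into a slash-separated sub-path
collapse <- function(x) paste(x, collapse = "/")

# All but the last element (dirname-like) and the last element (basename-like)
front_like <- vapply(elements, function(x) collapse(head(x, -1)), character(1))
back_like  <- vapply(elements, function(x) collapse(tail(x, 1)), character(1))

# First three elements and last two elements
front3 <- vapply(elements, function(x) collapse(head(x, 3)), character(1))
back2  <- vapply(elements, function(x) collapse(tail(x, 2)), character(1))
```

Here `back_like` matches `basename(paths)`, while `front_like` keeps the leading tilde unexpanded.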
The `index` function complements the `front`/`back` set by allowing the user to select the middle elements of a path. Unlike `front`/`back`, the `index` function does not require elements to be sequential. The user will get a sub-path equal in length to the length of the `inds` argument.
myfiles %>% parse_path() %>% index(4)
myfiles %>% parse_path() %>% index(2:4)
myfiles %>% parse_path() %>% index(c(2, 4))
The `before`/`after` set differs from the previous sets of manipulation functions in that it allows the user to select elements based on their content rather than their numeric position. The user provides a regular expression to match against. All elements `before` or `after` this regex match will be selected for use in the sub-path. The user may include the regex matched element by setting `include = TRUE`.
Here I extract all elements after the element matching the regex `"^qdap$"`.
myfiles %>% parse_path() %>% after("^qdap$")
The user can include the element that matched the regex as well using `include = TRUE`:
myfiles %>% parse_path() %>% after("^qdap$", include = TRUE)
Here I use `before` to extract all elements that precede the regex match, for paths that contain an element (here, a file name) ending in `.R`. If a path contains no matching element, `NA` is returned.
myfiles %>% parse_path() %>% before("\\.R$")
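The regex-relative selection can be sketched in base R as follows. This is an illustration under stated assumptions, not pathr's implementation: the function name `select_relative` and its `side` argument are hypothetical, and anchoring on the first matching element is my assumption.

```r
# Hypothetical base-R sketch of before/after-style selection on one
# parsed path. Assumes the first regex match is the anchor element.
select_relative <- function(elements, pattern,
                            side = c("after", "before"), include = FALSE) {
  side <- match.arg(side)
  hit <- grep(pattern, elements)[1]       # NA when nothing matches
  if (is.na(hit)) return(NA_character_)   # no match: return NA
  pos <- seq_along(elements)
  keep <- switch(side,
    after  = if (include) pos >= hit else pos > hit,
    before = if (include) pos <= hit else pos < hit
  )
  paste(elements[keep], collapse = "/")
}

elements <- c("~", "Packages", "qdap", "R", "cm_code.overlap.R")

after_qdap     <- select_relative(elements, "^qdap$", "after")
after_qdap_inc <- select_relative(elements, "^qdap$", "after", include = TRUE)
before_r_file  <- select_relative(elements, "\\.R$", "before")
no_match       <- select_relative(elements, "\\.py$", "before")
```

Note that `"\\.R$"` matches the file name but not the bare `R` directory, because the pattern requires a literal dot.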
Often the user will want to replace elements of a path with others. The `swap` function allows the user to match with a numeric index or a regular expression to determine the element locations to be replaced. The `swap_index` & `swap_regex` functions are less flexible than the more inclusive `swap` but are also more explicit, transparent, and pipeable. Preference is typically given to the latter `swap_xxx` functions in chained usage.
`swap`
In this scenario I replace the root tilde with `MyRoot`:
myfiles %>% parse_path() %>% swap(1, "MyRoot")
`swap_index` & `swap_regex`
In the next use I replace `qdap` with `textMining` by referencing the third element:
myfiles %>% parse_path() %>% swap_index(3, "textMining")
When the element position is unknown, `swap_regex` provides a means to replace elements:
myfiles %>% parse_path() %>% swap_regex("^qdap$", "textMining")
myfiles %>% parse_path() %>% swap_regex("\\.R$", "function.R")
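The two ways of locating an element to replace (by position or by regex) can be compared with base R's `replace`. This is a rough sketch rather than pathr's code; it assumes that a matched element is replaced as a whole.

```r
# Base-R sketch of index-based vs regex-based element replacement.
# Assumption: the whole matching element is swapped out.
elements <- c("~", "Packages", "qdap", "R", "cm_code.overlap.R")

# Replace by numeric position (third element)
by_index <- replace(elements, 3, "textMining")

# Replace by content: every element matching the regex
by_regex <- replace(elements, grepl("^qdap$", elements), "textMining")

# Rejoin into a path
swapped <- paste(by_regex, collapse = "/")
```

For this path both approaches give the same result; the regex form keeps working when `qdap` sits at a different depth.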
While the above tools produce sub-paths with an equal or smaller number of elements, `file_path` is a means to combine/construct file paths that may be longer than the original path/elements supplied. `file_path` is a wrapper for `base::file.path` that uses the underscore naming convention and normalizes the separator to be a single forward slash.
file_path("root", "mydir", paste0("file", 1:2, ".pdf"))
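To make the "wrapper plus separator normalization" idea concrete, here is a base-R sketch. The name `file_path_like` is hypothetical and the body is my approximation of the described behavior, not the package's source.

```r
# Hypothetical sketch: combine with file.path(), then force a single
# forward slash as the separator.
file_path_like <- function(...) {
  combined <- file.path(...)
  forward <- chartr("\\", "/", combined)  # backslashes -> forward slashes
  gsub("/+", "/", forward)                # collapse repeated slashes
}

pdfs <- file_path_like("root", "mydir", paste0("file", 1:2, ".pdf"))
tidy <- file_path_like("root/", "mydir")
```

Like `file.path`, the sketch is vectorized, so a vector of file names yields one combined path per name.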
This is especially useful when combined with extraction/replacement techniques to form new paths as shown below:
myfiles %>% parse_path() %>% after("R$", include = TRUE) %>% na.omit() %>% file_path("Root", "newPackage", .)
Like `file_path` above, `expand_path` produces a path that is longer than the input path. `expand_path` is a wrapper for `base::path.expand`, used to expand tilde prefixed paths by replacing the leading tilde with the user's home directory.
expand_path("~/mydir/subdir/myfile.pdf")
The user may have noticed that the example above, demonstrating `front`'s ability to mimic `dirname`, is incomplete. That is, the outputs from `front` and `dirname` are not identical. This is because `dirname`, by default, expands the tilde in the example `myfiles` whereas `front` does not. Simply adding `expand_path` to the end of the chain replicates `dirname` exactly.
myfiles %>% parse_path() %>% front() %>% expand_path()
As noted above, pathr contains a few functions that are not an extracting, replacing, combining, or expanding tool.
`normalize` replaces all path slash separators with an R friendly forward slash.
c("C:\\Users\\Tyler\\AppData\\Local\\Temp\\Rtmp2Ll9d9", "C:/R/R-3.2.2") %>% parse_path() %>% normalize()
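The separator replacement itself is a one-liner in base R. The sketch below shows the idea on an already parsed-free string; it is not how pathr implements `normalize`, only an equivalent transformation for a plain character vector.

```r
# Base-R sketch: turn every backslash separator into a forward slash.
# In R source a literal backslash is written as "\\".
win <- "C:\\Users\\Tyler\\AppData\\Local\\Temp\\Rtmp2Ll9d9"

# chartr() translates characters one-for-one
unix_style <- chartr("\\", "/", win)
```

Paths that already use forward slashes pass through `chartr` unchanged.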
`win_fix` reads an R-unfriendly Windows path (single backslashes) and replaces the backslashes with R-friendly forward slashes. This functionality can't be demonstrated within a knitr document because the single backslashes can't be parsed or copied to the clipboard from within R.
If the user were to copy the following path, `~\Packages\qdap\R\cm_code.overlap.R`, to the clipboard and run `win_fix()`, the result would be: `"~/Packages/qdap/R/cm_code.overlap.R"`
Functions of this class are designed to allow the user to examine structural aspects of a path and its contents.
`tree` allows the user to see the hierarchical structure of a path's contents (all the sub-directories and files contained within a parent directory) as a tree. With the default `use.data.tree = FALSE`, this function is OS dependent and requires that the tree program (tree for Windows or tree for Unix) be installed.
On my Windows system this is the tree structure for the pathr package as installed in my R library:
file_path(.libPaths(), "pathr") %>% tree()
Users concerned with OS dependence can use a data.tree implementation of `tree`. This version is slower but is uniform and requires no outside dependencies to be installed. Additionally, the output is a data.tree `"Node"` class and can be manipulated accordingly.
file_path(.libPaths(), "pathr") %>% tree(use.data.tree = TRUE)
`indent_path`, on the other hand, works on individual paths (not their contents) to visualize the hierarchical structure of a path's elements.
file_path(.libPaths(), "pathr/DESCRIPTION") %>% indent_path()
The last type of function in pathr uses the operating system to perform some sort of action.
`open_path` uses the operating system defaults to open directories and files. Its behavior depends on the operating system and its settings, so results may not be consistent across systems or across the default programs registered for each file type. Some files may not open at all.
open_path()
file_path(.libPaths(), "pathr") %>% open_path()
file_path(R.home(), "doc/html/about.html") %>% open_path()
`copy_path` uses the clipr package's `write_clip` function to write the current vector, `x`, to the clipboard but still returns `x`. This makes copying pipeable: the contents are copied yet still passed along the chain.
pacman::p_load(clipr)
R.home() %>% list.files(full.names = TRUE) %>% copy_path() %>% parse_path() %>% back(1)
## What was copied to the clipboard
clipr::read_clip()
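The "do a side effect, then hand the input back" pattern that makes `copy_path` pipeable can be sketched without a clipboard. Everything here is hypothetical: `tee_path` and the `seen` environment are stand-ins I made up so the example runs anywhere, with an assignment taking the place of the clipboard write.

```r
# Hypothetical sketch of a pass-through function: perform a side effect,
# then return the input so the chain can continue.
seen <- new.env()

tee_path <- function(x) {
  assign("last", x, envir = seen)  # stand-in for writing to the clipboard
  invisible(x)                     # return the input unchanged
}

# The input is recorded and still flows on to the next step
result <- basename(tee_path(c("a/b/c.R", "d/e.txt")))
recorded <- get("last", envir = seen)
```

Returning the input (here invisibly) is what lets such a function sit in the middle of a magrittr chain without interrupting it.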