desc <- suppressWarnings(readLines("DESCRIPTION"))
regex <- "(^Version:\\s+)(\\d+\\.\\d+\\.\\d+)"
loc <- grep(regex, desc)
ver <- gsub(regex, "\\2", desc[loc])
verbadge <- sprintf('<a href="https://img.shields.io/badge/Version-%s-orange.svg"><img src="https://img.shields.io/badge/Version-%s-orange.svg" alt="Version"/></a></p>', ver, ver)
````

```r
library(knitr)
knit_hooks$set(htmlcap = function(before, options, envir) {
  if(!before) {
    paste('<p class="caption"><b><em>',options$htmlcap,"</em></b></p>",sep="")
    }
    })
library(pathr)
knitr::opts_knit$set(self.contained = TRUE, cache = FALSE)
knitr::opts_chunk$set(fig.path = "tools/figure/")
curwd <- getwd()
parsedwd <- parse_path(curwd)
```

Project Status: Active - The project has reached a stable, usable state and is being actively developed.

pathr is a collection of tools to extract, examine, and reconfigure elements of file paths. The package was born out of frustration with finding the right base R tools to grab certain parts of a path. Often these functions are located in packages that are part of the base install but not loaded by default (e.g., tools::file_ext). Additionally, many path manipulation functions in base R have long names that are difficult to remember and take more time to type. Still other path manipulation tasks had me building my own custom tools via strsplit and file.path. pathr is designed to be a consistent set of tools that lets the user solve most path related needs by remembering 7 basic sets of parsing and manipulation tools (the first seven rows in the table found in the Functions section). The package is designed to be pipeable (easily used within a magrittr/pipeR pipeline), but piping is not required.

Functions

Functions typically fall into the task category of (1) parsing, (2) manipulating, (3) examining, & (4) action. The main functions, task category, & descriptions are summarized in the table below:

| Function                   | Task       | Description                                               |
|----------------------------|------------|-----------------------------------------------------------|
| parse_path                 | parsing    | Parse path into elements (sub-directories & files)        |
| front/back                 | manipulate | Get first/last n elements of a path                       |
| index                      | manipulate | Get indexed elements of a path                            |
| before/after               | manipulate | Get n elements before/after a regex occurrence            |
| swap/swap_index/swap_regex | manipulate | Replace elements of a path                                |
| file_path                  | manipulate | Combine file paths                                        |
| expand_path                | manipulate | Expand tilde prefixed file path                           |
| normalize                  | manipulate | Make all path separators forward slashes                  |
| win_fix                    | manipulate | Replace single backslash with a forward slash             |
| file_ext/no_file_ext       | manipulate | Get/remove file extensions                                |
| tree                       | examine    | View path structure as an ASCII style tree (experimental) |
| indent_path                | examine    | View path hierarchy as an indented list                   |
| copy_path                  | action     | Copy path(s) to clipboard                                 |
| open_path                  | action     | Open path(s) (directories & file)                         |

Installation

To download the development version of pathr:

Download the zip ball or tar ball, decompress and run R CMD INSTALL on it, or use the pacman package to install the development version:

if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/pathr")

Contact

You are welcome to:

* submit suggestions and bug-reports at: https://github.com/trinker/pathr/issues
* send a pull request on: https://github.com/trinker/pathr/
* compose a friendly e-mail to: tyler.rinker@gmail.com

Demonstration

Load the Packages/Data

if (!require("pacman")) install.packages("pacman")

pacman::p_load(pathr, magrittr)

data(files)
set.seed(11); (myfiles <- sample(files, 10))

Parsing

The parse_path function simply splits an atomic vector of paths into a list of path elements, splitting on the slash separator. For example, my current working directory (the value returned by getwd()) becomes:

getwd() %>%
    parse_path()

While this isn't earth shattering, it allows the pathr manipulation functions to extract, replace, and recombine parts of the path elements into a sub-path. Here I use path to mean the original path (my current working directory). A path is simply a slash separated mapping of the location of a file or directory within a hierarchical order of sub-directories. These sub-directories are the elements of the path. The final output from one of the manipulation functions is a sub-path of the original with at most the same number of elements as the original.
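To make the idea of path elements concrete, here is a minimal base R sketch of the splitting step. The helper name split_path and the sample paths are hypothetical, and pathr's parse_path may handle separators and edge cases differently.

```r
# Base R sketch only: split each path on forward or back slashes.
# `split_path` is a hypothetical helper, not part of pathr.
split_path <- function(x) {
    strsplit(x, "[/\\\\]+")
}

paths <- c("~/Packages/qdap/R/cm_code.overlap.R", "~/Packages/qdap/DESCRIPTION")
elements <- split_path(paths)

# One list entry per path, one character element per path component
lengths(elements)
elements[[2]]
```

Each list entry holds the elements of one path, which is the shape the manipulation functions described later operate on.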

In this example I parse a multi-path vector:

myfiles %>%
    parse_path()

Manipulating

Once the path has been parsed the individual elements can be extracted and/or replaced to form sub-paths. In this section I break the manipulation functions into (1) extracting (2) replacing, (3) combining, and (4) expanding types. There are a few miscellaneous pathr functions that are not an extracting, replacing, combining, or expanding tool which will be discussed at the end of the Manipulating section.

Extracting

Extracting selects path elements by their numeric index or by their position relative to a matched regular expression. There are three sets of extracting functions: (1) front/back, (2) index, and (3) before/after. The first two rely on matching elements to their numeric position while the latter set extracts relative to a regular expression match.

Front & Back

The front/back set of functions works like head/tail (in fact these functions are used under the hood). The user can select the first n elements using front or the last n elements using back. The resulting sub-path therefore always includes either the first or the last element of the path.

Here I replicate base R's dirname & basename functions using the default settings of front & back as dirname is taking head(x, -1) elements (or all but the last element) and basename is taking tail(x, 1) (or the last element).

myfiles %>%
    parse_path() %>%
    front()

myfiles %>%
    dirname()
myfiles %>%
    parse_path() %>% 
    back()

myfiles %>%
    basename()

But the front/back set is more versatile still as demonstrated below:

myfiles %>%
    parse_path() %>%
    front(3)

myfiles %>%
    parse_path()  %>% 
    back(3)
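Since front and back are described as using head and tail under the hood, the behavior above can be approximated in base R. This is a sketch under that assumption; first_elements and last_elements are hypothetical stand-ins, and the real functions may differ in arguments and edge-case handling.

```r
# Hypothetical stand-ins for front()/back() built on head()/tail()
first_elements <- function(x, n = -1) {
    vapply(x, function(el) paste(head(el, n), collapse = "/"), character(1))
}
last_elements <- function(x, n = 1) {
    vapply(x, function(el) paste(tail(el, n), collapse = "/"), character(1))
}

# A parsed path: a list holding one character vector of elements
parsed <- strsplit("~/Packages/qdap/R/cm_code.overlap.R", "/", fixed = TRUE)

first_elements(parsed)      # all but the last element, like dirname()
last_elements(parsed)       # the last element, like basename()
first_elements(parsed, 3)   # the first three elements
last_elements(parsed, 3)    # the last three elements
```

A negative n passed to head drops elements from the end, which is why the default of -1 mimics dirname.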

Indices

The index function complements the front/back set by allowing the user to select the middle elements of a path. Unlike front/back, the index function does not require elements to be sequential. The user will get a sub-path equal in length to the length of the inds argument.

myfiles %>%
    parse_path() %>%
    index(4)

myfiles %>%
    parse_path() %>%
    index(2:4)

myfiles %>%
    parse_path() %>%
    index(c(2, 4))
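The same selection can be sketched with ordinary vector subsetting in base R. The helper pick is hypothetical and only illustrates that the indices need not be sequential; it is not pathr's implementation.

```r
# Elements of a single parsed path
el <- strsplit("~/Packages/qdap/R/cm_code.overlap.R", "/", fixed = TRUE)[[1]]

# Hypothetical helper: keep the elements at `inds` and rejoin them
pick <- function(el, inds) paste(el[inds], collapse = "/")

pick(el, 4)         # "R"
pick(el, 2:4)       # "Packages/qdap/R"
pick(el, c(2, 4))   # "Packages/R"
```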

Before & After Regex

The before/after set differs from the previous sets of manipulation functions in that it allows the user to select elements based on their content rather than their numeric position. The user provides a regular expression to match against. All elements before or after this regex match will be selected for use in the sub-path. The user may include the regex matched element by setting include = TRUE.

Here I extract all elements after the element containing the regex "^qdap$".

myfiles %>%
    parse_path() %>%
    after("^qdap$")

The user can include the element that matched the regex as well using include = TRUE:

myfiles %>%
    parse_path() %>%
    after("^qdap$", include = TRUE)

Here I use before to extract all elements before the regex match for paths that contain a file element ending in .R. If a path does not contain a matching element, NA is returned.

myfiles %>%
    parse_path() %>%
    before("\\.R$")
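The logic of selecting elements relative to a regex match can be sketched in base R as follows. elements_after is a hypothetical helper that uses the first matching element; how pathr treats multiple matches or a match on the final element is not stated above, so those choices are assumptions.

```r
# Hypothetical helper: keep the elements after the first regex match
elements_after <- function(el, pattern, include = FALSE) {
    hit <- grep(pattern, el)[1]             # NA when nothing matches
    if (is.na(hit)) return(NA_character_)   # mirrors the NA behavior described above
    keep <- if (include) seq_along(el) >= hit else seq_along(el) > hit
    paste(el[keep], collapse = "/")
}

el <- c("~", "Packages", "qdap", "R", "cm_code.overlap.R")

elements_after(el, "^qdap$")                   # "R/cm_code.overlap.R"
elements_after(el, "^qdap$", include = TRUE)   # "qdap/R/cm_code.overlap.R"
elements_after(el, "^tools$")                  # NA
```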

Replacing

Often the user will want to replace elements of a path with others. The swap function allows the user to match with a numeric index or a regular expression to determine the element locations to be replaced. The swap_index & swap_regex functions are less flexible than the more inclusive swap function but are also more explicit, transparent, and pipeable. Preference is typically given to the latter swap_xxx functions in chained usage.

swap

In this scenario I replace the root tilde with MyRoot:

myfiles %>%
    parse_path() %>%
    swap(1, "MyRoot")

swap_index & swap_regex

In the next use I replace qdap with textMining by referencing the third element:

myfiles %>%
    parse_path() %>%
    swap_index(3, "textMining")

When the element position is unknown swap_regex provides a means to replace elements:

myfiles %>%
    parse_path() %>%
    swap_regex("^qdap$", "textMining")

myfiles %>%
    parse_path() %>%
    swap_regex("\\.R$", "function.R")
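Both styles of replacement reduce to assigning into a vector of elements. The base R sketch below uses the hypothetical helpers swap_by_index and swap_by_regex to show the difference between positional and content-based matching; it is an illustration, not pathr's code.

```r
# Hypothetical helper: replace elements at known positions
swap_by_index <- function(el, inds, replacement) {
    el[inds] <- replacement
    paste(el, collapse = "/")
}

# Hypothetical helper: replace every element matching a regex
swap_by_regex <- function(el, pattern, replacement) {
    el[grepl(pattern, el)] <- replacement
    paste(el, collapse = "/")
}

el <- c("~", "Packages", "qdap", "R", "cm_code.overlap.R")

swap_by_index(el, 3, "textMining")
swap_by_regex(el, "^qdap$", "textMining")
swap_by_regex(el, "\\.R$", "function.R")
```

The first two calls give the same result because qdap happens to be the third element; the regex version keeps working when the position is unknown.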

Combining

While the tools above produce sub-paths with at most as many elements as the original, file_path is a means to combine/construct file paths that may be longer than the original path/elements supplied. file_path is a wrapper for base::file.path that uses the underscore naming convention and normalizes the separator to be a single forward slash.

file_path("root", "mydir", paste0("file", 1:2, ".pdf"))

This is especially useful when combined with extraction/replacement techniques to form new paths as shown below:

myfiles %>%
    parse_path()  %>% 
    after("R$", include = TRUE)  %>%
    na.omit() %>%
    file_path("Root", "newPackage", .)
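For comparison, the underlying base::file.path call is vectorized in the same way, and the separator clean-up can be approximated with gsub. This is a sketch of the idea; the exact normalization file_path performs is an assumption here.

```r
# file.path recycles shorter arguments across the longest one
combined <- file.path("root", "mydir", paste0("file", 1:2, ".pdf"))
combined

# A trailing slash on a component yields a doubled separator ...
doubled <- file.path("root/", "mydir", "file1.pdf")

# ... which can be collapsed to a single forward slash
cleaned <- gsub("[/\\\\]+", "/", doubled)
cleaned
```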

Expanding

Like file_path above, expand_path produces a path that is longer than the input path. expand_path is a wrapper for base::path.expand used to expand tilde prefixed paths by replacing the leading tilde with the user's home directory.

expand_path("~/mydir/subdir/myfile.pdf")

The user may have noticed that the example above demonstrating front's ability to mimic dirname is incomplete. That is, the outputs from front and dirname are not identical. This is because dirname, by default, expands the tilde in the example myfiles whereas front does not. Simply adding expand_path on the end of the chain replicates dirname exactly.

myfiles %>%
    parse_path() %>% 
    front() %>%
    expand_path()
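The tilde behavior can be checked directly with the base functions involved. The sketch below uses a hypothetical path; the expanded value depends on the home directory of whoever runs it, so no output is shown.

```r
p <- "~/mydir/subdir/myfile.pdf"

# path.expand replaces a leading tilde with the home directory when one can be determined
expanded <- path.expand(p)
expanded

# dirname performs tilde expansion itself, so it should agree with
# expanding the tilde-prefixed parent directory
identical(dirname(p), path.expand("~/mydir/subdir"))
```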

Miscellaneous

As noted above, pathr contains a few functions that are not extracting, replacing, combining, or expanding tools.

Normalizing

normalize replaces all path slash separators with an R friendly forward slash.

c("C:\\Users\\Tyler\\AppData\\Local\\Temp\\Rtmp2Ll9d9", "C:/R/R-3.2.2") %>%
    parse_path() %>%
    normalize()
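The separator replacement itself can be sketched in base R with a fixed-string substitution. This only illustrates the idea on already-escaped backslashes and is not pathr's implementation.

```r
win_paths <- c("C:\\Users\\Tyler\\AppData\\Local\\Temp\\Rtmp2Ll9d9", "C:/R/R-3.2.2")

# fixed = TRUE treats "\\" as one literal backslash rather than a regex escape
forward <- gsub("\\", "/", win_paths, fixed = TRUE)
forward
```

Paths that already use forward slashes pass through unchanged.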

Windows Fix (single backslashes)

win_fix reads an R unfriendly Windows path (single backslashes) and replaces the backslashes with R friendly forward slashes. This functionality can't be demonstrated within a knitr document because the single backslashes can't be parsed or copied to the clipboard from within R.

If the user were to copy the following path, ~\Packages\qdap\R\cm_code.overlap.R, to the clipboard and run win_fix() the result would be: "~/Packages/qdap/R/cm_code.overlap.R"

Examination

Functions of this class are designed to allow the user to examine structural aspects of a path and contents.

Tree

tree allows the user to see the hierarchical structure of a path's contents (all the sub-directories and files contained within a parent directory) as a tree. This function is (with default use.data.tree = FALSE) OS dependent, and requires that the tree program (tree for Windows or tree for Unix) be installed.

On my Windows system this is the tree structure for the pathr package as installed in R library:

file_path(.libPaths(), "pathr") %>%
    tree()

Users concerned with OS dependence can use a data.tree implementation of tree. This version is slower but is uniform and requires no outside dependencies to be installed. Additionally, the output is a data.tree "Node" class and can be manipulated accordingly.

file_path(.libPaths(), "pathr") %>%
    tree(use.data.tree = TRUE)
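When neither the tree program nor data.tree is available, the contents of a directory can still be listed recursively with base R. The sketch below builds a small throwaway directory (the names are hypothetical) and lists it; it produces a flat listing, not the ASCII tree that tree draws.

```r
# Build a tiny demo directory under the session's temp folder
root <- file.path(tempdir(), "pathr_tree_demo")
dir.create(file.path(root, "R"), recursive = TRUE, showWarnings = FALSE)
created <- file.create(file.path(root, "DESCRIPTION"), file.path(root, "R", "utils.R"))

# Relative paths of every file and sub-directory below root
contents <- list.files(root, recursive = TRUE, include.dirs = TRUE)
contents
```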

Indented Elements

indent_path, on the other hand, works on individual paths (not contents) to visualize the hierarchical structure of a path's elements.

file_path(.libPaths(), "pathr/DESCRIPTION") %>%
    indent_path()
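The idea of an indented view can be sketched in base R by padding each element according to its depth. show_indented is a hypothetical helper, and its layout is an assumption rather than the format indent_path actually prints.

```r
# Hypothetical helper: print each element one level deeper than the last
show_indented <- function(path, pad = "  ") {
    el <- strsplit(path, "/", fixed = TRUE)[[1]]
    lines <- paste0(strrep(pad, seq_along(el) - 1), el)
    cat(lines, sep = "\n")
    invisible(lines)   # return the lines so they can be reused
}

indented <- show_indented("library/pathr/DESCRIPTION")
```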

Action

The last type of function in pathr uses the operating system to perform some sort of action.

Opening

open_path uses the operating system defaults to open directories and files. This function is operating system and setting dependent. Results may not be consistent across operating systems. Depending upon the default programs for file types the results may vary as well. Some files may not be able to be opened.

Opening Directories

open_path()

file_path(.libPaths(), "pathr") %>%
    open_path()

Opening Files

file_path(R.home(), "doc/html/about.html") %>%
    open_path()

Copying

copy_path uses the clipr package's write_clip function to write the current vector, x, to the clipboard but still returns x. This makes the copying pipeable, allowing the contents to be copied yet be passed along in the chain.

pacman::p_load(clipr)

R.home() %>%
    list.files(full.names = TRUE) %>%
    copy_path() %>%
    parse_path() %>%
    back(1) 

## What was copied to the clipboard
clipr::read_clip()
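The pattern that makes copy_path pipeable is a function that performs a side effect and then returns its input unchanged. The sketch below shows that pattern with a hypothetical helper, log_and_pass, which writes to a file instead of the clipboard so it runs without clipr or a clipboard being available.

```r
# Hypothetical helper: perform a side effect, then hand the input back
log_and_pass <- function(x, con) {
    writeLines(x, con)   # side effect (copy_path writes to the clipboard instead)
    x                    # return the input so a chain can continue
}

out_file <- tempfile(fileext = ".txt")
paths <- c("C:/R/bin", "C:/R/doc")

# The paths are recorded and still available to the next step
result <- basename(log_and_pass(paths, out_file))
result

## What was written as the side effect
recorded <- readLines(out_file)
recorded
```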


trinker/pathr documentation built on May 31, 2019, 9:41 p.m.