```r
desc <- suppressWarnings(readLines("DESCRIPTION"))
regex <- "(^Version:\\s+)(\\d+\\.\\d+\\.\\d+)"
loc <- grep(regex, desc)
ver <- gsub(regex, "\\2", desc[loc])
verbad <- sprintf('<a href="https://img.shields.io/badge/Version-%s-orange.svg"><img src="https://img.shields.io/badge/Version-%s-orange.svg" alt="Version"/></a></p>', ver, ver)
```

[Build Status](https://travis-ci.org/trinker/regexr)
[Coverage Status](https://coveralls.io/r/trinker/regexr)
[DOI](http://dx.doi.org/10.5281/zenodo.13496)
`r verbad`

<img src="inst/regexr_logo/r_regexr.png" alt="regexr logo">

> One of the most powerful tools in writing maintainable code is break large methods into well-named smaller methods - a technique Kent Beck refers to as the Composed Method pattern. [-Martin Fowler-](http://martinfowler.com/bliki/ComposedRegex.html)

[regexr](http://trinker.github.com/regexr_dev) is an R framework for constructing and managing human readable regular expressions. It aims to provide tools that enable the user to write regular expressions in a way that is similar to the ways R code is written. The tools allow the user to:

1. Write in smaller, modular, named, *sub-expressions*
2. Write top to bottom, rather than a single string
3. Comment individual chunks
4. Indent expressions to represent regular expression groups
5. Add vertical line spaces and R comments (i.e., `#`)
6. Test the validity of the *concatenated expression* and the modular *sub-expressions*

This framework harnesses the power and flexibility of regular expressions but provides a structural frame that is more consistent with both code writing and natural language conventions. The user decides how to break, indent, name, and comment the sub-expressions in a way that is human readable, meaningful, and modular.

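As a rough illustration of points 1 to 3 in the list above, the base R sketch below builds a pattern from named, commented pieces and pastes them together. It uses no regexr functions; the names `parts` and `pattern` are assumptions made for this example and are not part of the package.

```r
# Base R only: a hypothetical stand-in for the idea, not regexr's implementation.
parts <- c(
    year  = "\\d{4}",           # four-digit year
    sep   = "-",                # literal separator
    month = "(0[1-9]|1[0-2])"   # two-digit month, 01 to 12
)

# Concatenate the named sub-expressions into a single regular expression
pattern <- paste(parts, collapse = "")

# "2015-01" matches; "2015-13" does not, because 13 is not a valid month
grepl(pattern, c("2015-01", "2015-13"), perl = TRUE)
```

Keeping each piece on its own line leaves room for a name and a comment, which is the same readability benefit the framework aims for.
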
## Installation

To download the development version of regexr:

Download the [zip ball](https://github.com/trinker/regexr/zipball/master) or [tar ball](https://github.com/trinker/regexr/tarball/master), decompress and run `R CMD INSTALL` on it, or use the **pacman** package to install the development version:

```r
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/regexr")
```

## Contact

You are welcome to:

* submit suggestions and bug-reports at: https://github.com/trinker/regexr/issues
* send a pull request on: https://github.com/trinker/regexr/
* compose a friendly e-mail to: tyler.rinker@gmail.com

```r
library(regexr)
thefuns <- readLines("inst/functions_table/functions.R")
cat(paste(thefuns, collapse="\n"))
```

```r
library(regexr)
```

The `construct` function creates an object of the class `regexr`. This is a character string with meta expression information (i.e., sub-expressions with corresponding names and comments) contained in the object's attributes.

The `%:)%` binary operator allows the user to optionally add comments to the sub-expressions. The `%:)%`, containing a smiley face emoticon, is used here because commented code/sub-expressions is happy code☺.

```r
m <- construct(
    space   = "\\s+"          %:)% "I see",
    simp    = "(?<=(foo))",
    or      = "(;|:)\\s*"     %:)% "comment on what this does",
    is_then = "[ia]s th[ae]n"
)

m
```

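To make the "character string with attributes" idea concrete, here is a base R sketch of an object with that general shape. The attribute names (`subs`, `comments`) and the class name `my_regex` are hypothetical and chosen only for illustration; the attributes regexr actually uses may differ.

```r
# Hypothetical sketch in base R; not regexr's internal representation.
subs <- c(space = "\\s+", or = "(;|:)\\s*")
comments <- c(space = "one or more spaces",
              or    = "semicolon or colon, then optional spaces")

x <- structure(
    paste(subs, collapse = ""),   # the concatenated expression is the string itself
    subs = subs,                  # sub-expressions ride along as an attribute
    comments = comments,          # so do their comments
    class = c("my_regex", "character")
)

# The object can still be used as a pattern in base regex functions
grepl(x, "foo ; bar", perl = TRUE)

# The meta information can be read back from the attributes
attr(x, "subs")
attr(x, "comments")
```

Because the extra information lives in attributes, the object can be passed anywhere a plain pattern string is expected.
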
To see a larger script of a regular expression managed by regexr for the qdapRegex package CLICK HERE.

## Summary of a `regexr` Object

The generic `summary` function provides an integrated view of the sub-expressions, with corresponding comments and names, which make up the concatenated expression.

```r
summary(m)
```

## Splitting a `regexr` Object

The `unglue` function splits the concatenated `regexr` expression into sub-expressions.

```r
unglue(m)
```

## Viewing and Altering a `regexr` Object

The `subs`, `comments`, and `names` functions allow the user to view and alter the sub-expressions, comments, and names of the sub-expressions from a `regexr` object.

```r
subs(m)
comments(m)
names(m)
subs(m)[4] <- "(FO{2})|(BAR)"
comments(m)[4] <- "Look for FOO or BAR"
names(m)[4] <- "foo_bar"
summary(m)
```

The `test` function allows the user to check if the concatenated `regexr` expression and sub-expressions are valid regular expressions.

```r
test(m)
subs(m)[5:7] <- c("(", "([A-Z]|(\\d{5})", ")")
test(m)
```

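For intuition about what a validity check of this kind involves, the base R sketch below tries to compile each pattern and reports whether that succeeded. The helper `is_valid_regex` is a hypothetical name introduced here; it is not a regexr function and may not reflect how `test` is implemented.

```r
# Hypothetical helper: TRUE if `pattern` compiles as a PCRE regex, FALSE otherwise.
is_valid_regex <- function(pattern) {
    tryCatch(
        {
            grepl(pattern, "", perl = TRUE)  # compilation fails on an invalid pattern
            TRUE
        },
        warning = function(w) FALSE,
        error = function(e) FALSE
    )
}

pieces <- c("(", "[a-z]+", ")")

# Each piece checked on its own: FALSE TRUE FALSE
vapply(pieces, is_valid_regex, logical(1), USE.NAMES = FALSE)

# The pieces concatenated into "([a-z]+)": TRUE
is_valid_regex(paste(pieces, collapse = ""))
```

In this toy case the unbalanced parentheses fail individually while the concatenated pattern is valid, which shows why checking both levels can be informative.
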
## `regexr`: Reverse Construction

`as.regexr` allows the user to construct `regexr` objects from a regular expression and, in the process, generate an auto-commented and named sub-expressions `construct` script.

```r
library("qdapRegex")
(myregex <- grab("@rm_time"))
out <- as.regexr(myregex)

summary(out)
```

We can use `get_construct` to extract an auto-commented and named `construct` script that can be optionally altered and used to `construct` a `regexr` object.

```r
get_construct(out)
```

Some may prefer that the `construct` script contains no names and/or comments. The user may also wish to place comments indented below the sub-expressions or names outdented and above the sub-expressions.

```r
myregex2 <- "(\\s*[a-z]+)([^)]+\\))"
get_construct(as.regexr(myregex2, comments.below = TRUE, names.above = TRUE))
get_construct(as.regexr(myregex2, names = FALSE))
```

Richard Cotton maintains the `rebus` package to provide natural language based functions and constants that can be used to generate regular expressions. His work can be utilized within the regexr framework to maintain manageable commented and named sub-expressions.

```r
# Install from GitHub with pacman; install.packages() does not accept "user/repo" paths
pacman::p_load_gh("richierocks/rebus")
library(rebus)

out <- construct(
    year = YEAR %:)% "a year",
    or   = "|" %:)% "or",
    min  = ":" %c% MINUTE %:)% "colon followed by valid minutes"
)
```