When designing syntax, one has to settle on a naming scheme: should function names be short for typing efficiency, or long for readability? Ruby favors the former, but its method names can be hard to remember (e.g., is it len or length? Is it swapcase or swap_case?) because there is no consistent naming scheme. In its favor, some Ruby methods have synonyms that help people coming from other programming languages learn Ruby faster (e.g., reduce and inject do the same thing). The stringr library, on the other hand, has a consistent naming scheme for its functions but no synonyms, so you are forced to learn the stringr way. Thirdly, and perhaps tangentially, R has no concatenation operator of the kind found in Ruby and BASIC, only functions. This is odd, as many situations require concatenation, and relying on paste() and paste0() can make code less readable. To address these considerations, I am introducing a new package: stringops, a work-in-progress library of tools for processing strings in R.
This package brings (1) a consistent naming scheme for functions, (2) synonyms for those functions, and (3) a concatenation operator. The first item benefits users of all skill levels, as it makes functions easier to remember and works well with RStudio's autocompletion. The second item is useful when one tires of typing string_cull(), for example, and wishes to use a shorthand to simplify the code (in this case, cull()). The third item makes code more readable by avoiding the function-call syntax of paste() and paste0(). Ultimately, these items will hopefully make processing strings in R more fun for the user!
This package is currently available only on GitHub; there are no plans to submit it to CRAN at this time. As such, please use the devtools library to install stringops.
# install.packages('devtools')
devtools::install_github('robertschnitman/stringops')
library(stringops)
The functions in this chapter allow you to modify the case of the string.
string_lower()
synonyms: s_lower(), lower()
The string_lower() function converts all characters in a string vector to lowercase.
string_lower('UPPER')
string_lower(rownames(mtcars))
string_upper()
synonyms: s_upper(), upper()
The string_upper() function converts all characters in a string vector to uppercase.
string_upper('lower')
string_upper(rownames(mtcars))
string_swapcase()
synonyms: s_swapcase(), swapcase()
The string_swapcase() function switches the case of each character in a string vector.
string_swapcase('strangeCASING')
string_swapcase(rownames(mtcars))
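For readers curious how case swapping can be done without the package, here is a minimal base R sketch. The helper name swapcase_sketch() is hypothetical and this is not the package's source; it only handles the 26 ASCII letters.

```r
# Hypothetical helper, not part of stringops: swap case with chartr().
swapcase_sketch <- function(x) {
  lower <- paste(letters, collapse = "")
  upper <- paste(LETTERS, collapse = "")
  # Map every lowercase letter to uppercase and vice versa in one pass.
  chartr(paste0(lower, upper), paste0(upper, lower), x)
}

swapcase_sketch("strangeCASING")
# [1] "STRANGEcasing"
```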
string_titlecase()
synonyms: s_titlecase(), titlecase()
The string_titlecase() function converts the first character of each word to uppercase.
strings <- c('and then there were none', 'silent hill: revelations', 'the lightning thief')
string_titlecase(strings)
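A comparable result can be approximated in base R with a Perl-style regular expression. The sketch below is an assumption about one way to do it, not the package's implementation; titlecase_sketch() is a hypothetical name, and a "word" here means anything that follows the start of the string or whitespace.

```r
# Hypothetical helper, not part of stringops.
titlecase_sketch <- function(x) {
  # \\U uppercases the text that follows it in the replacement (perl = TRUE only).
  gsub("(^|\\s)(\\w)", "\\1\\U\\2", x, perl = TRUE)
}

titlecase_sketch(c("and then there were none", "silent hill: revelations"))
# [1] "And Then There Were None" "Silent Hill: Revelations"
```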
The functions in this chapter cut off (remove) characters from strings.
string_chomp()
synonyms: s_chomp(), chomp()
Remove all whitespace characters from a string.
string_chomp(rownames(mtcars))
string_chop()
synonyms: s_chop(), chop()
Remove the last character from a string.
string_chop(rownames(mtcars))
string_cut()
synonym: s_cut()
Remove characters from either the left, right, or both ends of a string.
string_cut("cut this please", which = 'left')
string_cut("cut this please", which = 'right')
string_cut("cut this please", which = 'both')
string_trim()
synonyms: s_trim(), trim()
Remove whitespace characters from a specific direction. Acts the same as trimws(), but the source code is more readable.
s <- " s "
string_trim(s, 'left')
string_trim(s, 'right')
string_trim(s, 'both')
trimws() vs. string_trim()
trimws
string_trim
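Since string_trim() is described as behaving like trimws(), the base R function gives a feel for the three directions. The example string is made up for illustration.

```r
s <- "  padded  "

trimws(s, which = "left")   # "padded  "
trimws(s, which = "right")  # "  padded"
trimws(s, which = "both")   # "padded"
```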
The functions in this chapter compare strings, outputting a Boolean vector.
%like%
The %like% operator tests whether a pattern match is found in a string.
rownames(mtcars) %like% "^M"
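An infix operator of this kind can be sketched in base R as a thin wrapper around grepl(). The operator name %like_sketch% is hypothetical, chosen to avoid masking the package's own operator, and the wrapper is an assumption rather than the package's source.

```r
# Hypothetical operator, not part of stringops: left side is the data,
# right side is the regular expression.
`%like_sketch%` <- function(x, pattern) grepl(pattern, x)

cars <- c("Mazda RX4", "Datsun 710", "Merc 230")
cars %like_sketch% "^M"
# [1]  TRUE FALSE  TRUE
```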
string_isblank()
synonyms: s_isblank(), isblank(), is.blank()
The function string_isblank() tests whether a vector consists of only blank characters.
string_isblank(" ")
string_isblank(" string ")
string_isupper()/string_islower()
synonyms: s_isupper(), isupper(), is.upper(), s_islower(), islower(), is.lower()
The functions string_isupper() and string_islower() test whether each element in a string vector is all uppercase or lowercase, respectively. The required input for both functions is a character vector. The output is a Boolean vector. These functions are useful for pattern matching acronyms, uppercase, and lowercase elements.
chr <- c('TEST', 'test', 'tEsT')
is.upper(chr)
is.lower(chr)
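One simple way to express these tests in base R is to compare each element with its case-converted form. The helper names below are hypothetical and the approach is an assumption, not the package's source; note that an element with no letters at all (such as "123") passes both checks in this sketch.

```r
# Hypothetical helpers, not part of stringops.
is_upper_sketch <- function(x) x == toupper(x)
is_lower_sketch <- function(x) x == tolower(x)

chr <- c("TEST", "test", "tEsT")
is_upper_sketch(chr)  # TRUE FALSE FALSE
is_lower_sketch(chr)  # FALSE TRUE FALSE
```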
The functions in this chapter allow you to concatenate strings together.
%&% (concatenation operator)
The %&% operator acts similarly to BASIC's &, concatenating two elements together.
'a' %&% 'b'
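To show how a concatenation operator can be built on top of paste0(), here is a minimal base R sketch. The operator name %concat% is hypothetical (so it does not mask the package's %&%), and the definition is an assumption rather than the package's source.

```r
# Hypothetical operator, not part of stringops.
`%concat%` <- function(lhs, rhs) paste0(lhs, rhs)

"a" %concat% "b"
# [1] "ab"

# paste0() is vectorized, so a single prefix recycles over a vector.
"Car: " %concat% c("Mazda RX4", "Datsun 710")
# [1] "Car: Mazda RX4"  "Car: Datsun 710"
```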
"Car: " %&% rownames(mtcars)
string_prefix()
synonyms: s_prefix(), prefix()
string_prefix() prepends a string to the beginning of another string.
string_prefix(rownames(mtcars), "A ")
string_suffix()
synonyms: s_suffix(), suffix()
string_suffix() appends a string to the end of another string.
string_suffix(rownames(mtcars), " model")
The functions in this chapter focus on culling (extracting) strings based on patterns and positions.
string_cull()
synonyms: s_cull(), cull(), string_extract(), s_extract(), extract()
The function string_cull() culls (or extracts) a pattern from a string vector; if no pattern is found, NA is returned.
# extract beginning "M" from each element.
string_cull(rownames(mtcars), '^M')
# extract elements in full that begin with "M".
string_cull(rownames(mtcars), "^M.*")
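The "extract the match, otherwise NA" behavior described above can be approximated in base R with regexpr() and regmatches(). The helper cull_sketch() is hypothetical and only extracts the first match per element; it is a sketch under those assumptions, not the package's source.

```r
# Hypothetical helper, not part of stringops.
cull_sketch <- function(x, pattern) {
  m <- regexpr(pattern, x)                # position of first match, -1 if none
  out <- rep(NA_character_, length(x))    # default to NA for non-matches
  out[m != -1] <- regmatches(x, m)        # fill in the matched text
  out
}

cars <- c("Mazda RX4", "Fiat 128")
cull_sketch(cars, "^M")     # "M" NA
cull_sketch(cars, "^M.*")   # "Mazda RX4" NA
```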
string_left/right/mid()
synonyms: s_left/right/mid(), left/right/mid()
The string_left(), string_right(), and string_mid() functions act the same as Excel's LEFT(), RIGHT(), and MID() functions, respectively.
string_left(rownames(mtcars), 3)
string_right(rownames(mtcars), 3)
string_mid(rownames(mtcars), 2, 4)
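Excel-style LEFT, RIGHT, and MID can be sketched in base R with substr(). The helper names are hypothetical, and the argument order for the mid helper (start position, then number of characters) follows Excel's MID(); whether string_mid() uses the same convention is an assumption.

```r
# Hypothetical helpers, not part of stringops.
left_sketch  <- function(x, n) substr(x, 1, n)
right_sketch <- function(x, n) substr(x, nchar(x) - n + 1, nchar(x))
mid_sketch   <- function(x, start, n) substr(x, start, start + n - 1)

left_sketch("Mazda RX4", 3)    # "Maz"
right_sketch("Mazda RX4", 3)   # "RX4"
mid_sketch("Mazda RX4", 2, 4)  # "azda"
```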
string_countm()
synonyms: s_countm(), countm()
The function string_countm() counts the number of matches in a given string. Optional inputs are passed to gregexpr().
# Count the number of times "a" appears in each rowname of mtcars.
string_countm(rownames(mtcars), 'a')
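Because the function is described as passing extra arguments to gregexpr(), a base R counterpart can be sketched as follows. The helper count_matches_sketch() is hypothetical and is only an assumption about how such a count could be computed.

```r
# Hypothetical helper, not part of stringops.
count_matches_sketch <- function(x, pattern, ...) {
  # gregexpr() finds every match; regmatches() extracts them per element;
  # lengths() counts how many were found (0 when there is no match).
  lengths(regmatches(x, gregexpr(pattern, x, ...)))
}

count_matches_sketch(c("Mazda RX4", "Fiat 128", "Merc 230"), "a")
# [1] 2 1 0
```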
string_find()
synonyms: s_find(), find()
The function string_find() acts similarly to grep(string, x, value = TRUE): it subsets a vector to found pattern matches, returning the full element.
# subset for rownames starting with "M".
string_find(rownames(mtcars), '^M')
string_findi()
synonyms: s_findi(), findi()
The function string_findi() acts similarly to grep(string, x): it subsets a vector to found pattern matches, returning the indices.
# Produce the indices of the rownames starting with "M".
string_findi(rownames(mtcars), '^M')
string_findl()
synonyms: s_findl(), findl()
The function string_findl() acts similarly to grepl(string, x): it returns TRUE if a pattern match has been found, FALSE otherwise.
# Detect whether a rowname starts with "M".
string_findl(rownames(mtcars), '^M')
string_findm()
synonyms: s_findm(), findm()
The string_findm() function finds pattern matches and returns NA if none are found.
# Return matches with their full value; NA otherwise.
string_findm(rownames(mtcars), '^M')
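The four find variants differ only in what they return. The base R calls below mirror the grep() and grepl() comparisons made in the text; the last line is an assumed equivalent of the NA-filling behavior, not the package's source. The vector x is made up for illustration.

```r
x <- c("Mazda RX4", "Merc 230", "Fiat 128")

found_values  <- grep("^M", x, value = TRUE)  # matching elements in full
found_indices <- grep("^M", x)                # positions of the matches
found_logical <- grepl("^M", x)               # TRUE/FALSE per element
# Assumed equivalent of "full value if matched, NA otherwise".
found_or_na   <- ifelse(grepl("^M", x), x, NA_character_)

found_values   # "Mazda RX4" "Merc 230"
found_indices  # 1 2
found_logical  # TRUE TRUE FALSE
found_or_na    # "Mazda RX4" "Merc 230" NA
```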
string_remove()
synonyms: s_remove(), remove()
string_remove() removes a given pattern by replacing it with an empty string.
# Remove beginning "M".
string_remove(rownames(mtcars), "^M")
string_replace()
synonyms: s_replace(), find_replace(), fr(), search_replace(), sr()
string_replace() is a simplified gsub(): it finds a pattern and replaces it with a given string.
# search and replace beginning "M" with "Z".
string_replace(rownames(mtcars), "^M", "Z")
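Since string_replace() is described as a simplified gsub(), the underlying base R calls look roughly like this. Treating removal as replacement with an empty string is an assumption about string_remove(), and the vector x is made up for illustration.

```r
x <- c("Mazda RX4", "Merc 230", "Fiat 128")

# Replace a leading "M" with "Z".
replaced <- gsub("^M", "Z", x)
# Remove a leading "M" by replacing it with an empty string.
removed  <- gsub("^M", "", x)

replaced  # "Zazda RX4" "Zerc 230" "Fiat 128"
removed   # "azda RX4" "erc 230" "Fiat 128"
```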
The function in this chapter joins (collapses) a vector of elements into a single string.
string_join()
synonyms: s_join(), join()
Join (collapse) a vector of elements into a single string.
# Join vector, separating elements by a comma and space.
string_join(rownames(mtcars), ", ")
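In base R, collapsing a vector into one string is done with the collapse argument of paste(). This is shown only as a point of comparison; whether string_join() wraps paste() is an assumption.

```r
x <- c("Mazda RX4", "Datsun 710", "Valiant")

joined <- paste(x, collapse = ", ")
joined
# [1] "Mazda RX4, Datsun 710, Valiant"
```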
The functions in this chapter do not fit into the previous chapters: they handle a variety of miscellaneous tasks.
string_dup()
synonyms: s_dup(), dup()
string_dup() acts the same as strrep(): it repeats a string n times.
string_dup(rownames(mtcars), 3)
string_insert()
synonyms: s_insert(), insert()
string_insert() inserts a string at a specified position.
# Insert "Z" at position 3.
string_insert("abcd", "Z", 3)
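Insertion at a position can be sketched in base R by splitting the string with substr() and gluing the pieces back together. The helper insert_sketch() is hypothetical, and it assumes the inserted text should begin at the given position (pushing the existing characters to the right); string_insert() may use a different convention.

```r
# Hypothetical helper, not part of stringops.
# Assumption: the inserted text starts at index `position`.
insert_sketch <- function(x, insertion, position) {
  before <- substr(x, 1, position - 1)
  after  <- substr(x, position, nchar(x))
  paste0(before, insertion, after)
}

insert_sketch("abcd", "Z", 3)
# [1] "abZcd"
```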
string_len()
synonyms: s_len(), len()
string_len() acts the same as nchar(): it counts the number of characters per vector element.
string_len(rownames(mtcars))
string_reverse()
synonyms: s_reverse(), reverse()
string_reverse() reverses the characters in a string.
string_reverse(rownames(mtcars))
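Base R has no single function for reversing the characters of a string, so a common approach is to split, reverse, and paste. The helper reverse_sketch() is hypothetical and is one assumed way to get the same effect, not the package's source.

```r
# Hypothetical helper, not part of stringops.
reverse_sketch <- function(x) {
  vapply(
    strsplit(x, ""),                                # split into single characters
    function(chars) paste(rev(chars), collapse = ""),
    character(1)
  )
}

reverse_sketch(c("Mazda RX4", "abc"))
# [1] "4XR adzaM" "cba"
```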
string_split()
synonym: s_split()
string_split() acts the same as strsplit(): it splits a string by a specified character.
# Split the strings by a space.
string_split(rownames(mtcars), " ")
Hopefully, this package's API, synonyms, and concatenation operator make processing strings in R more fun for you! You can find these functions and their source code at the GitHub repository, stringops.
AutoIt. Operators. https://www.autoitscript.com/autoit3/docs/intro/lang_operators.htm
Excel. LEFT(). https://support.microsoft.com/en-us/office/left-leftb-functions-9203d2d2-7960-479b-84c6-1ea52b99640c
Gagolewski, Marek. stringi. http://www.gagolewski.com/software/stringi/
Ruby. String Class. https://ruby-doc.org/core-2.7.1/String.html
Tidyverse. stringr. https://stringr.tidyverse.org/
stringops GitHub Page. https://github.com/robertschnitman/stringops
Robert Schnitman. Personal Webpage. https://robertschnitman.netlify.app/