This vignette demonstrates functions in the utils range.

library(shrt)

catv

catv is a wrapper for cat that prints messages to the console. Unlike the original, this wrapper can be active or silent depending on the value of a verbose setting.

The usual way to control output depending on the value of a verbose variable is through an if statement, for example,

verbose = TRUE
if (verbose) {
   catv("yes, verbose\n")
}

The catv wrapper accepts an explicit argument to toggle verbose mode,

catv("no, not verbose\n", verbose=FALSE)
catv("yes, verbose\n", verbose=TRUE)

When the verbose argument is not specified, the verbosity status is inferred from parent environment or from the global options. Thus, it is possible to toggle verbosity independently of the function call to produce output.

verbose=FALSE
catv("this will not appear")
catv("this will not appear either")
verbose=TRUE
catv("Now verbose mode is on, so this will appear\n")

Future catv statements will also produce output on the console.

grepf

grepf is a search utility looking for patterns within files. It reads files within a directory and finds patterns within each of them, returning the position of the match within each file.

grepf("search", path=".", file.pattern="Rmd$")

grepv

grepv is a pattern matching function, a wrapper for grep, that returns the actual matching values. For example, let's start with a vector of strings

greek = c("alpha", "beta", "gamma", "delta")

If we need all items that contain the pattern a[lm], we can use grep,

grep("a[lm]", greek, value=T)

The grepv function syntax is

grepv("a[lm]", greek) 

h1, h2, h3

These three functions are wrappers for head with preset values for n. For example

h1(letters)
h2(letters)
h3(letters)

lengrep

lengrep is equivalent to grep followed by length. It returns the number of hits in a grep query.

items = c("foo1", "foo2", "bar")
lengrep("foo", items)

lenu

lenu is equivalent to unique followed by length. It returns the number of unique elements in a vector.

foo = c("f", "o", "o")
lenu(foo)

linechars

linechars reads the contents of text files and counts the number of character per line.

This is useful during code development to identify long lines in code. The function can process all text files, or a subset defined by a regex pattern. It can report all line numbers, or use a threshold.

linechars("directory", file.pattern="Rmd", min.chars=80)

lst

lst creates a new list object and automatically attaches names to its components.

Consider outputing several data objects from a function. The usual way to do that would be to create a named list, often by repeating the components names that are already defined within the function.

a = 1
b = 2
list(a=a, b=b)

The lst command simplifies the syntax.

lst(a, b)

The function also takes named arguments, e.g. lst(a, b, z=3).

mtrx

mtrx creates a matrix from input data. Compared to the usual matrix, it provides some new possibilities in naming rows and columns.

There are two ways to create a matrix with two rows and four named columns. The first is multi-step

mm = matrix(0, nrow=2, ncol=4)
colnames(mm) = greek
mm

A second way achieves the same in a single line

matrix(0, nrow=2, ncol=4, dimnames=list(rep(NULL, 2), greek))

However, this last expression is not very elegant. It requires a null component in the dimnames list and the lengths of the row and column names must match the integers specified by nrow and ncol.

The new mtrx function achieves in a slightly more compact form

mtrx(0, nrow=2, col.names=greek)

Note how the number of columns is automatically determined by the specified names (this is not possible using matrix). Besides this usage, mtrx also accepts names for rows.

namesF, namesNA, namesT, namesV

The four functions namesF, namesNA, namesT, and namesV are variations on a theme. They act on a vector and provide the names of elements that match particular values.

Consider a named vector with boolean values

foo.boolean = setNames((1:10)%%2==0, letters[1:10])
foo.boolean

If we want to know which of letters is associated with a true value, we can match the pattern and fetch the names.

names(foo.boolean)[foo.boolean==TRUE]

With the namesT function, we can instead write

namesT(foo.boolean)

Note how we avoid writing the vector twice! The function namesF behaves similarly but returns the names of elements that are marked as false.

Consider now that the vector contains numbers or strings.

foo.integer = setNames(c(2,4,NA,0,-1,3,NA,7,3,NA), LETTERS[1:10])
foo.integer

Suppose we want to know the names that match a certain value, say 3. Finding these is not hard, but it is important to handle NAs carefully.

names(foo.integer)[foo.integer==3]
names(foo.integer)[foo.integer %in% 3]

The function namesV provides the latter implementation with a simple syntax.

namesV(foo.integer, 3)

Finally, namesNA provides element names that hold NAs.

names(foo.integer)[is.na(foo.integer)]
namesNA(foo.integer)

newv

newv creates a new vector of specified length, possibly containing names.

For some data types like integers or characters, we can create a small vector with rep and then name the elements with names. For example

v = rep(0, 4)
names(v) = greek
v

The new function can create the same object in a single line.

newv("integer", names=greek)

The new function is also handy for creating lists. The traditional way to create a list of a known size is

mylist = vector("list", length=4)
setNames(mylist, greek)

The new function reproduces this in a single line.

newv("list", names=greek)

Note that the length of the output object is determined by the names.

nlist

nlist creates a named list using an existing vector or data frame.

The usual way to create a named list is a two-step process: first create the list using as.list and then set names for its elements.

mylist = setNames(as.list(letters[1:2]), letters[1:2])
mylist

Function nlist simplifies this to a single command

mylist = nlist(letters[1:2])
mylist

The function also allows other uses that start from a data frame or a matrix. That functionality works similarly as nvec described below.

nvec

nvec creates a named vector using two columns from a data frame.

The usual way to use matrix/data-frame to create a named vector is a two-step process: first extract values, then assign names.

my.data.frame = data.frame(id=c("A", "B"), val=c(2, 6))
myvec = my.data.frame[, "val"]
names(myvec) = my.data.frame[, "id"]
myvec

The assignment can be written on one line with setNames, but still requires identifying the data frame twice. The nvec function is shortcut that requires writing the name of the data frame only once.

myvec = nvec(my.data.frame, "val", "id")
myvec

The function also checks existence of columns and ensures that the names of the vector are unique.

p0

p0 is equivalent to paste0. It concatenates its input without adding any spaces or other characters.

paste0(greek[1], greek[2])
p0(greek[1], greek[2])

today

today returns a current date in a compact character string.

today()

x2df

x2df is a conversion function that coerces an input object into a data frame. It is shorthand notations for calls to data.frame. For example, let's start with a matrix of characters.

charmat = mtrx(letters[1:9], col.names=c("x", "y", "z"), nrow=3)
charmat

The conversion produces a related object

chardf = x2df(charmat)
chardf

The difference from the input is the class.

class(chardf)
apply(chardf, 2, class)

The default behavior is to convert characters into characters, but it is also possible to set stringsAsFactors to true during the function call.

In addition to such simple conversion, the function can also convert vectors into a data frame.

myvec = setNames(1:4, letters[1:4])
x2df(myvec)


tkonopka/shrt documentation built on March 5, 2020, 2:51 p.m.