knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

We all know R is not the place to write for-loops, but some of us really, really like them. This package is inspired by Python's list comprehensions which make it possible to write for-loops in only a few keystrokes. Note that this package makes for-loops faster to write, not faster to run.

You can install the most recent version of yuck from GitHub:

devtools::install_github("tpq/yuck")
library(yuck)
library(yuck)

Single loop comprehensions

A list comprehension allows a user to step through a for-loop and save each result in one line of code, similar to what lapply does. The yuck package introduces the := operator to trigger a list comprehension by computing the expression on the right and saving it to the value on the left.

a := for(i in 1:5) i^2
a

List comprehensions work with with arbitrary sequences.

index <- seq(1, 9, by = 2)
a := for(i in 1:length(index)) index[i]^2
a := for(i in index) i^2
a

List comprehensions work with booleans.

a := for(i in 1:5) i %in% 1:4
a

List comprehensions work with strings.

a := for(i in 1:5) paste("this is how we comprehend", i)
a

Nested loop comprehensions

The yuck package allows for nested loop comprehensions.

a := for(i in 1:5) for(j in 1:7) (i - 1)^2 + (j - 1)^2
a

We can save the results of a nested loop comprehension as a matrix.

matrix(a, length(1:7), length(1:5))

Nested loop comprehensions work with booleans.

a := for(i in c(0, 1, 0, 1)) for(j in c(0, 1)) i & j
a <- matrix(a, 2, 4)
colnames(a) <- as.character(c(0, 1, 0, 1))
rownames(a) <- as.character(c(0, 1))
a

Nested loop comprehensions work with strings.

a := for(i in c(1, 2, 3)) for(j in c("ok now", "whooo")) paste("yes", i, j)
a

Performance

Loops are slow in R, and so too are list comprehensions. Behind the scenes, := pre-allocates RAM to store the results of a for-loop. This makes list comprehensions less slow, but certainly not fast. By default, lapply is actually faster.

microbenchmark::microbenchmark(
  a := for(i in 1:100000) i^2,
  a <- lapply(1:100000, function(i) i^2),
  times = 5L
)

The yuck package tries to return the result as a numeric, logical, or character vector. If it cannot, it returns the results as a list.

a := for(i in 1:5) rep("A", i)
a

Known issues

List comprehensions use regular expressions to parse the for-loop and save each result. The user should consider the terms for, in, ~~~, and out9000 as protected, meaning that a list comprehension may fail if these terms are present extraneously in an expression.

# will fail
a := for(i in c("for ", "fail")) i
a := for(i in c("in )", "fail")) i

Although you can have nested loop comprehensions, you cannot have a loop comprehension within a loop comprehension (or a for-loop within a function call).

# will fail
a := for(i in 1:3) b := for(j in 1:3) i * j

You can, however, have an sapply within a loop comprehension.

a := for(i in 1:3) i * sapply(1:i, sum)
a

Use cases

The examples above are arbitrary and the problems presented have better (i.e., vectorized) solutions. Rather, the advantage of yuck comes from its use in combination with others tools. For example, we could use yuck to iterate across multiple machine learning parameters, models, and predictions. We can also combine yuck with magrittr to comprehend pipes.

library(e1071)
library(magrittr)
data(iris)
groups <- split(sample(1:nrow(iris)), 1:2)
train <- iris[groups[[1]],]
test <- iris[groups[[2]],]
confs := for(kernel in c("linear", "radial", "polynomial"))
  svm(Species ~ ., train, kernel = kernel) %>%
  predict(test) %>% table(test$Species)
confs

I like using yuck to measure CPU time and RAM overhead.

library(peakRAM)
times := for(i in c(1e5, 1e6, 1e7)) data.frame(i,
  peakRAM(
    function() 1:i,
    1:i,
    1:i + 1:i,
    1:i * 2
  ))[,-3]
times


tpq/yuck documentation built on May 27, 2019, 1:09 p.m.