This work is funded by the National Science Foundation grant NSF-IOS 1546858.
You probably don't need to read this. Certainly you should read the
introduction vignette first.
This vignette is under construction
This vignette consists of four parts. First I will describe the monad hidden in the R runtime. Second, I will describe how rmonad can serve as a replacement. Third, I will contrast the monadic pipelines of rmonad with the compositional pipelines of magrittr. Finally, I will discuss rmonad from the
I will introduce the concept of a monad incrementally through the first three sections. However, monads in the programming context are notoriously difficult to understand. If you are not familiar with them, you may try studying a few online tutorials first. That said, rmonad can be used without understanding
What does sqrt do? You may answer, "return the square root of an input". But this is not quite right. The function sqrt maps an input to a set of possible outcomes, including NaN with a warning.
Every R function maps a pure value to a computational context with possible undefined behavior and side effects. We can describe the action of a function abstractly as
a -> m b
where a is the pure input value, b is the pure output value, and m represents the output context. The sqrt function can be described as
numeric* -> m numeric
where numeric* represents an input value that should be a number but, since R is dynamic, may be anything, and m numeric represents a context that holds either a number or some effect.
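The different outcomes of sqrt can be observed directly in base R. The sketch below is only an illustration: the helper outcome is a hypothetical name, not part of rmonad, and it simply reports which kind of result a call produced.

```r
# Hypothetical helper: report what a call to sqrt actually produced.
outcome <- function(x) {
  tryCatch(
    {
      value <- sqrt(x)
      paste("value:", value)
    },
    # Reached when sqrt signals a warning (for example on negative input).
    warning = function(w) paste("warning:", conditionMessage(w)),
    # Reached when sqrt fails outright (for example on character input).
    error = function(e) paste("error:", conditionMessage(e))
  )
}

outcome(4)      # an ordinary numeric result
outcome(-1)     # sqrt yields NaN and signals a warning
outcome("wtf")  # sqrt raises an error
```

The same function therefore lands in one of several contexts depending on its input, which is what the m in a -> m b stands for.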
When we build a pipeline, we chain many functions together. Say, for example, we have the expression:
sqrt(sum(x))
Borrowing a bit of Haskell syntax:
sqrt :: numeric* -> m numeric
sum :: numeric* -> m numeric
numeric* represents something that should be numeric. Again, since R is dynamically typed, we have no guarantees that the input actually is numeric.
Each function takes a pure value and maps to a value wrapped in a context.
sum returns an m numeric*, but sqrt wants a numeric*. We need a function to mediate this. A function with the form:
bind :: (m numeric*) -> (numeric* -> m numeric*) -> (m numeric)
         ^               ^                           ^
         sum(x)          sqrt                        sqrt(sum(x))
We can express this more generally as
bind :: m1 a -> (a -> m2 b) -> m3 b
Here a and b are data types, and m1, m2 and m3 are contexts. Every bind operation takes 1) a value in a context (m1 a) and 2) a function that maps that value to m2 b. The prior state m1, as well as the intermediate state m2, are in the scope of the bind function. This allows contextual information to propagate from the
A monad is a pattern consisting of a context m, two functions, and three laws. The functions are bind and return (not to be confused with the return used to terminate a function). bind we have already seen. return takes a pure value and lifts it into a context. return has the form
a -> m a
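To make the two functions concrete, here is a minimal sketch in base R. It assumes a toy context (a list with a value and an ok flag), which is not how rmonad represents its monad internally. The names unit, bind, lift, safe_sum and safe_sqrt are all hypothetical.

```r
# unit plays the role of return :: a -> m a
unit <- function(x) list(value = x, ok = TRUE)

# bind :: m a -> (a -> m b) -> m b
bind <- function(m, f) {
  if (!m$ok) return(m)  # a failed context skips the remaining steps
  f(m$value)
}

# Turn an ordinary function into one that returns a context (a -> m b).
lift <- function(f) {
  function(x) {
    tryCatch(
      unit(f(x)),
      error = function(e) list(value = NULL, ok = FALSE, error = conditionMessage(e))
    )
  }
}

safe_sum  <- lift(sum)
safe_sqrt <- lift(sqrt)

good <- bind(bind(unit(c(9, 16)), safe_sum), safe_sqrt)
bad  <- bind(bind(unit("wtf"), safe_sum), safe_sqrt)

good$value  # 5
bad$ok      # FALSE
```

In the failing pipeline the error from the first step is carried forward in the context, and the second step is never run.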
Before we cover the monad laws, and before we learn exactly what the monad is in the rmonad context, we will walk step by step through one example.
rmonad uses the infix operator %>>%, which corresponds to bind.
The goal of rmonad is to ditch the existing impure R monad and replace it with a clean explicit monad.
x %>>% sum %>>% sqrt
The initial %>>% operator acts as both a return and a bind function. It first evaluates the
m b is dependent on
m a, not just on
bind function can pass
information from one step in the pipeline
m a to the next
m2 b is equal to
m3 b only for the trivial case where the context is
R users normally rely on the R session to perform these binds automatically. But what exactly is m? In an R session, the R runtime handles errors. If one function raises an error, the error is propagated to functions that use its
m is an object that catches all undefined behavior.
To understand the monadic nature of rmonad it is useful to compare it to magrittr. In magrittr, the expression x %>% foo %>% bar is the same as bar(foo(x)). From a monadic point of view, x is first implicitly raised into an 'Identity' monad (which is quite formless here), m1 x. Then
bind :: m1 x -> (x -> m2 y) -> m3 y
bind function above executes
m2 y. It then
reduces this to
m3 y in the presence of
magrittr passes no
m3. The pipeline is context independent.
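The Identity view of a compositional pipeline can be sketched in plain R without loading magrittr. The names bind_identity, foo and bar below are hypothetical and chosen only for illustration.

```r
# In the Identity context, wrapping adds nothing, so bind is plain application.
bind_identity <- function(x, f) f(x)

foo <- function(x) x + 1
bar <- function(x) x * 2

piped  <- bind_identity(bind_identity(3, foo), bar)
nested <- bar(foo(3))

identical(piped, nested)  # TRUE
```

Because nothing but the value is handed from one step to the next, the chained form and the nested call are interchangeable.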
In rmonad, the pipeline x %>>% foo %>>% bar will pass a record of past events at each bind operation, thus incrementally building a graph of the
rmonad is a nod to
Xmonad (no relation to the Restricted Monad
Xmonad wraps the X window system,
rmonad wraps evaluation
of R expressions.
Only a few R expressions are pure. If a function is given an invalid input (e.g. sqrt("wtf")), it will die with a message printed to stderr. rmonad wraps all R calls in a monad, intercepting all messages, so that the result of a computation is returned as a pure object.
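The idea of intercepting everything and returning a plain object can be sketched with base R condition handling. This is not rmonad's implementation. The function capture and the fields of the list it returns are assumptions made for this example.

```r
# Hypothetical wrapper: evaluate an expression and return its value together
# with any error, warnings and messages, instead of letting them escape.
capture <- function(expr) {
  warnings <- character(0)
  messages <- character(0)
  error <- NULL
  value <- withCallingHandlers(
    # expr is evaluated lazily here, inside the handlers.
    tryCatch(expr, error = function(e) {
      error <<- conditionMessage(e)
      NULL
    }),
    warning = function(w) {
      warnings <<- c(warnings, conditionMessage(w))
      invokeRestart("muffleWarning")  # record the warning, then carry on
    },
    message = function(m) {
      messages <<- c(messages, conditionMessage(m))
      invokeRestart("muffleMessage")  # record the message, then carry on
    }
  )
  list(value = value, ok = is.null(error), error = error,
       warnings = warnings, messages = messages)
}

r1 <- capture(sqrt(-1))     # value is NaN, one warning recorded
r2 <- capture(sqrt("wtf"))  # ok is FALSE, error text recorded
```

Both calls return an ordinary list, so the caller can inspect what happened without the session printing warnings or halting on the error.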
The 'R monad' is one monad to rule them all. There is no monad stack and no support for monad transformers. In addition to error handling, the monad stores the history of every previous operation. It also performs basic benchmarking, recording the time required for each operation and the size of the returned object. All this weight might seem like a performance killer, but R programmers are used to function calls being slow; those who care about performance would not call a function in a tight loop anyway.
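A rough sketch of how a history with timings and object sizes could be accumulated is shown below, again in base R. The function step and the record fields (label, seconds, bytes) are hypothetical and do not reflect how rmonad stores this information.

```r
# Hypothetical: run one step and append a record of it to the history so far.
step <- function(m, f, label) {
  # system.time evaluates the assignment in this function's frame.
  elapsed <- system.time(value <- f(m$value))[["elapsed"]]
  record <- list(
    label   = label,
    seconds = elapsed,
    bytes   = as.numeric(object.size(value))
  )
  list(value = value, history = c(m$history, list(record)))
}

start  <- list(value = 1:100, history = list())
result <- step(step(start, sum, "sum"), sqrt, "sqrt")

labels <- vapply(result$history, function(r) r$label, character(1))
labels  # "sum" "sqrt"
```

Each step costs an extra timing call and a size measurement, which is the kind of overhead the paragraph above refers to.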
The return function is a little complex in rmonad. It is a special case of
as_monad = a -> m b
a can be one of three types:
an unevaluated R expression - as_monad evaluates the expression, capturing any exceptions, warnings or messages. as_monad is used inside the bind function in this capacity.
a pure R value - acts as
The %>>% operator is like the Haskell >>= operator, but with some of the sloppiness expected of a dynamic language:
%>>% :: m a -> (a -> m b) -> m b
      | a -> (a -> m b) -> m b
      | (a -> r b) -> (a -> m b) -> m b
%>>% differs from >>= in that %>>% automatically loads the left-hand-side value into a monad.
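The automatic loading of the left-hand side can be illustrated with a toy operator. This is a sketch under stated assumptions: %then%, ctx and is_ctx are hypothetical names, and the real %>>% does considerably more than this.

```r
# Hypothetical toy context, marked with an S3 class.
ctx    <- function(value) structure(list(value = value), class = "ctx")
is_ctx <- function(x) inherits(x, "ctx")

# Hypothetical operator: accepts either a bare value or a wrapped one on the left.
`%then%` <- function(lhs, f) {
  m <- if (is_ctx(lhs)) lhs else ctx(lhs)  # load a bare value into the context
  out <- f(m$value)
  if (is_ctx(out)) out else ctx(out)       # wrap plain results as well
}

a <- 16 %then% sqrt       # bare value on the left
b <- ctx(16) %then% sqrt  # already wrapped on the left

identical(a, b)  # TRUE
```

Accepting both forms is what lets a pipeline start from an ordinary R value without an explicit call to a return function.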