The 'memo' package

memo
======

knitr::opts_chunk$set(cache=FALSE)

The memo package implements a simple in-memory cache for the results of a function. If you have an expensive function that is being called repeatedly with the same inputs, memo can help.

Fibonacci example

fib <- function(n) if (n <= 1) 1 else fib(n-1) + fib(n-2)
sapply(0:9, fib)

This recursive implementation corresponds closely to the way the sequence is defined in math texts, but it has a performance problem. As you ask for values further along the sequence, the number of recursive calls grows very quickly and the computation becomes inordinately slow. To demonstrate the issue, we can count every time fib is called:

count <- 0
fib <- function(n) {
  count <<- count+1
  if (n <= 1) 1 else fib(n-1) + fib(n-2)
}

counted_fib <- function(n) {
  count <<- 0
  c(n=n, result=fib(n), calls=count)
}

t(sapply(0:16, counted_fib))

The number of calls increases unreasonably. This is because, for instance, fib(6) calls both fib(5) and fib(4), but fib(5) also calls fib(4). The second call to fib(4) is wasted work. And this pattern goes on: the two calls to fib(4) lead to four calls to fib(2). Every time you increment n by one, the number of calls grows by a factor of about 1.6, so it increases exponentially with n. (Clearly, there are more efficient algorithms for computing the Fibonacci sequence, but this is a toy example, where fib stands in for some expensive function that is being called repeatedly.)
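The growth in call counts can be checked without running the slow recursion itself. The following is a minimal sketch in base R, assuming the same definition of fib as above (fib(0) and fib(1) both return 1); `naive_calls` is a hypothetical helper name introduced here and is not part of the memo package.

```r
# Number of calls made by the naive recursive fib(n): one call for fib(n)
# itself, plus every call made by fib(n-1) and fib(n-2).
# Assumes max_n >= 1. `naive_calls` is a hypothetical helper, not part of memo.
naive_calls <- function(max_n) {
  calls <- numeric(max_n + 1)        # calls[i] holds the count for n = i - 1
  calls[1:2] <- 1                    # fib(0) and fib(1) return immediately
  for (i in seq_len(max_n + 1)[-(1:2)]) {
    calls[i] <- 1 + calls[i - 1] + calls[i - 2]
  }
  calls
}

calls <- naive_calls(30)
growth <- calls[-1] / calls[-length(calls)]   # ratio of consecutive counts
calls[17]                                     # call count for fib(16)
round(tail(growth, 3), 3)                     # growth factor for large n
```

The ratio of consecutive counts settles near 1.618 (the golden ratio), somewhat less than a full doubling per step but still exponential growth.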

One way to cut down on wasted effort would be to check whether fib(n) has already been computed for a given n. If it has, fib can just return that value instead of starting over. This is called "memoizing." The memo package can automatically create a memoized version of a given function, just by wrapping the function definition in memo():

library(memo)

count <- 0
fib <- memo(function(n) {
  count <<- count+1
  if (n <= 1) 1 else fib(n-1) + fib(n-2)
})

counted_fib(16)

Now, computing fib(16) takes only 17 calls, one for each value of n from 0 to 16. And if we call it again, it remembers the previous answer and doesn't make any new calls:

counted_fib(16)

Each successive value then only takes two calls:

t(sapply(17:30, counted_fib))
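To make the idea of memoizing concrete, here is a minimal hand-rolled sketch in base R. It is an illustration only and makes no claim about how memo works internally: `simple_memo`, `slow_fib`, and `evals` are hypothetical names, and the sketch assumes a single scalar argument and no limit on cache size.

```r
# A hand-rolled memoizer, for illustration only. `simple_memo` is a
# hypothetical helper, not the memo package's implementation: it handles
# one scalar argument and never discards stored results.
simple_memo <- function(f) {
  cache <- new.env(hash = TRUE, parent = emptyenv())  # private result store
  function(x) {
    key <- as.character(x)                 # environments need string keys
    if (exists(key, envir = cache, inherits = FALSE)) {
      return(get(key, envir = cache))      # seen before: reuse stored value
    }
    value <- f(x)                          # first time: do the real work
    assign(key, value, envir = cache)
    value
  }
}

evals <- 0
slow_fib <- simple_memo(function(n) {
  evals <<- evals + 1                      # count real evaluations
  if (n <= 1) 1 else slow_fib(n - 1) + slow_fib(n - 2)
})

slow_fib(16)   # fills the cache for n = 0..16
evals          # one evaluation per distinct n
```

Because the recursive calls go through the wrapped `slow_fib`, every intermediate value is stored the first time it is computed and looked up afterwards.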

The tradeoff for this speedup is the memory used to store previous results. By default, memo remembers the 5000 most recently used results; to adjust that limit, pass a different cache argument:

fib <- memo(cache=lru_cache(5000), function () {...})
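What "most recently used" means for a bounded cache can be sketched in a few lines of base R. This is a toy model under stated assumptions, not the implementation behind lru_cache: `toy_lru` and its `get`, `set`, and `contents` functions are hypothetical names, keys are assumed to be strings, and stored values are assumed to be non-NULL.

```r
# Toy least-recently-used cache, for illustration only. `toy_lru` is a
# hypothetical helper; memo's lru_cache may be implemented differently.
# Assumes string keys and non-NULL values.
toy_lru <- function(size) {
  keys <- character(0)     # ordered from oldest to most recently used
  values <- list()
  list(
    get = function(key) {
      if (!(key %in% keys)) return(NULL)      # miss
      keys <<- c(setdiff(keys, key), key)     # mark as most recently used
      values[[key]]
    },
    set = function(key, value) {
      keys <<- c(setdiff(keys, key), key)
      values[[key]] <<- value
      if (length(keys) > size) {              # over the limit: evict oldest
        values[[keys[1]]] <<- NULL
        keys <<- keys[-1]
      }
      invisible(value)
    },
    contents = function() keys
  )
}

cache <- toy_lru(2)
cache$set("a", 1)
cache$set("b", 2)
cache$get("a")        # touching "a" makes "b" the oldest entry
cache$set("c", 3)     # cache is full, so "b" is evicted
cache$contents()
```

A smaller limit saves memory at the cost of recomputing results that have been evicted, which is the tradeoff the size argument controls.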

The Fibonacci sequence is only a toy example, but memoization has a variety of practical uses.




memo documentation built on May 29, 2024, 4:18 a.m.