irecord: Record and replay iterators

Description

The irecord function records the values issued by a specified iterator to a file or connection object. The ireplay function returns an iterator that replays those values. This is useful for iterating concurrently over several large matrices or data frames that cannot all be held in memory at the same time. Each large object can be recorded to its own file, one at a time, and the files can then be replayed concurrently using minimal memory.
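As a minimal sketch of the round trip described above, the following records a short integer sequence to a temporary file and reads it back. It assumes the itertools package is installed; as.list here is the method that the iterators package provides for iterator objects, and the variable names are illustrative only.

```r
library(itertools)  # also attaches iterators, which provides iter()

f <- tempfile()

# Record the three values issued by the iterator to the file
irecord(f, iter(1:3))

# Replay the recorded values and collect them into a list
replayed <- as.list(ireplay(f))
print(replayed)

unlink(f)
```

The same pattern applies to large objects: only the chunk currently being replayed needs to be in memory.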

Usage

irecord(con, iterable)
ireplay(con)

Arguments

con

A file path or open connection.

iterable

The iterable whose values are recorded to the file or connection.
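Because con may be an open connection rather than a file path, the caller can manage opening and closing the file. The sketch below is written under two assumptions that this page does not state: that the connection should be opened in binary mode ("wb" for recording, "rb" for replaying), and that a connection passed in by the caller is left open for the caller to close.

```r
library(itertools)  # also attaches iterators (iter, nextElem)

f <- tempfile()

# Assumption: binary mode is used for both writing and reading
out <- file(f, "wb")
irecord(out, iter(c("a", "b", "c")))
close(out)  # assumed: the caller opened the connection, so the caller closes it

inp <- file(f, "rb")
it <- ireplay(inp)
first <- nextElem(it)  # pull a single value
rest <- as.list(it)    # collect the remaining values
close(inp)

unlink(f)
```

Passing a file path instead, as in the examples on this page, avoids managing the connection by hand.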

Examples

library(itertools)  # provides irecord and ireplay; attaches iterators for iter()
suppressMessages(library(foreach))

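# Record a 7 x 10 matrix to a temporary file in chunks of up to 3 rows
m1 <- matrix(rnorm(70), 7, 10)
f1 <- tempfile()
irecord(f1, iter(m1, by='row', chunksize=3))

# Record a 10 x 5 matrix to a second file in chunks of up to 3 columns
m2 <- matrix(1:50, 10, 5)
f2 <- tempfile()
irecord(f2, iter(m2, by='column', chunksize=3))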

# Perform a simple out-of-core matrix multiply
p <- foreach(col=ireplay(f2), .combine='cbind') %:%
       foreach(row=ireplay(f1), .combine='rbind') %do% {
         row %*% col
       }

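# Drop any dimnames so that p can be compared with the in-memory product
dimnames(p) <- NULL
print(p)
all.equal(p, m1 %*% m2)

# Remove the temporary files
unlink(c(f1, f2))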

Example output

Loading required package: iterators
           [,1]       [,2]       [,3]       [,4]       [,5]
[1,]  14.389379  28.963087   43.53680   58.11050   72.68421
[2,]  22.949290  58.101356   93.25342  128.40549  163.55756
[3,] -26.797911 -78.250382 -129.70285 -181.15532 -232.60779
[4,]   2.683244  -6.604943  -15.89313  -25.18132  -34.46950
[5,] -24.904868 -78.676279 -132.44769 -186.21910 -239.99051
[6,]  19.658555  51.927391   84.19623  116.46506  148.73390
[7,]  31.679011  96.090273  160.50153  224.91280  289.32406
[1] TRUE

itertools documentation built on May 2, 2019, 2:26 p.m.