Large data files can be difficult to work with in R, where data
generally resides in memory. This package encourages a style of
programming where data is 'streamed' from disk into R through a series
of components that, typically, reduce the original data to a
manageable size. The package provides useful Producer and Consumer
components for operations such as data input, sampling, indexing, and
transformation.
The central paradigm in this package is a Stream composed of a
Producer and zero or more Consumer components. The Producer is
responsible for input of data, e.g., from the file system. A Consumer
accepts data from a Producer and performs transformations on it. The
Stream function is used to assemble a Producer and zero or more
Consumer components into a single stream.
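The producer/consumer idea can be illustrated without the package. The following is a minimal base R sketch, not the Streamer implementation: make_producer and make_stream are hypothetical helpers invented for this illustration, and their argument order (producer first) is a choice of the sketch rather than the package's convention.

```r
## Hypothetical helpers for illustration only; not part of the Streamer API.

## A producer is modelled as a closure that returns the next chunk of 'x'
## on each call, and a zero-length vector once 'x' is exhausted.
make_producer <- function(x, chunk_size) {
    pos <- 0L
    function() {
        if (pos >= length(x))
            return(x[0])
        idx <- seq(pos + 1L, min(pos + chunk_size, length(x)))
        pos <<- max(idx)
        x[idx]
    }
}

## A stream chains one producer with zero or more consumer functions;
## each call pulls one chunk and passes it through the consumers in order.
make_stream <- function(producer, ...) {
    consumers <- list(...)
    function() Reduce(function(chunk, f) f(chunk), consumers, producer())
}

## Two toy consumers: double each value, then reverse the chunk.
s <- make_stream(make_producer(1:10, 4L), function(x) x * 2, rev)
first <- s()   # chunk 1:4, doubled, then reversed
```

Each consumer only ever sees one chunk at a time, which is what keeps memory use bounded regardless of the size of the underlying data.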
The yield function can be applied to a stream to generate one ‘chunk’
of data. The definition of chunk depends on the stream and its
components. A common paradigm repeatedly invokes yield on a stream,
retrieving chunks of the stream for further processing.
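The repeated-yield pattern has a close analogue in base R, sketched here under the assumption that a zero-length result marks the end of the input. The sketch uses readLines on an open connection in place of yield; the temporary file and the chunk size of 10 lines are arbitrary choices for illustration.

```r
## Write a small temporary file so the sketch is self-contained.
fl <- tempfile()
writeLines(sprintf("record %d", 1:25), fl)

## An open connection remembers its position between reads, so each
## readLines() call returns the next chunk of at most 10 lines.
con <- file(fl, open = "r")
n_records <- 0L
## A zero-length result plays the role of an empty chunk and ends the loop.
while (length(chunk <- readLines(con, n = 10L)) > 0L) {
    n_records <- n_records + length(chunk)
}
close(con)
unlink(fl)
n_records
```

Only one chunk is held in memory at any time; the loop body is where per-chunk summaries would be accumulated.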
Author: Martin Morgan <mtmorgan@fhcrc.org>
Producer and Consumer are the main types of stream components. Use
Stream to connect components, and yield to iterate a stream.
## About this package
packageDescription("Streamer")
## Existing stream components
getClass("Producer") # Producer classes
getClass("Consumer") # Consumer classes
## An example
fl <- system.file("extdata", "s_1_sequence.txt", package="Streamer")
b <- RawInput(fl, 100L, reader=rawReaderFactory(1e4))
s <- Stream(RawToChar(), Rev(), b)
s
head(yield(s)) # First chunk
close(b)
b <- RawInput(fl, 5000L, verbose=TRUE)
d <- Downsample(sampledSize=50)
s <- Stream(RawToChar(), d, b)
s
s[[2]]
## Processing the first ten chunks of the file
i <- 1
while (i <= 10 && length(chunk <- yield(s)) != 0L) {
    cat("chunk", i, "length", length(chunk), "\n")
    i <- i + 1
}
close(b)