Description Usage Arguments Value Additional Info Author(s) References See Also Examples
RStorm
provides the main functionality of the RStorm package. The RStorm
function is used to run a stream defined using a Topology. The Topology defines the spout (the data-source for the stream) and the order of processing units (bolts). See example below and in the main package description for examples of the usage of RStorm
.
More details of the package, examples of streaming algorithms, and examples of the use of RStorm can be found at http://software.mauritskaptein.com/RStorm
1 |
topology |
a topology object specified using |
.verbose |
a logical indicator for verbose output. Default is TRUE. |
.debug |
a logical indicator for debug mode. If in debug mode all objects stored during the running of the stream will persist in memory and can be accessed using standard calls to |
.batches |
a number. Determines the size of batches processed by a stream. While RStorm simulates streaming processing, in actuality the rows of the data.frame defined in the spout are iterated through in batches to prevent memory overflow when the spout contains a large number of rows. This argument sets the size of these batches and with it limits the size of memory allocated to emitted data during the stream. Default batch size is 100. |
... |
additional arguments to pass to (e.g.) bolts or plotting functions. |
An object of type RStorm
which is a list containing the following elements:
data |
a list containing all the hashmaps stored during the stream using |
track |
a list containing all the data.frames stored using the |
finalize |
the result of the finalize function. If a finalize function is added to the Topology this field will contain whatever was returned by the finalize function and can be accessed directly using |
The is.RStorm
function checks whether an object is of Type RStorm and is used internally.
Maurits Kaptein
http://software.mauritskaptein.com/RStorm
See Also: Topology
, Bolt
, Tuple
, Emit
, TrackRow
, SetHash
, GetHash
, GetTrack
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 | # Run a simple RStorm. First, create some data:
x <- seq(1, 1000)
# Second, we start defining the topology
topology <- Topology(data.frame(x=x))
# Third, we define a bolt.
# This bolt computes the sum of a number stored
# in a local Hashmap and the Tuple (x) that is received
computeSum <- function(x, boltID, ...){
sum <- GetHash(paste("sum", boltID))
if(is.data.frame(sum)){
x <- sum + (x[1])
}
SetHash(paste("sum", boltID), x)
Emit(Tuple(x=x), ...)
}
# Add the bolts to the topology.
# Here the first bolt computes the sum of the sequence
# and the second bolt computes the sum of summed elements
topology <- AddBolt(topology, Bolt(computeSum, listen=0, boltID=1))
topology <- AddBolt(topology, Bolt(computeSum, listen=1, boltID=2))
result <- RStorm(topology)
print(GetHash("sum 1", result))
print(GetHash("sum 2", result))
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.