ReservoirSampler: Create a streamer for producing a random sample from a...


Description

ReservoirSampler creates a streaming algorithm that can be used to obtain a random sample from a population that is too large to fit in memory. The sample can be made reproducible by calling 'set.seed(...)' before initialising the streamer.

Implementation is based on doi:10.1145/198429.198435.
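
To illustrate the general idea of reservoir sampling, here is a minimal base R sketch. It is not the package's implementation and does not follow the cited paper; it assumes the simplest scheme (often called Algorithm R), and the name reservoir_sample_sketch is hypothetical. An ordinary vector stands in for the stream, and only k items are held at any time.

```r
# Hypothetical helper, not part of the package: simple reservoir sampling
# ("Algorithm R") over a vector that stands in for a stream.
reservoir_sample_sketch <- function(stream, k) {
  reservoir <- stream[0]  # empty vector of the same type as the stream
  n_seen <- 0
  for (x in stream) {
    n_seen <- n_seen + 1
    if (n_seen <= k) {
      # The first k items fill the reservoir directly
      reservoir[n_seen] <- x
    } else {
      # A later item is kept with probability k / n_seen,
      # replacing a uniformly chosen slot
      j <- sample.int(n_seen, 1)
      if (j <= k) reservoir[j] <- x
    }
  }
  reservoir
}

# Setting the same seed before each run gives the same sample
set.seed(42)
first <- reservoir_sample_sketch(1:100, k = 10)
set.seed(42)
second <- reservoir_sample_sketch(1:100, k = 10)

length(first)             # sample size equals k
identical(first, second)  # TRUE: same seed, same sample
```

Each item is seen exactly once and memory use stays at k values, which is what makes the approach suitable for populations too large to hold in memory.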

Format

An R6Class generator object

Methods

Public methods


Method new()

Creates a new ReservoirSampler streamer object.

Usage
ReservoirSampler$new(k)
Arguments
k

the desired sample size

Returns

The new ReservoirSampler (invisibly)


Method update()

Update the ReservoirSampler streamer object.

Usage
ReservoirSampler$update(x)
Arguments
x

values to be added to the stream

Returns

The updated ReservoirSampler (invisibly)


Method clone()

The objects of this class are cloneable with this method.

Usage
ReservoirSampler$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples

sampler <- ReservoirSampler$new(k = 10)
for (i in 1:100) {
    sampler$update(i)
}
length(sampler$value)  # random sample from 1:100 of size 10
#> [1] 10

THargreaves/online-oceanarium documentation built on Jan. 13, 2022, 10:39 p.m.