dataset_interleave: Maps map_func across this dataset, and interleaves the...

View source: R/dataset_methods.R

dataset_interleave R Documentation

Maps map_func across this dataset, and interleaves the results

Description

Maps map_func across this dataset, and interleaves the results

Usage

dataset_interleave(
  dataset,
  map_func,
  cycle_length = NULL,
  block_length = 1,
  num_parallel_calls = NULL,
  deterministic = NULL,
  name = NULL
)

Arguments

dataset

A dataset

map_func

A function mapping a nested structure of tensors (having shapes and types defined by output_shapes() and output_types()) to a dataset.

cycle_length

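(Optional.) The number of input elements from this dataset that will be processed concurrently. If NULL (the default), the tf.data runtime chooses a value based on available CPU. If num_parallel_calls is set to tf.data.AUTOTUNE, cycle_length identifies the maximum degree of parallelism.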

block_length

The number of consecutive elements to produce from each input element before cycling to another input element.

num_parallel_calls

(Optional.) If specified, the implementation creates a threadpool, which is used to fetch inputs from cycle elements asynchronously and in parallel. The default behavior is to fetch inputs from cycle elements synchronously with no parallelism. If the value tf.data.AUTOTUNE is used, then the number of parallel calls is set dynamically based on available CPU.

deterministic

(Optional.) When num_parallel_calls is specified, if this boolean is specified (TRUE or FALSE), it controls the order in which the transformation produces elements. If set to FALSE, the transformation is allowed to yield elements out of order to trade determinism for performance. If not specified, the tf.data.Options.deterministic option (TRUE by default) controls the behavior.

name

(Optional.) A name for the tf.data operation.

Details

The cycle_length and block_length arguments control the order in which elements are produced. cycle_length controls the number of input elements that are processed concurrently. In general, this transformation applies map_func to cycle_length input elements and opens iterators on the returned dataset objects. It then cycles through those iterators, producing block_length consecutive elements from each one, and consumes the next input element each time it reaches the end of an iterator.
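The ordering described above can be modelled in a few lines of plain Python. This is only a sequential sketch for reasoning about the output order, not the tf.data implementation: the `interleave` helper is hypothetical, it uses ordinary Python iterables in place of datasets, and it ignores num_parallel_calls and deterministic entirely. It assumes cycle_length is at least 1.

```python
from itertools import islice


def interleave(inputs, map_func, cycle_length, block_length=1):
    """Hypothetical sequential model of the interleave ordering.

    `inputs` is any iterable and `map_func` returns an iterable for each
    input element; both stand in for datasets in this sketch.
    """
    source = iter(inputs)
    # One slot per concurrently open iterator; None marks an empty slot.
    slots = [None] * cycle_length
    source_exhausted = False

    while True:
        for i in range(cycle_length):
            # Refill an empty slot with the next input element, if any remain.
            if slots[i] is None and not source_exhausted:
                try:
                    slots[i] = iter(map_func(next(source)))
                except StopIteration:
                    source_exhausted = True
            if slots[i] is None:
                continue
            # Produce up to block_length consecutive elements from this slot.
            block = list(islice(slots[i], block_length))
            yield from block
            # A short block means the iterator ended; free the slot so the
            # next visit consumes a new input element.
            if len(block) < block_length:
                slots[i] = None
        if source_exhausted and all(s is None for s in slots):
            return


# Same parameters as the example in the Examples section:
# five input elements, each mapped to six copies of itself.
result = list(interleave([1, 2, 3, 4, 5], lambda x: [x] * 6,
                         cycle_length=2, block_length=4))
print(result)
```

With these parameters the model yields the same block order as the listing in the Examples section: two iterators are open at a time, each contributes up to four elements per turn, and a slot is refilled only after its iterator runs out.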

See Also

Other dataset methods: dataset_batch(), dataset_cache(), dataset_collect(), dataset_concatenate(), dataset_decode_delim(), dataset_filter(), dataset_map(), dataset_map_and_batch(), dataset_padded_batch(), dataset_prefetch(), dataset_prefetch_to_device(), dataset_rebatch(), dataset_reduce(), dataset_repeat(), dataset_shuffle(), dataset_shuffle_and_repeat(), dataset_skip(), dataset_take(), dataset_take_while(), dataset_window()

Examples

## Not run: 

library(tfdatasets)

dataset <- tensor_slices_dataset(c(1, 2, 3, 4, 5)) %>%
  dataset_interleave(
    map_func = function(x) {
      # each input element becomes a dataset that repeats it 6 times
      tensors_dataset(x) %>%
        dataset_repeat(6)
    },
    cycle_length = 2,
    block_length = 4
  )

# resulting dataset (newlines indicate "block" boundaries):
c(1, 1, 1, 1,
  2, 2, 2, 2,
  1, 1,
  2, 2,
  3, 3, 3, 3,
  4, 4, 4, 4,
  3, 3,
  4, 4,
  5, 5, 5, 5,
  5, 5)


## End(Not run)


rstudio/tfdatasets documentation built on April 13, 2025, 6:50 p.m.