text_line_dataset: A dataset comprising lines from one or more text files.

View source: R/text_line_dataset.R

text_line_dataset    R Documentation

Description

Creates a dataset whose elements are the lines of one or more text files.

Usage

text_line_dataset(
  filenames,
  compression_type = NULL,
  ...,
  buffer_size = NULL,
  num_parallel_reads = NULL,
  name = NULL,
  record_spec = NULL,
  parallel_records = NULL
)

Arguments

filenames

String(s) specifying one or more filenames.

compression_type

One of NULL (no compression), "ZLIB", or "GZIP".

...

Unused; must be empty.

buffer_size

(Optional.) A tf.int64 scalar denoting the number of bytes to buffer. A value of 0 results in the default buffering values chosen based on the compression type.

num_parallel_reads

(Optional.) A tf.int64 scalar representing the number of files to read in parallel. If greater than one, the records of files read in parallel are output in an interleaved order. If your input pipeline is I/O bottlenecked, consider setting this parameter to a value greater than one to parallelize the I/O. If NULL, files will be read sequentially.

name

(Optional.) A name for the tf.data operation.

record_spec

(Optional.) Specification used to decode delimited text lines into records (see delim_record_spec()).

parallel_records

(Optional.) An integer representing the number of records to decode in parallel. If not specified, records will be processed sequentially. This is only applicable if record_spec is provided.
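
As a rough illustration of how record_spec and parallel_records fit together, the sketch below writes a small comma-delimited file, infers a spec from it with delim_record_spec(), and asks for records to be decoded in parallel. The file contents and the value 4L are made up for the example, and a working TensorFlow installation is assumed.

```r
library(tfdatasets)

# Hypothetical sample data: a small comma-delimited file with a header row.
path <- tempfile(fileext = ".csv")
writeLines(c("x,y", "1,2.5", "3,4.5", "5,6.5"), path)

# Infer column names and types from the file itself.
spec <- delim_record_spec(path, delim = ",")

# Decode each line into a record. parallel_records only has an effect
# because record_spec is supplied; 4L is an arbitrary illustrative value.
dataset <- text_line_dataset(path, record_spec = spec, parallel_records = 4L)
```

Without record_spec, the same call would yield the raw text lines and parallel_records would not apply.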

Value

A dataset whose elements are the lines of the input files (or decoded records when record_spec is provided).
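
Examples

The following is a minimal sketch of reading a compressed file, not an official example. It assumes TensorFlow 2.x with eager execution, and the file name and contents are invented for illustration. The reticulate iterator helpers are one possible way to pull an element out of the returned dataset.

```r
library(tfdatasets)

# Hypothetical input: a gzip-compressed text file in a temporary location.
path <- tempfile(fileext = ".txt.gz")
con <- gzfile(path, "w")
writeLines(c("first line", "second line", "third line"), con)
close(con)

# compression_type has to match how the file was written.
dataset <- text_line_dataset(path, compression_type = "GZIP")

# Fetch the first element (a string tensor holding one line of text).
iter <- reticulate::as_iterator(dataset)
first_line <- reticulate::iter_next(iter)
print(first_line)
```

For several input files, passing a character vector as filenames together with num_parallel_reads interleaves their lines instead of reading the files one after another.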


rstudio/tfdatasets documentation built on April 13, 2025, 6:50 p.m.