spark_read_tfrecord: Read a TFRecord File


Description

Read a TFRecord file as a Spark DataFrame.

Usage

spark_read_tfrecord(sc, name = NULL, path = name, schema = NULL,
  record_type = c("Example", "SequenceExample"), overwrite = TRUE)

Arguments

sc

A Spark connection.

name

The name to assign to the newly generated table, or the path to the file. If a path is supplied as the 'name' argument, a separate table name cannot be specified.
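
The interaction between 'name' and 'path' follows from the default path = name in the Usage signature. A minimal base R sketch of that defaulting is given here; resolve_args is a hypothetical stand-in that mirrors only these two arguments and is not part of sparktf:

```r
# Hypothetical stand-in that mirrors only the `name`/`path` defaults
# from the Usage signature; it does not read any data.
resolve_args <- function(name = NULL, path = name) {
  list(name = name, path = path)
}

# A single unnamed value fills `name`, and `path` defaults to it.
one_arg <- resolve_args("/tmp/iris")

# Supplying both keeps the table name and the file location separate.
two_args <- resolve_args(name = "iris_tf", path = "/tmp/iris")
```

In the first call the same string serves as both name and path, which is why no separate table name can be given in that form.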

path

The path to the file. Needs to be accessible from the cluster. Supports the "hdfs://", "s3a://" and "file://" protocols.
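
As an illustration of the listed protocols, the base R sketch below checks whether a path string starts with one of them. has_supported_scheme is a hypothetical helper, not part of sparktf, and the sample paths are made up. It only tests the prefix; it does not confirm that the location exists or is reachable from the cluster, and a path without a scheme (as in the Examples section) is simply reported as FALSE here.

```r
# Hypothetical helper (not part of sparktf): TRUE when `path` starts with
# one of the protocols listed above ("hdfs://", "s3a://", "file://").
has_supported_scheme <- function(path) {
  grepl("^(hdfs|s3a|file)://", path)
}

# Made-up sample paths for illustration only.
paths <- c(
  "hdfs://namenode/data/iris",
  "s3a://bucket/iris",
  "file:///tmp/iris",
  "ftp://host/iris"
)
supported <- has_supported_scheme(paths)
```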

schema

(Currently unsupported.) Schema of TensorFlow records. If not provided, the schema is inferred from TensorFlow records.

record_type

Input format of the TensorFlow records, either "Example" or "SequenceExample". Defaults to "Example".
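
A vector-valued default such as c("Example", "SequenceExample") is commonly resolved in R with match.arg(), which picks the first element when the argument is omitted. The sketch below assumes spark_read_tfrecord follows that idiom; pick_record_type is a hypothetical stand-in, not the package's code.

```r
# Hypothetical stand-in, assuming the usual match.arg() idiom is used
# to resolve `record_type`.
pick_record_type <- function(record_type = c("Example", "SequenceExample")) {
  match.arg(record_type)
}

# Omitting the argument selects the first choice.
default_type <- pick_record_type()

# An explicit value must be one of the listed choices.
sequence_type <- pick_record_type("SequenceExample")
```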

overwrite

Boolean; overwrite the table with the given name if it already exists?

Examples

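## Not run: 
library(sparklyr)
library(sparktf)
library(dplyr)  # for copy_to() and %>%

# Assumes a local Spark installation is available
sc <- spark_connect(master = "local")

iris_tbl <- copy_to(sc, iris)
data_path <- file.path(tempdir(), "iris")

# Encode Species as a numeric label column
df1 <- iris_tbl %>%
  ft_string_indexer_model(
    "Species", "label",
    labels = c("setosa", "versicolor", "virginica")
  )

# Write the DataFrame out as TFRecords
df1 %>%
  spark_write_tfrecord(
    path = data_path,
    write_locality = "local"
  )

# Read the TFRecords back as a Spark DataFrame
spark_read_tfrecord(sc, data_path)

## End(Not run)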

sparktf documentation built on May 2, 2019, 10:24 a.m.