read.stream: Load a streaming SparkDataFrame

Description

Returns the dataset in a data source as a streaming SparkDataFrame.

Usage

1

Arguments

source

The name of the external data source.

schema

The data schema, defined as a structType or a DDL-formatted string. This is required for file-based streaming data sources.

...

Additional named options specific to the external data source, for instance path for a file-based streaming data source. The timeZone option indicates the timezone used to parse timestamps in the JSON/CSV data sources or partition values; if it isn't set, the default value, the session local timezone, is used.
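To illustrate the DDL-formatted string form of the schema argument, here is a minimal base R sketch that assembles such a string from a named character vector. The helper ddl_schema is hypothetical and not part of this package; it only builds the string and does not validate the type names.

```r
# Hypothetical helper (not part of the package): build a DDL-formatted
# schema string from a named character vector of column types.
ddl_schema <- function(fields) {
  # Every element must be a type string with a non-empty column name
  stopifnot(is.character(fields),
            !is.null(names(fields)),
            all(nzchar(names(fields))))
  # Pair each column name with its type, then join the pairs with ", "
  paste(names(fields), fields, collapse = ", ")
}

stringSchema <- ddl_schema(c(name = "STRING", info = "MAP<STRING, DOUBLE>"))
```

The resulting string matches the stringSchema value used in the Examples section and could be passed as the schema argument in the same way.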

Details

The data source is specified by source and a set of options (...). If source is not specified, the default data source configured by "spark.sql.sources.default" is used.
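As a sketch of the default-source behavior described above, the hypothetical wrapper below omits source entirely, so the format would be taken from "spark.sql.sources.default". It assumes the read_stream spelling used in the Examples section and an active Spark session at call time; open_default_stream, dir, and ddl are illustrative names, not part of the package.

```r
# Hypothetical wrapper (not part of the package): source is omitted on purpose,
# so the data source configured by "spark.sql.sources.default" would be used.
# Assumes read_stream() is available and a Spark session is active when called.
open_default_stream <- function(dir, ddl) {
  # path and schema are passed by name; no source argument is given
  read_stream(path = dir, schema = ddl)
}
```

Defining the wrapper does not contact Spark; the stream is only opened when it is called. In a stock Spark installation the default source is parquet, so the files under dir would need to be in that format unless the setting has been changed.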

Value

SparkDataFrame

Note

read.stream since 2.2.0

experimental

See Also

write.stream

Examples

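## Not run: 
# Start a Spark session
spark_session()

# Read lines from a socket source and write them out as text files
df <- read_stream("socket", host = "localhost", port = 9999)
q <- write_stream(df, "text", path = "/home/user/out", checkpointLocation = "/home/user/cp")

# File-based sources require a schema; jsonDir and schema are assumed to be defined beforehand
df <- read_stream("json", path = jsonDir, schema = schema, maxFilesPerTrigger = 1)

# The schema can also be given as a DDL-formatted string
stringSchema <- "name STRING, info MAP<STRING, DOUBLE>"
df1 <- read_stream("json", path = jsonDir, schema = stringSchema, maxFilesPerTrigger = 1)

## End(Not run)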

danzafar/tidyspark documentation built on Sept. 30, 2020, 12:19 p.m.