from_rdd: Construct a TimeSeriesRDD from a Spark RDD of rows


View source: R/sdf_utils.R

Description

Construct a TimeSeriesRDD containing time series data from a Spark RDD of rows

Usage

from_rdd(
  rdd,
  schema,
  is_sorted = FALSE,
  time_unit = .sparklyr.flint.globals$kValidTimeUnits,
  time_column = .sparklyr.flint.globals$kDefaultTimeColumn
)

fromRDD(
  rdd,
  schema,
  is_sorted = FALSE,
  time_unit = .sparklyr.flint.globals$kValidTimeUnits,
  time_column = .sparklyr.flint.globals$kDefaultTimeColumn
)

Arguments

rdd

A Spark RDD[Row] object containing time series data

schema

A Spark StructType object containing the schema of the time series data

is_sorted

Whether the rows being imported are already sorted by time

time_unit

Time unit of the time column (must be one of the following values: "NANOSECONDS", "MICROSECONDS", "MILLISECONDS", "SECONDS", "MINUTES", "HOURS", "DAYS")

time_column

Name of the time column
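
The default for time_unit in the usage above is the whole vector of valid units rather than a single value. In R, that pattern usually goes together with match.arg(), which picks the first element when the argument is omitted and rejects values outside the list. The sketch below illustrates that idiom in base R only. It is an assumption about how the argument is resolved, not a copy of the package source: valid_time_units is a stand-in for .sparklyr.flint.globals$kValidTimeUnits built from the values listed above, and resolve_time_unit() is a hypothetical helper that does not exist in sparklyr.flint.

```r
# Stand-in for .sparklyr.flint.globals$kValidTimeUnits (assumed to hold the
# values listed under time_unit, in this order).
valid_time_units <- c(
  "NANOSECONDS", "MICROSECONDS", "MILLISECONDS", "SECONDS",
  "MINUTES", "HOURS", "DAYS"
)

# Hypothetical helper (not part of sparklyr.flint): resolve a time unit
# the way match.arg() does.
resolve_time_unit <- function(time_unit = valid_time_units) {
  # Returns the first choice when time_unit is left at its default,
  # and raises an error for a value that matches none of the choices.
  match.arg(time_unit, choices = valid_time_units)
}

default_unit <- resolve_time_unit()           # argument omitted
explicit_unit <- resolve_time_unit("SECONDS") # valid value passed through
invalid_unit <- tryCatch(
  resolve_time_unit("WEEKS"),                 # not in the list of choices
  error = function(e) NA_character_
)
```

If the package does follow this idiom, omitting time_unit would select the first listed unit, so passing the unit explicitly (as the example at the end of this page does) is the safer habit.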

Value

A TimeSeriesRDD usable by the Flint time series library

See Also

Other Spark dataframe utility functions: collect.ts_rdd(), from_sdf(), spark_connection.ts_rdd(), spark_dataframe.ts_rdd(), spark_jobj.ts_rdd(), to_sdf(), ts_rdd_builder()


Examples

library(sparklyr)
library(sparklyr.flint)

sc <- try_spark_connect(master = "local")

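if (!is.null(sc)) {
  # Copy a small data frame to Spark: `t` is the time column, `v` the values
  sdf <- copy_to(sc, tibble::tibble(t = seq(10), v = seq(10)))
  # Extract the underlying RDD[Row] and its StructType schema
  rdd <- spark_dataframe(sdf) %>% invoke("rdd")
  schema <- spark_dataframe(sdf) %>% invoke("schema")
  # The input data is already in ascending order of `t`, hence is_sorted = TRUE
  ts <- from_rdd(
    rdd, schema,
    is_sorted = TRUE, time_unit = "SECONDS", time_column = "t"
  )
} else {
  message("Unable to establish a Spark connection!")
}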

sparklyr.flint documentation built on Jan. 11, 2022, 9:06 a.m.