View source: R/spark_query_data.R
Query a Spark DataFrame and optionally return the results to Spark memory or to R's memory.
spark_query_data(sc, qry, name, type = c("lazy", "compute", "collect"))
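sc | A Spark connection, such as the one returned by sparklyr::spark_connect().
qry | A SQL query.
name |
type | One of "lazy" (the default), "compute" or "collect".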
The behaviour of this function depends on the type given by the user. There are three scenarios:
The default, "lazy", is evaluated only when the results are needed, for example when the user collects the data (see sparklyr::collect()).
"compute" ensures that the query is executed and the resulting data are stored within Spark's memory.
"collect" executes the query and returns the resulting data to R's memory.
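The three scenarios can be illustrated with standard dplyr verbs on a Spark connection. The sketch below is not the package's implementation: query_sketch is a hypothetical helper, and mapping "lazy", "compute" and "collect" onto dplyr::tbl(), dplyr::compute() and dplyr::collect() is an assumption made purely for illustration.

```r
library(sparklyr)
library(dplyr)

# Hypothetical helper (an assumption, not spark_query_data() itself):
# it mirrors the three documented behaviours with dplyr verbs.
query_sketch <- function(sc, qry, name = NULL,
                         type = c("lazy", "compute", "collect")) {
  type <- match.arg(type)

  # A lazy reference to the query; the result is not materialised here.
  lazy_tbl <- dplyr::tbl(sc, dplyr::sql(qry))

  switch(
    type,
    # Return the lazy reference untouched.
    lazy = lazy_tbl,
    # Run the query and keep the result in Spark, optionally under `name`.
    compute = if (is.null(name)) {
      dplyr::compute(lazy_tbl)
    } else {
      dplyr::compute(lazy_tbl, name = name)
    },
    # Run the query and bring the result into R.
    collect = dplyr::collect(lazy_tbl)
  )
}

# Usage, assuming a local Spark installation is available:
# sc <- sparklyr::spark_connect(master = "local")
# sparklyr::copy_to(dest = sc, df = mtcars)
# query_sketch(sc, "select mpg from mtcars", type = "collect")
```

Under these assumptions, the "lazy" and "compute" branches return a reference to data held in Spark, and only "collect" moves data into the R session.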
One of two:
A tbl_spark reference to a Spark DataFrame if type is "compute" or "lazy".
A tibble if type is "collect".
## Not run:
sc <- sparklyr::spark_connect(master = "local")
mtcars_spark <- sparklyr::copy_to(dest = sc, df = mtcars)

# By default, queries are executed lazily
spark_query_data(sc = sc, qry = "select mpg from mtcars")

# But we can cache the results
cache <- spark_query_data(
  sc = sc,
  qry = "select mpg from mtcars",
  name = "mpg_mtcars",
  type = "compute"
)

# And gather the results
spark_collect_data(x = "mpg_mtcars", sc = sc)

# Or we can collect the data instantly
spark_query_data(
  sc = sc,
  qry = "select disp from mtcars",
  type = "collect"
)

## End(Not run)