eval_sparklyr <- FALSE
if (Sys.getenv("GLOBAL_EVAL") != "") eval_sparklyr <- Sys.getenv("GLOBAL_EVAL")
sparklyr
library(dplyr)
library(sparklyr)
Learn to open a new Spark session
Load the sparklyr library:
library(sparklyr)
Use spark_connect() to create a new local Spark session:
sc <- spark_connect(master = "local")
Click on the Spark button to view the current Spark session's UI
Click on the Log button to see the message history
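The Spark and Log buttons live in the RStudio Connections pane. Outside RStudio, the same information can be reached from code. The following is a minimal sketch, assuming a local Spark installation is already available (for example, one installed with spark_install()); the choice of 10 log lines is arbitrary.

```r
library(sparklyr)

# Assumes Spark is installed locally; reuses the session opened above if it exists.
sc <- spark_connect(master = "local")

# Confirm the session is live before going further.
spark_connection_is_open(sc)

# Open the Spark web UI in the default browser.
spark_web(sc)

# Print the last 10 lines of the Spark log.
spark_log(sc, n = 10)
```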
Practice uploading data to Spark
Load the dplyr library:
library(dplyr)
Copy the mtcars dataset into the session:
spark_mtcars <- copy_to(sc, mtcars, "my_mtcars")
In the Connections pane, expand the my_mtcars table
Go to the Spark UI and note the new jobs
In the UI, click the Storage button and note the new table
Click on the In-memory table my_mtcars link
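The upload can also be checked from the R console rather than the UI. This is a small sketch, assuming the same local session and table name used in the steps above; overwrite = TRUE is added here only so the block can be re-run without a name clash.

```r
library(sparklyr)
library(dplyr)

# Assumes Spark is installed locally.
sc <- spark_connect(master = "local")
spark_mtcars <- copy_to(sc, mtcars, "my_mtcars", overwrite = TRUE)

# List the tables registered in the session; "my_mtcars" should be among them.
src_tbls(sc)

# Get a fresh reference to the Spark table by name.
tbl(sc, "my_mtcars")

# Count the rows inside Spark; mtcars has 32 rows.
sdf_nrow(spark_mtcars)
```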
dplyr
See how Spark handles dplyr commands
Run the following code snippet:
spark_mtcars %>%
  group_by(am) %>%
  summarise(mpg_mean = mean(mpg, na.rm = TRUE))
Go to the Spark UI and click the SQL button
Click on the top item inside the Completed Queries table
At the bottom of the diagram, expand Details
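The query shown in the SQL tab comes from dplyr translating the pipeline into Spark SQL. A minimal sketch of how to see that translation from R, assuming the same local session and table as above (the name avg_mpg is introduced here only for illustration):

```r
library(sparklyr)
library(dplyr)

# Assumes Spark is installed locally.
sc <- spark_connect(master = "local")
spark_mtcars <- copy_to(sc, mtcars, "my_mtcars", overwrite = TRUE)

# Build the query lazily; nothing is sent to Spark yet.
avg_mpg <- spark_mtcars %>%
  group_by(am) %>%
  summarise(mpg_mean = mean(mpg, na.rm = TRUE))

# Print the SQL that dplyr generated for this pipeline.
show_query(avg_mpg)

# Run the query in Spark and bring the result back as a local tibble.
collect(avg_mpg)
```

Comparing the output of show_query() with the Details text in the Spark UI is a quick way to confirm that the aggregation ran in Spark rather than in R.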