Read in SAS data in parallel into Spark
In spark.sas7bdat: Read in 'SAS' Data ('.sas7bdat' Files) into 'Apache Spark'

This R package allows R users to easily import large SAS datasets into Spark tables in parallel.

The package uses the spark-sas7bdat Spark package to read SAS datasets into Spark. That Spark package uses the Parso library to import the data in parallel on the Spark cluster, and the process is launched from R through sparklyr.

More information about the spark-sas7bdat Spark package and sparklyr can be found at:

https://spark-packages.org/package/saurfang/spark-sas7bdat and https://github.com/saurfang/spark-sas7bdat
https://github.com/sparklyr/sparklyr

Example

The following example reads a file called iris.sas7bdat in parallel into a Spark table called sas_example. Try this with larger data on your own cluster, and see the help of the sparklyr package for how to connect to your Spark cluster.

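library(sparklyr)
library(spark.sas7bdat)
# Path to the example SAS file shipped with the package
mysasfile <- system.file("extdata", "iris.sas7bdat", package = "spark.sas7bdat")

# Connect to a local Spark instance and read the file into the Spark table sas_example
sc <- spark_connect(master = "local")
x <- spark_read_sas(sc, path = mysasfile, table = "sas_example")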
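
For a real cluster, only the connection step changes. The following is a minimal sketch, not a recommended setup: the helper name read_sas_on_cluster, the master value "yarn", and the executor memory setting are assumptions for illustration, and the right values depend on how your cluster is configured.

```r
library(sparklyr)
library(spark.sas7bdat)

# Hypothetical helper: connect to a cluster and read one SAS file into a Spark table.
# The defaults for `master` and `executor_memory` are placeholders, not recommendations.
read_sas_on_cluster <- function(path, table, master = "yarn", executor_memory = "4G") {
  conf <- spark_config()
  conf$spark.executor.memory <- executor_memory
  sc <- spark_connect(master = master, config = conf)
  tbl <- spark_read_sas(sc, path = path, table = table)
  # Return the connection as well, so the caller can disconnect later
  list(sc = sc, tbl = tbl)
}

# Example call (the path is a placeholder for a file your cluster can reach):
# res <- read_sas_on_cluster("hdfs:///path/to/file.sas7bdat", table = "big_sas_table")
# spark_disconnect(res$sc)
```

Returning the connection alongside the table pointer keeps the Spark session under the caller's control.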

The resulting pointer to a Spark table can be used in dplyr statements. sparklyr translates these statements into Spark SQL, so they are executed in parallel on the Spark cluster rather than in the local R session.

library(dplyr)
library(magrittr)
x %>% group_by(Species) %>%
  summarise(count = n(), length = mean(Sepal_Length), width = mean(Sepal_Width))
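
Because the table lives in Spark, a dplyr pipeline on it stays lazy until the result is requested. The sketch below repeats the setup from the example so that it runs on its own, then uses show_query() to print the SQL that would be sent to Spark and collect() to bring the aggregated result into R. The variable names summary_tbl and local_df are chosen here for illustration.

```r
library(sparklyr)
library(spark.sas7bdat)
library(dplyr)

# Same setup as in the example: local Spark and the bundled iris.sas7bdat file
sc <- spark_connect(master = "local")
mysasfile <- system.file("extdata", "iris.sas7bdat", package = "spark.sas7bdat")
x <- spark_read_sas(sc, path = mysasfile, table = "sas_example")

# Lazy query: nothing is computed until the result is requested
summary_tbl <- x %>%
  group_by(Species) %>%
  summarise(count = n(), length = mean(Sepal_Length, na.rm = TRUE))

# Print the SQL generated for Spark
show_query(summary_tbl)

# Run the query in Spark and copy the small aggregated result into a local data frame
local_df <- collect(summary_tbl)

spark_disconnect(sc)
```

Collecting only after aggregation keeps the data transferred to R small, which matters once the SAS file is too large to fit in local memory.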

Support in big data and Spark analysis

Need support in big data and Spark analysis? Contact BNOSAC: http://www.bnosac.be

spark.sas7bdat documentation built on April 19, 2021, 9:07 a.m.