GabeChurch/sparkedatools:

This function lets you efficiently create histograms for all of the columns in a Spark table. The results for plotting are computed on the backend, in Spark. It has a built-in distributed type interpolator, similar to the inferSchema option of Spark's CSV reader, so categorical columns are detected and plotted automatically. The package is especially useful for EDA on large tables, and it closely resembles the hist() function in core R.

Requirement 1: You must have an existing sparkContext (sc) initialized to use this method.

Requirement 2: You must pass the com.gabechurch.sparkeda jar to the Spark configuration before connecting. Example: `conf$'sparklyr.jars.default' = "/home/gchurch/R/sparkeda_2.11-2.07.jar"`
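The two requirements can be combined into one connection setup. The following is a minimal sketch, assuming sparklyr is installed and a local Spark installation is available. The jar path is the example path from the description, and the `local` master is an assumption made only for illustration; substitute your own values.

```r
library(sparklyr)

# Start from sparklyr's default configuration.
conf <- spark_config()

# Requirement 2: register the sparkeda jar before connecting.
# This is the example path from the description; point it at the
# location of the jar on your own machine.
conf$`sparklyr.jars.default` <- "/home/gchurch/R/sparkeda_2.11-2.07.jar"

# Requirement 1: create the Spark connection (sc).
# Assumption: a local master, used here for illustration only.
sc <- spark_connect(master = "local", config = conf)
```

The jar has to be set on `conf` before `spark_connect()` is called, because the configuration is read when the connection is created.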



Package details
Maintainer
License	Apache-2.0
Version	0.0.0.9000
Package repository	View on GitHub
Installation	Install the latest version of this package by entering the following in R: `install.packages("remotes")` and then `remotes::install_github("GabeChurch/sparkedatools")`