This function lets you efficiently create histograms for all of the columns in a Spark table. The results for plotting are generated on the backend, in Spark. It has a distributed type interpolator built in, similar to the inferSchema option built into Spark's CSV reader. Categorical columns are automatically detected and plotted. This package is especially useful for EDA on large tables; it closely resembles the hist() function in core R.

Requirement 1: You must have an existing sparkContext (sc) initialized to use this method.
Requirement 2: You must pass the com.gabechurch.sparkeda jar to the Spark configuration before connecting, as follows.
Example: conf$'sparklyr.jars.default' = "/home/gchurch/R/sparkeda_2.11-2.07.jar"
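The two requirements can be met with a setup along these lines. This is a minimal sketch, not the package's documented procedure: it assumes the connection is made through sparklyr (spark_config() and spark_connect()), it reuses the example jar path from the text, and the local master is only a placeholder for your own cluster.

```r
library(sparklyr)

# Start from the default sparklyr configuration.
conf <- spark_config()

# Requirement 2: register the sparkeda jar before connecting.
# This is the example path from the text; replace it with the
# location of the jar on your own machine.
conf$'sparklyr.jars.default' <- "/home/gchurch/R/sparkeda_2.11-2.07.jar"

# Requirement 1: create the Spark connection (sc).
# Assumption: a local master; use your cluster's master URL instead.
sc <- spark_connect(master = "local", config = conf)
```

The jar has to be set on the configuration before spark_connect() is called, because the configuration is read when the connection is created.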
Package details | |
---|---|
Maintainer | |
License | Apache-2.0 |
Version | 0.0.0.9000 |
Package repository | View on GitHub |