spark_describe_ext: A more descriptive version of spark describe including...


Description

This function is especially useful for EDA on large Spark/Hive tables. It is designed to resemble the hist() function in native R, but note that this implementation differs from native R and will "bucket" the data points.
All computation is efficient and distributed, running in native Scala/Spark.

It is advised to drop time, array, and other columns with nested data types before running.
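
As a sketch of that advice, unsupported columns can be dropped with dplyr before passing the table in. The connection setup and the column names below (event_time, tags_array) are hypothetical placeholders, not taken from the package:

```r
library(sparklyr)
library(dplyr)

# Hypothetical connection and table; adjust to your cluster.
sc <- spark_connect(master = "local")
spark_table <- tbl(sc, "db.stock_samples_20m")

# Drop time/array/nested columns before describing; the column
# names here are assumptions for illustration only.
clean_table <- spark_table %>%
  select(-event_time, -tags_array)

spark_describe_ext(clean_table, round_at = 2L)
```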

Usage

spark_describe_ext(sparklyr_table, round_at = 2L)

Arguments

sparklyr_table

the Spark table to summarize. You can pass a dplyr Spark table reference (tbl).

round_at

(default = 2L) controls the number of decimal places to round output values to (useful for long outputs).

Details

Important package requirements:
Download the required jar at www.gabechurch.com/sparkEDA (inclusion as a default dependency is in the works).
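
One way to attach the downloaded jar when connecting is through the sparklyr config; the jar path below is a placeholder, and you should verify the `sparklyr.jars.default` option against your sparklyr version:

```r
library(sparklyr)

# Point sparklyr at the downloaded jar; the path is a placeholder.
config <- spark_config()
config$sparklyr.jars.default <- c("/path/to/sparkEDA.jar")

sc <- spark_connect(master = "local", config = config)
```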

Example: select a Spark table and describe it
spark_table = tbl(sc, sql("select * from db.stock_samples_20m limit 100"))
spark_describe_ext(spark_table, round_at = 2L)


GabeChurch/sparkedatools documentation built on June 25, 2019, 12:23 p.m.