Bucketing is not supported yet, so for now you will need to reduce the range of each variable to around 200 values to get effective plots with this method.
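One way to check and reduce a variable's range before plotting is sketched below. It assumes that "range" here means the number of distinct values per variable, and it uses a small local tibble with made-up values as a stand-in for a Spark table. The column names and the bucket width of 50 are illustrative only. The same dplyr verbs are normally translated to Spark SQL when applied to a sparklyr table, but that has not been checked against this package.

```r
library(dplyr)

# Local stand-in for a Spark table (illustrative values only)
df <- tibble(
  hours  = c(1, 2, 3, 250, 999, 40),
  income = c("<=50K", ">50K", "<=50K", ">50K", "<=50K", ">50K")
)

# Count the distinct values in every column to spot variables that are too wide
distinct_counts <- df %>%
  summarise(across(everything(), n_distinct)) %>%
  collect()

# Manually coarsen a wide numeric column into buckets of width 50,
# since the plotting function does not bucket for you
df_bucketed <- df %>%
  mutate(hours = floor(hours / 50) * 50)
```

Columns whose distinct count is well above 200 are candidates for this kind of manual coarsening, or for filtering, before the table is passed to the plotting function.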
spark_plot_overlay_pct(sparklyr_table, response_var,
  max_numeric_ticks = 40)
sparklyr_table
    The sparklyr table to pass to the function.
response_var
    The name, as a string, of the response variable to overlay the histograms with.
max_numeric_ticks
    Defaults to 40. Values above 40 are fine, but you should increase the output width using knitr.
You must have sparklyr, ggplot2, and purrr installed.
You must also have the sparkeda jar installed and referenced the same way as for spark_hist.
You can change the plot output size with knitr chunk options, for example {r fig.height=8, fig.width=20}.
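As a minimal sketch, the same figure sizes can also be set once for every chunk in an R Markdown document instead of per chunk. This assumes the knitr package is installed; the values are the ones quoted above.

```r
library(knitr)

# Apply the figure size from the text to all subsequent chunks
opts_chunk$set(fig.height = 8, fig.width = 20)
```

To size a single plot only, put the options in that chunk's header rather than changing the document-wide defaults.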
Example selection of a spark table and plot generation
# Assumes `sc` is an open sparklyr connection and dplyr is attached for tbl() and sql()
adult_df <- tbl(sc, sql("select * from sample_data.adult_dataset"))
spark_plot_overlay_pct(adult_df, "income")