| db_compute_boxplot | R Documentation |
Uses dplyr operations to compute boxplot statistics. Because of this approach, the calculations automatically run inside the database if 'data' has a database or sparklyr connection. The 'class()' of such a table in R is one of: tbl_sql, tbl_dbi, tbl_spark.
Requires database support for percentile/quantile functions. Confirmed to work with:
DuckDB (recommended for local examples) - uses quantile()
Spark/Hive (via sparklyr) - uses percentile_approx()
SQL Server (2012+) - uses PERCENTILE_CONT()
PostgreSQL (9.4+) - uses percentile_cont()
Oracle (9i+) - uses PERCENTILE_CONT()
Does NOT work with SQLite, MySQL < 8.0, or MariaDB (no percentile support).
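The backend-specific functions listed above are what dplyr code is translated into before it reaches the database. As a rough illustration only, the following sketch uses dbplyr's simulated PostgreSQL connection (so no live database is needed) to render the SQL for a grouped quantile() call. It shows dbplyr's general translation and is not taken from this function's internals; the column names and values in the lazy frame are placeholders.

```r
library(dplyr)
library(dbplyr)

# Simulated PostgreSQL backend: no live database connection is needed.
# The columns am and mpg are placeholders that mimic mtcars.
lf <- lazy_frame(am = 1, mpg = 21, con = simulate_postgres())

# A grouped quantile, similar in spirit to one of the boxplot statistics
query <- lf |>
  group_by(am) |>
  summarise(lower = quantile(mpg, 0.25, na.rm = TRUE))

# Render the SQL that would be sent to the database
sql <- sql_render(query)
print(sql)
```

On this simulated backend the rendered query should contain a PERCENTILE_CONT aggregate. Backends without a percentile translation typically raise an error at this step instead.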
Note that this function supports an input tbl that already contains grouping variables, which can be useful when creating faceted boxplots.
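A minimal sketch of the pre-grouped case, assuming the dbplot and duckdb packages are installed. Grouping by cyl here is an arbitrary choice for illustration; the statistics should then come back per combination of cyl and am, which is the shape a faceted plot needs.

```r
library(DBI)
library(dplyr)
library(dbplot)

# In-memory DuckDB database holding a copy of mtcars
con <- dbConnect(duckdb::duckdb(), ":memory:")
db_mtcars <- copy_to(con, mtcars, "mtcars")

# Group by cyl first; the boxplot statistics for mpg by am are then
# computed within each cyl group (cyl is an illustrative facet variable)
grouped_stats <- db_mtcars |>
  group_by(cyl) |>
  db_compute_boxplot(am, mpg)

dbDisconnect(con)
```

The result can be passed to a plotting function and split by cyl, for example with a facet per cyl value.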
db_compute_boxplot(data, x, var, coef = 1.5)
| data | A table (tbl), which may already contain grouping variables |
| x | A discrete variable by which to group the boxplots |
| var | A continuous variable |
| coef | Length of the whiskers as a multiple of the IQR. Defaults to 1.5 |
An ungrouped data.frame with boxplot statistics including columns for the grouping variable(s), quartiles (lower, middle, upper), whisker bounds (ymin, ymax), and the count of observations per group.
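To make the returned columns and the role of coef concrete, here is a plain-R sketch of comparable per-group statistics computed locally. The helper boxplot_stats_sketch is hypothetical and not part of the package; the count column name n and the clipping of whiskers to the observed range are assumptions, and database percentile functions may interpolate differently from stats::quantile().

```r
# Hypothetical helper, not part of dbplot: boxplot statistics for one group
boxplot_stats_sketch <- function(values, coef = 1.5) {
  q <- stats::quantile(values, probs = c(0.25, 0.5, 0.75), names = FALSE)
  whisker <- coef * (q[3] - q[1])  # whisker length as a multiple of the IQR
  data.frame(
    n      = length(values),       # count of observations (column name assumed)
    lower  = q[1],
    middle = q[2],
    upper  = q[3],
    # Assumption: whiskers are clipped to the observed minimum and maximum
    ymin   = max(min(values), q[1] - whisker),
    ymax   = min(max(values), q[3] + whisker)
  )
}

# One row per level of the grouping variable (am), computed on mpg
by_am <- lapply(split(mtcars$mpg, mtcars$am), boxplot_stats_sketch)
stats <- data.frame(am = names(by_am), do.call(rbind, by_am), row.names = NULL)
```

Raising coef lengthens the whiskers, so fewer observations fall outside ymin and ymax.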
## Not run:
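library(DBI)
library(dplyr)
library(dbplot)

# Create an in-memory DuckDB database and copy mtcars into it
con <- dbConnect(duckdb::duckdb(), ":memory:")
db_mtcars <- copy_to(con, mtcars, "mtcars")

# Compute boxplot statistics for mpg, grouped by am, inside the database
db_mtcars |>
  db_compute_boxplot(am, mpg)

dbDisconnect(con)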
## End(Not run)