knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 6, fig.height = 4, dpi = 96 )
library(coursekata)
gf_squareplot() creates histograms where individual data points are visible as
stacked unit rectangles. Instead of abstract bars, each observation becomes a
countable square, making sample size and distribution shape tangible.
This is particularly useful for teaching statistical concepts like sampling distributions and hypothesis testing, where students benefit from seeing that "n = 47" means 47 actual squares.
Pass a formula and data frame, just like other gf_* functions:
gf_squareplot(~Thumb, data = Fingers)
The bars parameter controls how the histogram is displayed:
"none" (default): Individual squares only"outline": Squares with bar outlines around each bin"solid": Traditional filled barsgf_squareplot(~Thumb, data = Fingers, bars = "outline")
You can customize fill color, binwidth, and axis limits:
gf_squareplot(~Thumb, data = Fingers, fill = "coral", binwidth = 5, xrange = c(30, 90))
For integer-valued data with a small range, gf_squareplot() automatically
selects a binwidth of 1, so each integer gets its own column:
int_data <- data.frame(rolls = sample(1:6, 30, replace = TRUE))
gf_squareplot(~rolls, data = int_data)
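As a rough cross-check, the column heights for integer data can be tabulated in base R. This sketch does not call gf_squareplot(); it assumes only that a binwidth of 1 gives each integer its own column, as described above. The seed and variable names are illustrative.

```r
# Simulate 30 die rolls (seed chosen arbitrarily for reproducibility)
set.seed(123)
rolls <- sample(1:6, 30, replace = TRUE)

# Count observations per face; factor() keeps faces that never came up
column_heights <- table(factor(rolls, levels = 1:6))
print(column_heights)

# Every roll lands in exactly one column, so the heights sum to n
total_squares <- sum(column_heights)
```

Each entry of column_heights should correspond to the number of squares stacked above that integer in the plot.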
When any bin has more than 75 observations, the function automatically switches
to solid bars to keep the display readable. You can opt into subdivision instead
with auto_subdivide = TRUE, which splits wide bins into sub-columns so
rectangles remain countable:
large_data <- data.frame(x = rnorm(500, mean = 50, sd = 10))
gf_squareplot(~x, data = large_data)
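To anticipate whether the 75-observation threshold is likely to be crossed, the bin counts can be estimated in base R before plotting. This is only a sketch: the binwidth of 5 and the break positions are assumptions made for illustration, and gf_squareplot() may choose its bins differently.

```r
# Same shape of data as the example above (seed is arbitrary)
set.seed(1)
x <- rnorm(500, mean = 50, sd = 10)

# Assumed binwidth; the function's own choice may differ
binwidth <- 5
breaks <- seq(
  floor(min(x) / binwidth) * binwidth,
  ceiling(max(x) / binwidth) * binwidth,
  by = binwidth
)

# Count observations falling into each bin
bin_counts <- table(cut(x, breaks = breaks, include.lowest = TRUE))

# TRUE if at least one bin exceeds the threshold described above
exceeds_threshold <- any(bin_counts > 75)
print(max(bin_counts))
print(exceeds_threshold)
```

If exceeds_threshold is TRUE under bins close to the ones actually used, expect solid bars unless auto_subdivide = TRUE is set.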
Show a dashed line at the sample mean:
gf_squareplot(~Thumb, data = Fingers, show_mean = TRUE)
The show_dgp = TRUE option adds a teaching overlay for hypothesis testing
contexts. The example below applies it to a simulated sampling distribution of b1:
set.seed(42)
samp_dist <- do(100) * b1(Thumb ~ Height, data = sample(Fingers, 30))
gf_squareplot(~b1, data = samp_dist, show_dgp = TRUE, show_mean = TRUE, xrange = c(-0.5, 1.5), xbreaks = seq(-0.5, 1.5, by = 0.25))
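The resampling step can also be read as an explicit loop. The following base R sketch shows roughly the same idea, using the built-in mtcars data as a stand-in for Fingers and a subsample of 20 rows drawn without replacement. It illustrates the logic only; it is not how do() or b1() are implemented, and the data set, sample size, and variable names are assumptions.

```r
set.seed(42)

# Fit a simple regression on a random subsample and return the slope
one_slope <- function(data, n) {
  rows <- sample(nrow(data), n)           # n rows, without replacement
  fit <- lm(mpg ~ wt, data = data[rows, ])
  coef(fit)[["wt"]]                       # slope estimate (the "b1")
}

# Repeat 100 times to build a sampling distribution of slopes
slopes <- replicate(100, one_slope(mtcars, 20))

# Store the results in a one-column data frame named b1
samp_dist_sketch <- data.frame(b1 = slopes)
print(summary(samp_dist_sketch$b1))
```

Each row of samp_dist_sketch would be drawn as one square, so 100 replications give 100 countable squares.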
When the input is a factor with numeric levels, all levels are displayed on the x-axis even if some have zero counts:
ratings <- factor(sample(1:5, 20, replace = TRUE, prob = c(1, 2, 4, 2, 1)), levels = 1:5)
df <- data.frame(rating = ratings)
gf_squareplot(~rating, data = df)
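Zero-count levels can be kept because a factor stores its full set of levels independently of the values observed. A small base R sketch with hand-picked values (chosen here so that levels 3 and 5 are deliberately empty) makes this visible without plotting:

```r
# Four observations; levels 3 and 5 never occur
ratings_small <- factor(c(1, 2, 2, 4), levels = 1:5)

# table() reports every declared level, including empty ones
level_counts <- table(ratings_small)
print(level_counts)

# Names of the levels with no observations
empty_levels <- names(level_counts)[as.vector(level_counts) == 0]
print(empty_levels)
```

Here empty_levels contains "3" and "5", the positions that would appear on the x-axis with no squares above them.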