knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.8,
  fig.align = "center"
)
This vignette introduces basic usage of the R package qqboxplot. The figures below reproduce the figures in "The q-q boxplot" (citation coming soon). We start with the figures that use the q-q boxplot; the other figures used for comparison in the paper follow after that.
First load the 'qqboxplot' package and packages from the 'tidyverse'.
library(dplyr)
library(ggplot2)
library(qqboxplot)
The following figure compares simulated t-distributions (and one simulated normal distribution) against a theoretical normal distribution. simulated_data contains two columns, "y" and "group".
"group" specifies the distribution the data ("y") comes from. Note in this figure that reference_dist = "norm" is chosen to specify that the normal distribution should be the reference distribution.
simulated_data %>%
  ggplot(aes(
    factor(group, levels = c("normal, mean=2", "t distribution, df=32",
                             "t distribution, df=16", "t distribution, df=8",
                             "t distribution, df=4")),
    y = y
  )) +
  # reference_dist = "norm" compares each group against a theoretical normal
  geom_qqboxplot(notch = TRUE, varwidth = TRUE, reference_dist = "norm") +
  xlab("reference: normal distribution") +
  ylab(NULL) +
  # "none" replaces the deprecated guides(color = FALSE)
  guides(color = "none") +
  theme(axis.text.x = element_text(angle = 23, size = 15),
        axis.title.y = element_text(size = 15),
        axis.title.x = element_text(size = 15),
        panel.border = element_blank(),
        panel.background = element_rect(fill = "white"),
        panel.grid = element_line(colour = "grey70"))
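To make the idea of a normal reference distribution more concrete, here is a minimal base R sketch of the underlying comparison: sample quantiles set against theoretical normal quantiles. It is an illustration only and is not taken from the internals of geom_qqboxplot. The seed, the variable names, and the choice to standardize by the median and IQR are all assumptions made for this example.

```r
# Illustrative sketch only: not the internal computation of geom_qqboxplot.
set.seed(42)                      # arbitrary seed, chosen for reproducibility
y <- rt(1000, df = 4)             # heavy-tailed sample, like the df=4 group

p <- ppoints(length(y))           # plotting positions in (0, 1)
theoretical <- qnorm(p)           # standard normal quantiles at those positions

# Assumption: remove location and scale with the median and IQR,
# so that only differences in shape remain.
scale_est <- IQR(y) / (qnorm(0.75) - qnorm(0.25))
standardized <- (sort(y) - median(y)) / scale_est

# Difference between the standardized sample quantiles and the normal quantiles.
deviation <- standardized - theoretical
summary(deviation)
```

Deviations that grow in magnitude toward the extremes (positive in the upper tail, negative in the lower tail) point to tails heavier than those of the normal distribution.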
The dataset simulated_data was created by running the following code:
tibble(
  y = c(rnorm(1000, mean = 2), rt(1000, 16), rt(500, 4),
        rt(1000, 8), rt(1000, 32)),
  group = c(rep("normal, mean=2", 1000),
            rep("t distribution, df=16", 1000),
            rep("t distribution, df=4", 500),
            rep("t distribution, df=8", 1000),
            rep("t distribution, df=32", 1000))
)
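Because the simulation code above does not set a seed, rerunning it gives different values each time. The following sketch shows one way to build a reproducible stand-in with the same group structure and to check the group sizes. The seed and the object name sim are hypothetical, and a base R data.frame is used here only to keep the example self-contained.

```r
set.seed(2024)  # hypothetical seed; the vignette does not state one

group_names <- c("normal, mean=2", "t distribution, df=16",
                 "t distribution, df=4", "t distribution, df=8",
                 "t distribution, df=32")
group_sizes <- c(1000, 1000, 500, 1000, 1000)

# Same generating calls and group order as the code above.
sim <- data.frame(
  y = c(rnorm(1000, mean = 2), rt(1000, 16), rt(500, 4),
        rt(1000, 8), rt(1000, 32)),
  group = rep(group_names, times = group_sizes)
)

# Number of observations per group; note that the df=4 group is half the size.
table(sim$group)
```

The unequal group sizes are what varwidth = TRUE reflects in the figures, since it scales box widths by group size.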
The following figure shows the same data as the previous figure, but compared against a simulated normal distribution with mean=5 and variance=1. Note that the reference dataset comparison_dataset is a separate vector and is not contained in the dataset simulated_data.
simulated_data %>%
  ggplot(aes(
    factor(group, levels = c("normal, mean=2", "t distribution, df=32",
                             "t distribution, df=16", "t distribution, df=8",
                             "t distribution, df=4")),
    y = y
  )) +
  # compdata supplies an empirical reference sample instead of a named distribution
  geom_qqboxplot(notch = TRUE, varwidth = TRUE, compdata = comparison_dataset) +
  xlab("reference: simulated normal dataset") +
  ylab(NULL) +
  theme(axis.text.x = element_text(angle = 23, size = 15),
        axis.title.y = element_text(size = 15),
        axis.title.x = element_text(size = 15),
        panel.border = element_blank(),
        panel.background = element_rect(fill = "white"),
        panel.grid = element_line(colour = "grey70"))
The vector comparison_dataset was simulated as follows:
rnorm(1000, 5)
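When the reference is a sample rather than a named distribution, the comparison is between two sets of empirical quantiles. The base R sketch below illustrates that idea for samples of different sizes. It does not reproduce how qqboxplot handles compdata internally: the seed, the helper standardize(), and the choice to center and scale both samples by median and IQR are assumptions made for this example.

```r
set.seed(7)                        # arbitrary seed for reproducibility
reference <- rnorm(1000, 5)        # same call as used for comparison_dataset
y <- rt(500, df = 4)               # sample to compare, like the df=4 group

# Common probabilities let samples of different sizes be compared.
p <- (1:19) / 20

# Hypothetical helper: remove location and scale so only shape differs.
standardize <- function(x) (x - median(x)) / IQR(x)

q_y   <- quantile(standardize(y), p, names = FALSE)
q_ref <- quantile(standardize(reference), p, names = FALSE)

# Difference between sample and reference quantiles at each probability.
deviation <- q_y - q_ref
round(deviation, 2)
```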