pp_plot: Produces the paired probability plot for two groups
In DJAnderson07/esvis: Visualization and Estimation of Effect Sizes


The paired probability (PP) plot maps the probability of obtaining a specific score for each of two groups. The area under the curve (AUC) corresponds to the probability that a randomly selected observation from the x-axis group will have a higher score than a randomly selected observation from the y-axis group. This function extends the basic PP plot by allowing multiple curves and faceting to facilitate a variety of comparisons. Because the plots are built with ggplot2, they can be customized further by adding ggplot2 layers, as illustrated in the examples.
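As a rough illustration of the AUC interpretation, the following base R sketch builds a PP curve by hand from two simulated groups. It assumes the curve pairs the empirical cumulative distribution functions of the two groups, which is the usual PP-plot construction but is not confirmed here as the exact method `pp_plot` uses. The data and the names `ref`, `focal`, and `pp` are hypothetical.

```r
set.seed(42)

# Simulated scores for two hypothetical groups (not package data)
ref   <- rnorm(500, mean = 200, sd = 10)  # x-axis (reference) group
focal <- rnorm(500, mean = 195, sd = 10)  # y-axis (focal) group

# Empirical cumulative distribution function for each group
F_ref   <- ecdf(ref)
F_focal <- ecdf(focal)

# Evaluate both ECDFs at every observed score and pair the probabilities;
# the curve starts at the origin
scores <- sort(unique(c(ref, focal)))
pp <- data.frame(x = c(0, F_ref(scores)), y = c(0, F_focal(scores)))

# Area under the PP curve via the trapezoidal rule
auc <- sum(diff(pp$x) * (head(pp$y, -1) + tail(pp$y, -1)) / 2)

# Direct estimate of P(ref > focal) over all pairs of observations
p_direct <- mean(outer(ref, focal, ">"))

# Optional: draw the hand-built curve with the diagonal reference line
# plot(pp$x, pp$y, type = "l"); abline(0, 1, lty = "dashed")
```

With continuous scores and no ties, `auc` and `p_direct` agree up to floating-point error, which is the sense in which the area under the curve is a probability.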

pp_plot(
  data,
  formula,
  ref_group = NULL,
  cuts = NULL,
  cut_labels = TRUE,
  cut_label_x = 0.02,
  cut_label_size = 3,
  lines = TRUE,
  linetype = "solid",
  linewidth = 1.1,
  shade = TRUE,
  shade_alpha = 0.2,
  refline = TRUE,
  refline_col = "gray40",
  refline_type = "dashed",
  refline_width = 1.1
)

`data`	The data frame to be plotted
`formula`	A formula of the type `out ~ group` where `out` is the outcome variable and `group` is the grouping variable. Note this variable can include any arbitrary number of groups. Additional variables can be included with `+` to produce separate plots by the secondary or tertiary variable of interest (e.g., `out ~ group + characteristic1 + characteristic2`). No more than two additional characteristics can be supplied at this time.
`ref_group`	Optional character vector (of length 1) naming the reference group. Defaults to the group with the highest mean score.
`cuts`	Integer. Optional vector (or single number) of scores used to annotate the plot. If supplied, line segments will extend from the corresponding x and y axes and meet at the PP curve.
`cut_labels`	Logical. Should the reference lines corresponding to `cuts` be labeled? Defaults to `TRUE`.
`cut_label_x`	The x-axis location of the cut labels. Defaults to 0.02.
`cut_label_size`	The size of the cut labels. Defaults to 3.
`lines`	Logical. Should the PP lines be plotted? Defaults to `TRUE`.
`linetype`	The linetype for the PP lines. Defaults to "solid".
`linewidth`	The width of the PP lines. Defaults to 1.1 (just marginally larger than the default ggplot2 lines).
`shade`	Logical. Should the area under the curve be shaded? Defaults to `TRUE`.
`shade_alpha`	Transparency of the shading. Defaults to 0.2.
`refline`	Logical. Should a diagonal reference line be plotted, representing the value at which no difference is observed between the reference and focal distributions? Defaults to `TRUE`.
`refline_col`	Color of the reference line. Defaults to a dark gray.
`refline_type`	The linetype for the reference line. Defaults to "dashed".
`refline_width`	The width of the reference line. Defaults to 1.1, the same width as the PP lines.
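To make the `ref_group` default and the `cuts` annotations concrete, here is a minimal base R sketch. It assumes the PP curve pairs the empirical cumulative distribution functions of the reference and focal groups, so each cut score corresponds to a single point on the curve. This is an illustration only, not the package's internal code, and the data frame `d` and its values are hypothetical.

```r
set.seed(1)

# Hypothetical two-group data (names and values are illustrative only)
d <- data.frame(
  group = rep(c("a", "b"), each = 300),
  out   = c(rnorm(300, mean = 205, sd = 10), rnorm(300, mean = 198, sd = 10))
)

# Documented default for ref_group: the group with the highest mean score
means     <- tapply(d$out, d$group, mean)
ref_group <- names(which.max(means))

# ECDFs for the reference (x-axis) and focal (y-axis) groups
F_ref   <- ecdf(d$out[d$group == ref_group])
F_focal <- ecdf(d$out[d$group != ref_group])

# Each cut score maps to one point on the curve; segments drawn from the
# x and y axes would meet at (x, y)
cuts       <- c(190, 210, 215)
cut_points <- data.frame(cut = cuts, x = F_ref(cuts), y = F_focal(cuts))
cut_points
```

Each row of `cut_points` gives the proportion of each group scoring at or below the cut, which is where the annotation segments for that cut would intersect.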

A ggplot2 object displaying the specified PP plot.

# PP plot examining differences by condition
pp_plot(star, math ~ condition)

# Sample sizes within cells become very small in the plot above (e.g., wild
# changes within the "other" group in particular). Overall, the effect doesn't
# seem to change much by condition.

# Look at something a little more interesting
## Not run: 
pp_plot(benchmarks, math ~ ell + season + frl)

## End(Not run)
# Add some cut scores
pp_plot(benchmarks, math ~ ell, cuts = c(190, 210, 215))

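## Make another interesting plot, customized with additional ggplot2 layers
## Not run: 
library(tidyr)    # provides gather() and re-exports the %>% pipe
library(ggplot2)  # provides the scale_*, labs(), and theme_*() layers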
benchmarks %>% 
  gather(subject, score, reading, math) %>% 
  pp_plot(score ~ ell + subject + season,
          ref_group = "Non-ELL") +
  scale_fill_brewer(name = "ELL Status", palette = "Pastel2") +
  scale_color_brewer(name = "ELL Status", palette = "Pastel2") +
  labs(title = "Differences among English Language Learning Groups",
       subtitle = "Note crossing of reference line") +
  theme_minimal()

## End(Not run)