tplot: tplot

tplot    R Documentation

tplot

Description

An alternative to boxplot with additional annotations, hypothesis testing, and panel expressions. The individual data points can be shown (either in the foreground or background) with point-dodging. Violin plots with optional boxplots (and/or points) may also be shown.

Usage

tplot(x, ...)

## S3 method for class 'formula'
tplot(
  formula,
  data = NULL,
  ...,
  subset,
  na.action = NULL,
  panel.first = NULL,
  panel.last = NULL
)

## Default S3 method:
tplot(
  x,
  g,
  ...,
  type = "db",
  jit = NULL,
  dist = NULL,
  dist.n = Inf,
  args.beeswarm = list(),
  main = NULL,
  sub = NULL,
  xlab = NULL,
  ylab = NULL,
  xlim = NULL,
  ylim = NULL,
  names,
  col = NULL,
  group.col = TRUE,
  bg = NA,
  group.bg = TRUE,
  pch = par("pch"),
  group.pch = TRUE,
  cex = par("cex"),
  group.cex = FALSE,
  boxcol = "grey90",
  bordercol = par("fg"),
  median.line = FALSE,
  mean.line = FALSE,
  median.pars = list(),
  mean.pars = list(),
  boxplot.pars = list(),
  quantiles = NULL,
  show.n = TRUE,
  show.na = show.n,
  cex.n = par("cex"),
  text.na = "missing",
  n.at = NULL,
  test = FALSE,
  args.test = list(),
  format_pval = TRUE,
  ann = par("ann"),
  axes = TRUE,
  frame.plot = axes,
  add = FALSE,
  at = NULL,
  horizontal = FALSE,
  panel.first = NULL,
  panel.last = NULL
)

Arguments

x

a numeric vector or a single list containing such vectors

...

for the formula method, named arguments to be passed to the default method

for the default method, graphical parameters passed to par

formula

a formula, such as y ~ grp, where y is a numeric vector of data values to be split into groups according to the grouping variable grp (usually a factor)

data

a data frame (or list) from which the variables in formula should be taken

subset

an optional vector specifying a subset of observations to be used for plotting

na.action

a function which indicates what should happen when the data contain NAs; the default is to ignore missing values in either the response or the group

panel.first

an expression to be evaluated after the plot axes are set up but before any plotting takes place; this can be useful for drawing background grids or scatterplot smooths; note that this works by lazy evaluation: passing this argument from other plot methods may well not work since it may be evaluated too early; see also plot.default

panel.last

an expression to be evaluated after plotting has taken place but before the axes, title, and box are added; see the comments about panel.first

g

a vector or factor object giving the group for the corresponding elements of x, ignored with a warning if x is a list

type

type of plot ("d" for dot, "db" for dot-box, "bd" for box-dot, or "b" for box; "v" may be used instead of "b" for a violin plot); see examples for all options

jit, dist, dist.n

jitter parameters for overlapping points; use 0 for no jitter (i.e., points may overlap) and values > 0 for more distance between points; jit and dist can each be length 1 (recycled as needed for groups) or equal in length to the number of groups (useful if one group has more points and needs more jittering than other groups)

jit controls the amount of spread in a group of neighboring points, and dist controls the size of the interval to group neighboring points, i.e., a group of sequential points that are no more than dist apart are considered neighbors and will be jittered

dist.n is the maximum number of points allowed in each group of neighboring points, useful for limiting the spread of points
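
The neighbor grouping described for dist can be sketched in base R. This is only an illustration of the stated rule (sequential points no more than dist apart are neighbors); the helper name neighbor_id is hypothetical and is not taken from the tplot source, which may group points differently.

```r
## hypothetical helper (not part of rawr): label runs of neighboring points
neighbor_id <- function(x, dist) {
  o <- order(x)
  ## start a new run whenever the gap to the previous sorted value exceeds dist
  id <- cumsum(c(TRUE, diff(x[o]) > dist))
  ## return the run labels in the original order of x
  id[order(o)]
}

ids <- neighbor_id(c(3, 1, 1.1, 3.05, 10, 1.2), dist = 0.5)
ids
```

Points sharing a label would be jittered together; a larger dist merges more points into one run, which is the situation dist.n is meant to limit.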

args.beeswarm

logical or a named list of arguments passed to beeswarm; if NULL (default) or FALSE, beeswarm is not used; if TRUE, beeswarm is used with pre-set defaults (i.e., method = 'center' and horizontal to match the tplot setting); passing a list of arguments will add or override arguments except that tplot will not adjust data values as beeswarm methods may

main, sub

overall title and sub-title (below x-axis) for the plot

xlab, ylab

x- and y-axis labels

xlim, ylim

x- and y-axis limits

names

group labels

col, bg, pch, cex

point color, background (fill) color (if applicable), plotting character, and character expansion value; note that for pch = 21:25, bg is the fill color while col is the border color

group.col, group.bg, group.pch, group.cex

logical; if TRUE, apply col, bg, pch, and/or cex by group in which case only the first n values are used where n is the number of groups; otherwise, points are treated individually by order (recycled as needed)
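
The difference between the by-group and by-point behavior can be sketched with base R indexing. This is a rough illustration of the description above, assuming groups are matched by factor level order; the variable names are hypothetical and the actual tplot internals may differ.

```r
## illustrative only: how one vector of colors could map to points
g    <- factor(mtcars$gear)          # three groups: 3, 4, 5
cols <- c("red", "blue", "green")

## group.col = TRUE: one color per group, looked up by group membership
by_group <- cols[as.integer(g)]

## group.col = FALSE: colors follow the order of the points, recycled as needed
by_point <- rep_len(cols, length(g))

table(g, by_group)
```

With the by-point form, a vector such as c("darkred", "darkblue")[sex] colors each observation by a second variable rather than by its group.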

boxcol, bordercol

box fill and border colors

median.line, mean.line

logical; draw median, mean lines

median.pars, mean.pars

lists of graphical parameters for median and mean lines

boxplot.pars

additional list of graphical parameters for box plots (or violin plots)

quantiles

for violin plots, probabilities for quantile lines (as an alternative to box plots); note lwd/lty may be passed to control the quantile lines

show.n, show.na

logical; show the total number of observations and the number missing in each group

cex.n

character expansion for show.n and show.na

text.na

label for missing values (default is "missing")

n.at

the y-coordinate (or x-coordinate if horizontal = TRUE) to place the total and missing in each group

test

logical or function; if TRUE, a rank-sum p-value is added to the plot (wilcox.test or kruskal.test based on the number of groups)

alternatively, a function can be used, e.g., test = cuzick.test or function(x, g) cuzick.test(x, g); note that if test is a function, it must accept at least two arguments: the numeric data values and the group
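
A function of the form function(x, g) that mirrors the rank-sum rule above might look like the following. This is a sketch using only the stats package; rank_sum_p is a hypothetical name, and the code is not the selection logic from the tplot source.

```r
## hypothetical helper (not part of rawr): choose the rank-sum test
## from the number of groups, as test = TRUE is described to do
rank_sum_p <- function(x, g) {
  g <- droplevels(as.factor(g))
  if (nlevels(g) == 2L) {
    ## exact = FALSE avoids warnings about ties
    wilcox.test(x ~ g, exact = FALSE)$p.value
  } else {
    kruskal.test(x, g)$p.value
  }
}

p_two  <- rank_sum_p(mtcars$mpg, mtcars$vs)   # two groups
p_many <- rank_sum_p(mtcars$mpg, mtcars$gear) # three groups
c(p_two, p_many)
```

Whether tplot accepts a function that returns a bare p-value rather than a test object is not stated here, so this helper is meant for checking results outside the plot.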

args.test

an optional named list of mtext arguments controlling the test text

format_pval

logical; if TRUE, p-values are formatted with pvalr; if FALSE, no formatting is performed; alternatively, a function can be passed which should take a numeric value and return a character string (or a value to be coerced) for printing
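
A custom formatter only needs to take a numeric value and return a string. The function below is a minimal sketch; fmt_p is a hypothetical name and its cutoff and number of digits are arbitrary choices, not the behavior of pvalr.

```r
## hypothetical formatter: numeric p-value in, character string out
fmt_p <- function(p) {
  if (p < 0.001) "p < 0.001" else sprintf("p = %.3f", p)
}

fmt_p(0.0312)
fmt_p(0.00004)
```

Under the description above, such a function could be supplied as format_pval = fmt_p.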

ann

logical; annotate plot

axes

logical; draw axes

frame.plot

logical; draw box around x-y plot

add

logical; add to an existing plot

at

the x-axis group positions

horizontal

logical; flip axes

Value

A list with the following elements (see boxplot):

$stats

a matrix, each column contains the extreme of the lower whisker, the lower hinge, the median, the upper hinge and the extreme of the upper whisker for one group/plot. If all the inputs have the same class attribute, so will this component

$n

a vector with the number of observations in each group.

$conf

a matrix where each column contains the lower and upper extremes of the notch

$out

the values of any data points which lie beyond the extremes of the whiskers

$group

a vector of the same length as out whose elements indicate to which group the outlier belongs

$names

a vector of names for the groups

Additionally, tplot returns the following:

$test

the object returned by the test function

$coords

a list of data frames, one per group, containing the x- and y-coordinates of the points
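
The boxplot-style elements can be inspected without drawing anything by calling base boxplot with plot = FALSE. The sketch below uses only base R, so it shows the six shared elements but not $test or $coords, which come from tplot itself.

```r
## base R only: the elements that the return value shares with boxplot
bp <- boxplot(mpg ~ gear, data = mtcars, plot = FALSE)

names(bp)     # element names
dim(bp$stats) # five summary values per group
bp$n          # observations per group
bp$names      # group labels
```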

See Also

Tatsuki tplot; web app for Tatsuki tplot; boxplot; jmplot

Examples

x <- mtcars$mpg
g <- interaction(mtcars$gear, mtcars$vs)

## these are equivalent ways to call tplot
tplot(x, g)
tplot(split(x, g))
tplot(x ~ g)
tplot(mpg ~ gear + vs, mtcars)


## tplot returns the point coordinates for later use
co <- tplot(mpg ~ vs, mtcars)
sapply(co$coords, function(x)
  points(x, pch = 16L, col = findInterval(x$y, fivenum(x$y)) + 1L))


## group.{col,bg,pch,cex} can be used to achieve this directly
tplot(mpg ~ vs, mtcars, pch = 21L, col = 'black', group.bg = FALSE,
  bg = ave(mpg, vs, FUN = function(x)
    findInterval(x, fivenum(x)) + 1L))


## options for box, violin, dots
types <- c('d', 'db', 'bd', 'b', 'v', 'vd', 'dv', 'dbv', 'bv', 'n')
l <- lapply(types, function(...) mtcars$mpg)
tplot(l, type = types, names = types, xlab = 'tplot(x, type = ...)')


## horizontal plots may cut off show.n/show.na text
tplot(x, g, horizontal = TRUE)

op <- par(mar = par('mar') + c(0, 0, 0, 2))
tplot(x, g, horizontal = TRUE)

## and/or rotate labels
tplot(x, g, horizontal = TRUE, srt = 45)
par(op)


## add rank-sum or custom test to plot
tplot(mpg ~ vs, mtcars, test = TRUE)   ## two groups - wilcox.test
tplot(mpg ~ gear, mtcars, test = TRUE) ## >2 groups - kruskal.test
tplot(mpg ~ gear, mtcars, test = rawr::cuzick.test) ## trend test

## custom test/text formatting
tplot(mtcars$mpg, 1:2, test = function(x, g)
  wilcox.test(x ~ g, data.frame(x, g), exact = FALSE, paired = TRUE),
  args.test = list(col = 2, at = 1.5, adj = 0.5, line = -3, cex = 2))


## tplot has the same return value as boxplot with additional elements
## for the test/coordinates if applicable
identical(
  within(tplot(mtcars$mpg), {test <- coords <- NULL}),
  within(boxplot(mtcars$mpg), {test <- coords <- NULL})
)


## use panel.first/panel.last like in `plot` (unavailable in `boxplot`)
tplot(
  mpg ~ gear, data = mtcars, col = 1:3, type = 'd', show.na = FALSE,
  cex = c(1, 5)[(mtcars$mpg > 30) + 1L],
  panel.last = legend('topleft', legend = 3:5, col = 1:3, pch = 1),
  panel.first = {
    rect(1.5, par('usr')[3], 2.5, par('usr')[4], col = 'cyan', border = NA)
    abline(h = mean(mtcars$mpg))
    abline(h = 1:6 * 5 + 5, lty = 'dotted', col = 'grey70')
  }
)


## beeswarm options
x <- rnorm(1000)
tplot(
  x, type = 'd',
  args.beeswarm = list(method = 'square', corral = 'gutter', corralWidth = 0.25)
)

## compare (note that tplot **does not** change the original data values)
beeswarm::beeswarm(x, method = 'square', corral = 'gutter', corralWidth = 0.25)


## example with missing data
set.seed(1)
dat <- data.frame(
  age   = replace(rnorm(80, rep(c(26, 36), c(70, 10)), 4), 1:5, NA),
  sex   = factor(sample(c('Female', 'Male'), 80, TRUE)),
  group = paste0('Group ', sample(1:4, 80, TRUE, prob = c(2, 5, 4, 1)))
)

tplot(
  age ~ group, data = dat, las = 1, bty = 'l', names = LETTERS[1:4],
  text.na = 'n/a', ## default is 'missing'
  type = c('db', 'dv', 'dbv', 'd'),
  ## options for violin types
  quantiles = c(0.25, 0.5, 0.75), lwd = c(0.5, 2, 0.5),
  ## one pch per group
  group.pch = TRUE, pch = c(15, 17, 19, 8),
  ## color by variable not group
  group.col = FALSE, col = c('darkred', 'darkblue')[sex],
  boxcol = c('lightsteelblue1', 'lightyellow1', grey(0.9)),
  boxplot.pars = list(notch = TRUE, boxwex = 0.5)
)
legend(
  par('usr')[1], par('usr')[3], xpd = NA, bty = 'n',
  legend = levels(dat$sex), col = c('darkred', 'darkblue'), pch = 19
)


raredd/rawr documentation built on March 4, 2024, 1:36 a.m.