createBoxplot: Create box plot.


View source: R/plotting.R

Description

Creates a box plot visualization from quartiles calculated with computePercentiles. In the simplest case, without x, a single box plot is drawn from a single set of percentiles. To draw multiple box plots, with or without facets, use parameters x and/or facet.

Usage

createBoxplot(data, x = NULL, fill = x, value = "value", useIQR = FALSE,
  facet = NULL, ncol = 1, facetScales = "fixed", paletteValues = NULL,
  palette = "Set1", title = paste("Boxplots", ifelse(is.null(x), NULL,
  paste("by", x))), subtitle = NULL, xlab = x, ylab = NULL,
  legendPosition = "right", fillGuide = "legend", coordFlip = FALSE,
  baseSize = 12, baseFamily = "sans", defaultTheme = theme_tufte(base_size
  = baseSize, base_family = baseFamily), themeExtra = NULL)

Arguments

data

quartiles precomputed with computePercentiles

x

name of the column with the primary grouping variable. Multiple boxplots are placed along the x-axis, one per value of x. Each value of x must have corresponding percentiles calculated.

fill

name of a column with values to colour box plots

value

name of the column with percentile values. Usually the default 'value', except for temporal percentiles, which should use 'epoch'.

useIQR

logical; if TRUE, the IQR is used to compute the lower and upper cutoff bounds: [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR], where IQR = Q3 - Q1. If FALSE, the minimum and maximum are used as bounds (all values).

facet

vector of 1 or 2 column names used to split the data into subsets that are plotted as facets. With a single name, the subset plots are placed next to each other, wrapping into ncol columns (uses facet_wrap). With two names, the subset plots vary in both the horizontal and vertical directions (a grid) based on the column values (uses facet_grid).

ncol

number of facet columns (applies only when a single facet column is supplied; see parameter facet).

facetScales

Whether scales are shared across all subset plots (facets): "fixed" - all are the same (default), "free_x" - x-axis scales vary, "free_y" - y-axis scales vary, "free" - both vary (see parameter scales in facet_wrap).

paletteValues

actual palette colours for use with scale_fill_manual (if specified then parameter palette is ignored)

palette

Brewer palette name - see display.brewer.all in RColorBrewer package for names

title

plot title.

subtitle

plot subtitle.

xlab

a label for the x axis, defaults to a description of x.

ylab

a label for the y axis, defaults to a description of y.

legendPosition

the position of the legend: "left", "right", "bottom", "top", or a two-element numeric vector. "none" removes the legend.

fillGuide

Name of guide object, or object itself for the fill (when present). Typically "legend" name or object guide_legend.

coordFlip

logical; if TRUE, flips cartesian coordinates so that horizontal becomes vertical and vertical becomes horizontal (see coord_flip).

baseSize

theme base font size

baseFamily

theme base font family

defaultTheme

plot theme settings with default value theme_tufte. More themes are available here: ggtheme (by ggplot2) and ggthemes.

themeExtra

any additional theme settings that override default theme.
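The cutoff rule described for useIQR can be illustrated with a few lines of base R. This is only a sketch of the arithmetic: the helper iqrBounds and the quartile values are hypothetical and are not part of toaster.

```r
# Hypothetical helper (not part of toaster): lower and upper cutoff
# bounds computed from the first and third quartiles.
iqrBounds <- function(q1, q3) {
  iqr <- q3 - q1
  c(lower = q1 - 1.5 * iqr, upper = q3 + 1.5 * iqr)
}

# Made-up quartiles for illustration only
bounds <- iqrBounds(q1 = 10, q3 = 30)
print(bounds)
```

With Q1 = 10 and Q3 = 30 the IQR is 20, so the bounds are -20 and 60.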

Details

Multiple box plots: x is the name of a variable where each value corresponds to a set of percentiles. The boxplots are placed along the x-axis. To produce such data, call computePercentiles with parameter by set to the same column name that is passed in x.

Facets: the facet vector contains one or two names of variables where each combination of values corresponds to a set of percentiles. The boxplot(s) are placed inside separate sections of the plot (facets). Facets are supported both without x (a single boxplot per facet) and with x (multiple boxplots per facet).

Usually, when multiple percentile sets vary along a single variable, pass that variable as x and add facets on top. The exception is when the scale of percentile values differs between boxplots: then omit x and use facet with facetScales='free_y'.
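The trade-off between x and facets with free scales can be sketched directly in ggplot2. The example below is not how createBoxplot is implemented and does not use the data layout returned by computePercentiles: the data frame, its column names, and its values are made up, and the boxes are drawn from precomputed statistics with geom_boxplot(stat = "identity").

```r
library(ggplot2)

# Hypothetical five-number summaries, one row per group. Column names and
# values are illustrative only, not the output of computePercentiles.
stats <- data.frame(
  metric = c("ipouts", "era"),
  ymin   = c(0, 0),
  lower  = c(50, 3.1),
  middle = c(150, 4.2),
  upper  = c(400, 5.6),
  ymax   = c(800, 9.0)
)

# Aesthetics that geom_boxplot needs when the statistics are precomputed
boxAes <- aes(ymin = ymin, lower = lower, middle = middle,
              upper = upper, ymax = ymax)

# Layout 1: groups along the x-axis share a single y scale, so the group
# with small values is squeezed flat against the axis
byX <- ggplot(stats, aes(x = metric, fill = metric)) +
  geom_boxplot(boxAes, stat = "identity")

# Layout 2: one facet per group, each with its own y scale
byFacet <- ggplot(stats, aes(x = "")) +
  geom_boxplot(boxAes, stat = "identity") +
  facet_wrap(~ metric, ncol = 1, scales = "free_y")
```

Printing byX and byFacet side by side shows why free y scales help when the groups are measured on very different ranges.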

Value

ggplot object

See Also

computePercentiles for computing boxplot quartiles

Examples

if (interactive()) {
  library(toaster)  # attaches RODBC, which provides odbcDriverConnect

  # initialize connection to the Lahman baseball database in Aster
  # (paste0 keeps the connection string free of embedded line breaks)
  conn <- odbcDriverConnect(connection = paste0(
    "driver={Aster ODBC Driver};server=<dbhost>;port=2406;",
    "database=<dbname>;uid=<user>;pwd=<pw>"))

  # single boxplot of pitching ipouts (no filter applied)
  ipop <- computePercentiles(conn, "pitching", columns = "ipouts")
  createBoxplot(ipop)

  # boxplots of pitching ipouts by league
  ipopLg <- computePercentiles(conn, "pitching", columns = "ipouts", by = "lgid")
  createBoxplot(ipopLg, x = "lgid")

  # boxplots of pitching ipouts by league, faceted by yearid, from 2010 onwards
  ipopLgYear <- computePercentiles(conn, "pitching", columns = "ipouts",
                                   by = c("lgid", "yearid"),
                                   where = "yearid >= 2010")
  createBoxplot(ipopLgYear, x = "lgid", facet = "yearid", ncol = 3)

  # boxplots of era with facets only (league by decade grid, no x)
  bapLgDec <- computePercentiles(conn, "pitching_enh", columns = "era",
                                 by = c("lgid", "decadeid"),
                                 where = "lgid in ('AL','NL')")
  createBoxplot(bapLgDec, facet = c("lgid", "decadeid"))
}


toaster documentation built on May 30, 2017, 3:51 a.m.