createHistogram: Create histogram type of plot.


Description

Create a histogram plot from a pre-computed distribution of data. Parameter data is a data frame containing intervals (bins) and counts obtained using computeHistogram or computeBarchart.
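To make the expected input concrete, here is a minimal sketch of a pre-computed histogram data frame built locally with base R rather than with computeHistogram. The column names bin_start and bin_count follow the defaults of the x and y arguments; the bin_end column and the simulated values are assumptions made purely for illustration.

```r
# Hypothetical sample: 200 simulated values between 0 and 10 (illustration only)
set.seed(42)
values <- runif(200, min = 0, max = 10)

# Pre-compute the distribution with base R instead of in the database
h <- hist(values, breaks = seq(0, 10, by = 1), plot = FALSE)

# One row per bin: interval start, interval end, and count
histData <- data.frame(
  bin_start = head(h$breaks, -1),
  bin_end   = tail(h$breaks, -1),
  bin_count = h$counts
)

print(histData)
```

A data frame of this shape could then presumably be passed as createHistogram(histData), since the default x and y column names match.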

Usage

createHistogram(data, x = "bin_start", y = "bin_count", fill = NULL,
  position = "dodge", facet = NULL, ncol = 1, facetScales = "free_y",
  baseSize = 12, baseFamily = "", xlim = NULL, breaks = NULL,
  text = FALSE, percent = FALSE, digits = 0, textVJust = -2,
  mainColour = "black", fillColour = "grey", scaleGradient = NULL,
  paletteValues = NULL, palette = "Set1", trend = FALSE,
  trendLinetype = "solid", trendLinesize = 1, trendLinecolour = "black",
  title = paste("Histgoram by", fill), subtitle = NULL, xlab = x,
  ylab = y, legendPosition = "right", coordFlip = FALSE,
  defaultTheme = theme_tufte(base_size = baseSize, base_family = baseFamily),
  themeExtra = NULL)

Arguments

data

data frame containing the computed histogram

x

name of a column containing bin labels or interval values

y

name of a column containing bin values or counts (bin size)

fill

name of a column with values to colour bars

position

histogram position parameter to use for overlapping bars: stack, dodge (default), fill, identity

facet

vector of 1 or 2 column names used to split the data and plot the subsets as facets. With a single name, subset plots are placed next to each other, wrapping into ncol columns (uses facet_wrap). With two names, subset plots vary in both the horizontal and vertical directions (grid) based on the column values (uses facet_grid).

ncol

number of facet columns (applies only when a single facet column is supplied - see parameter facet).

facetScales

Controls whether scales are shared across all subset plots (facets): "fixed" - all are the same, "free_x" - vary across rows (x axis), "free_y" - vary across columns (y axis, default), "free" - both rows and columns (see parameter scales in facet_wrap).
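The difference between one and two facet columns can be sketched directly in ggplot2. This is not the toaster implementation, only an illustration of the facet_wrap and facet_grid calls that the facet, ncol and facetScales arguments refer to; the sample data and the choice of which column maps to rows in the grid are assumptions.

```r
library(ggplot2)

# Hypothetical pre-computed counts for two teams across two decades
d <- data.frame(
  bin_start = rep(c(0, 1, 2), times = 4),
  bin_count = c(5, 9, 3, 4, 7, 6, 2, 8, 5, 6, 3, 1),
  teamid    = rep(c("NYA", "TEX"), each = 6),
  decadeid  = rep(rep(c(1990, 2000), each = 3), times = 2)
)

# Bars drawn from pre-computed counts, so stat = "identity"
base <- ggplot(d, aes(x = bin_start, y = bin_count)) +
  geom_bar(stat = "identity", colour = "black", fill = "grey")

# One facet column: panels wrap into ncol columns, y scale free per panel
pWrap <- base + facet_wrap(~ teamid, ncol = 1, scales = "free_y")

# Two facet columns: a grid of panels (row/column assignment is an assumption)
pGrid <- base + facet_grid(teamid ~ decadeid, scales = "free_y")
```

Printing pWrap or pGrid renders the corresponding layout.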

baseSize

theme base font size

baseFamily

theme base font family

xlim

a character vector specifying the data range for the x scale and the default order in which the values are displayed on the x axis.

breaks

a character vector giving the breaks as they should appear on the x axis.

text

if TRUE then display values above bars (default: FALSE) (this feature is in development)

percent

format text as percent

digits

number of digits to use in text

textVJust

vertical justification of text labels (relative to the top of bar).
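As a rough illustration of how the percent and digits arguments might interact when labels are built, the following base R sketch formats bar values as text. The helper formatBarLabel is hypothetical and not part of toaster, and it assumes that percent values are supplied as fractions; the package may format labels differently.

```r
# Hypothetical helper (not part of toaster): format bar labels as text
formatBarLabel <- function(value, percent = FALSE, digits = 0) {
  if (percent) {
    # Assumes value is a fraction, e.g. 0.25 for 25%
    paste0(formatC(100 * value, format = "f", digits = digits), "%")
  } else {
    formatC(value, format = "f", digits = digits)
  }
}

pctLabels   <- formatBarLabel(c(0.256, 0.5), percent = TRUE, digits = 1)
countLabels <- formatBarLabel(c(12.345, 7), digits = 0)
print(pctLabels)
print(countLabels)
```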

mainColour

Perimeter color of histogram bars

fillColour

Fill color of histogram bars (applies only when fill is NULL)

scaleGradient

control the ggplot2 fill gradient scale manually, e.g. use scale_colour_gradient (if specified then parameter palette is ignored)

paletteValues

actual palette colours for use with scale_fill_manual (if specified then parameter palette is ignored)

palette

Brewer palette name - see display.brewer.all in RColorBrewer package for names
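For reference, explicit colours for paletteValues can be derived from a Brewer palette with RColorBrewer. The sketch below is one way to do it; the team ids used as names are sample values chosen for illustration.

```r
library(RColorBrewer)

# Three colours from the "Set1" Brewer palette (3 is the minimum brewer.pal accepts)
setColours <- brewer.pal(3, "Set1")

# Hypothetical mapping of colours to the values of the fill column
teamColours <- c(NYA = setColours[1], TEX = setColours[2])
print(teamColours)
```

A named vector like teamColours is the form scale_fill_manual accepts for its values argument, so it should be suitable for paletteValues.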

trend

logical indicating whether a trend line is shown.

trendLinetype

trend line type.

trendLinesize

size of trend line.

trendLinecolour

color of trend line.

title

plot title.

subtitle

plot subtitle.

xlab

a label for the x axis, defaults to a description of x.

ylab

a label for the y axis, defaults to a description of y.

legendPosition

the position of the legend ("left", "right", "bottom", "top", or a two-element numeric vector); "none" removes the legend.

coordFlip

logical; if TRUE, flips Cartesian coordinates so that horizontal becomes vertical, and vertical becomes horizontal (see coord_flip).

defaultTheme

plot theme settings with default value theme_tufte. More themes are available here: ggtheme (by ggplot2) and ggthemes.

themeExtra

any additional theme settings that override default theme.

Value

ggplot object

See Also

computeHistogram and computeBarchart to compute the data for the histogram

Examples

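if(interactive()){
library(toaster)   # createHistogram, computeBarchart, computeHistogram
library(RODBC)     # odbcDriverConnect
library(ggplot2)   # guides, guide_legend

# initialize connection to Lahman baseball database in Aster
# (paste0 keeps the connection string free of embedded newlines)
conn = odbcDriverConnect(connection=paste0("driver={Aster ODBC Driver};",
                         "server=<dbhost>;port=2406;database=<dbname>;uid=<user>;pwd=<pw>"))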

# AL teams pitching stats by decade
bc = computeBarchart(channel=conn, tableName="pitching_enh", category="teamid", 
                     aggregates=c("AVG(era) era", "AVG(whip) whip", "AVG(ktobb) ktobb"),
                     where="yearid >= 1990 and lgid='AL'", by="decadeid", withMelt=TRUE)

createHistogram(bc, "teamid", "value", fill="teamid", 
                facet=c("variable", "decadeid"), 
                legendPosition="bottom",
                title = "AL Teams Pitching Stats by decades (1990-2012)",
                themeExtra = guides(fill=guide_legend(nrow=2)))

# AL Teams Average Win-Loss Difference by Decade 
franchwl = computeBarchart(conn, "teams_enh", "franchid",
                           aggregates=c("AVG(w) w", "AVG(l) l", "AVG(w-l) wl"),
                           by="decadeid",
                           where="yearid >=1960 and lgid = 'AL'")

createHistogram(franchwl, "decadeid", "wl", fill="franchid",
                facet="franchid", ncol=5, facetScales="fixed",
                legendPosition="none",
                trend=TRUE,
                title="Average W-L difference by decade per team (AL)",
                ylab="Average W-L")  
                
# Histogram of team ERA distribution: Rangers vs. Yankees in 2000s
h2000s = computeHistogram(channel=conn, tableName='pitching_enh', columnName='era',
                          binsize=0.2, startvalue=0, endvalue=10, by='teamid',
                          where="yearID between 2000 and 2012 and teamid in ('NYA','TEX')")
createHistogram(h2000s, fill='teamid', facet='teamid', 
                title='TEX vs. NYY 2000-2012', xlab='ERA', ylab='count',
                legendPosition='none')                
                
}

teradata-aster-field/toaster documentation built on May 31, 2019, 8:36 a.m.