Convert an R data frame (containing a panel dataset, where rows are observations and columns are time periods) into an Edward Tufte-inspired Slopegraph using ggplot2
ggslopegraph(data, main = NULL, xlab = "", ylab = "",
xlabels = names(data), xlim = c(-1L, ncol(data) + 2L),
ylim = range(data, na.rm = TRUE), labpos.left = 0.8,
labpos.right = ncol(data) + 0.2, leftlabels = NULL, rightlabels = NULL,
xbreaks = seq_along(xlabels), ybreaks = NULL, yrev = ylim[1] > ylim[2],
decimals = 0L, col.lines = "black", col.lab = "black",
col.num = "black", lwd = 0.5, offset.x = NULL, cex.lab = 3L,
cex.num = 3L, na.span = FALSE)
data |
An observation-by-period data.frame, with at least two columns. Missing values are allowed. |
main |
A character string specifying a title. Passed to |
xlab |
A character string specifying an x-axis label. Passed to |
ylab |
A character string specifying a y-axis label. Passed to |
xlabels |
The labels to use for the slopegraph periods. Default is |
xlim |
A two-element numeric vector specifying the x-axis limits. |
ylim |
A two-element numeric vector specifying the y-axis limits. |
labpos.left |
A numeric value specifying the x-axis position of the left-side observation labels. If |
labpos.right |
A numeric value specifying the x-axis position of the right-side observation labels. If |
leftlabels |
The parameter for the left-side observation labels. Default is using row indexes. |
rightlabels |
The parameter for the right-side observation labels. Default is using row indexes. |
xbreaks |
Passed to |
ybreaks |
Passed to |
yrev |
A logical indicating whether to use |
decimals |
The number of decimals to display for values in the plot. Default is |
col.lines |
A vector of colors for the slopegraph lines. Default is |
col.lab |
A vector of colors for the observation labels. Default is |
col.num |
A vector of colors for the number values. Default is |
lwd |
A vector of line width values for the slopegraph lines. |
offset.x |
A small offset for |
cex.lab |
A numeric value indicating the size of row labels. Default is |
cex.num |
A numeric value indicating the size of numeric labels. Default is |
na.span |
A logical indicating whether line segments should span periods with missing values. The default is |
A slopegraph is an interesting visualization because it represents a simple observation-by-period matrix of data values as a plot, but producing that visualization entails a number of data transformations that are not immediately obvious from the visual simplicity of the graph itself.
Specifically, a slopegraph involves three distinct visual components, each of which must be drawn using a slightly different data structure. Those elements are: (1) the observation labels, (2) the numeric value labels of each observation-period data point, and (3) the line segments connecting the numeric labels. Drawing these three elements requires transforming the input into three different data structures.
First, drawing the observation labels requires constructing a new data frame containing the observation labels (from the input data frame's rownames attribute), the constant x-left and x-right label positions, and the vertical positions of the left- and right-side labels.
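The label data frame described above can be sketched in base R as follows. This is only an illustration: the toy input dat and the column names (label, x_left, x_right, y_left, y_right) are assumptions made for the example, not the structure the package builds internally.

```r
# Hypothetical toy input: rows are observations, columns are periods
dat <- data.frame(y2000 = c(10, 20, 15),
                  y2010 = c(12, 18, 25),
                  row.names = c("A", "B", "C"))

# One row per observation: the label text, the constant left/right
# x positions, and the y positions taken from the first and last period
labels <- data.frame(
  label   = rownames(dat),
  x_left  = 0.8,               # same value as the labpos.left default
  x_right = ncol(dat) + 0.2,   # same expression as the labpos.right default
  y_left  = dat[[1]],
  y_right = dat[[ncol(dat)]]
)

print(labels)
```

Each row then carries everything needed to place one left-side and one right-side text label.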
Second, drawing the numeric value labels requires creating a “tidy” data frame based upon the positions of the values in the input data frame. Specifically, a tidy representation of the data is a two-column data frame containing: (1) the column index of each data point (identified by col) to specify horizontal position, and (2) the value of the data point itself, which is also its vertical position. This consists of a basic wide-to-long reshape procedure (using reshape).
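A wide-to-long reshape of this kind might look like the sketch below, which uses stats::reshape on a hypothetical toy input. The output column names ("col", "value", "row") are chosen for the example and are not necessarily the ones used inside ggslopegraph.

```r
# Hypothetical toy input: rows are observations, columns are periods
dat <- data.frame(y2000 = c(10, 20, 15),
                  y2010 = c(12, 18, 25),
                  row.names = c("A", "B", "C"))

# Stack the period columns: "col" holds the column index (x position),
# "value" holds the data point (y position and label text),
# "row" records which observation the point came from
long <- reshape(dat,
                direction = "long",
                varying   = names(dat),
                v.names   = "value",
                timevar   = "col",
                times     = seq_along(dat),
                idvar     = "row",
                ids       = seq_len(nrow(dat)))

print(long)
```

The result has one row per observation-period cell, which is the shape a text layer needs to print each number at its (col, value) position.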
Third, drawing the line segments requires creating a “tidy” data frame that consists of one row for each segment, by identifying row-adjacent values and identifying variables for the x1, x2, y1, and y2 end-points of each segment. Another “row” identifying variable is needed to relationally map this data frame back to the original observations (e.g., to color the segments). This step is performed by segmentize.
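To make the segment structure concrete, here is a hand-rolled base R sketch. It does not call segmentize, whose exact signature is not shown here. The toy input and the choice to drop any segment with a missing end-point are assumptions of this example only.

```r
# Hypothetical toy input with three periods and one missing value
dat <- data.frame(p1 = c(10, 20, 15),
                  p2 = c(12, NA, 25),
                  p3 = c(14, 22, 30),
                  row.names = c("A", "B", "C"))

m     <- as.matrix(dat)
n_seg <- ncol(m) - 1L   # number of adjacent period pairs

# One row per (observation, adjacent period pair)
segs <- data.frame(
  row = rep(seq_len(nrow(m)), times = n_seg),      # maps back to the observation
  x1  = rep(seq_len(n_seg), each = nrow(m)),       # left period index
  x2  = rep(seq_len(n_seg) + 1L, each = nrow(m)),  # right period index
  y1  = as.vector(m[, -ncol(m), drop = FALSE]),    # value at the left period
  y2  = as.vector(m[, -1L, drop = FALSE])          # value at the right period
)

# In this sketch, segments touching a missing value are simply dropped
segs <- segs[!is.na(segs$y1) & !is.na(segs$y2), ]

print(segs)
```

The row column is what allows a per-observation vector such as col.lines to be matched to individual segments.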
A ggplot object.
For a base graphics version, use slopegraph.
require("ggplot2")
## Tufte's Cancer Graph (to the correct scale)
data(cancer)
ggslopegraph(cancer, col.lines = 'gray',
xlabels = c('5 Year','10 Year','15 Year','20 Year'))
## Tufte's GDP Graph
data(gdp)
ggslopegraph(gdp, col.lines = 'gray', xlabels = c('1970','1979'),
main = 'Current Receipts of Government\nas a Percentage of Gross Domestic Product') +
theme_bw()
## Ranking of U.S. State populations
data(states)
ggslopegraph(states,
main = 'Relative Rank of U.S. State Populations, 1790-1870',
yrev = TRUE)
cls <- rep("black", nrow(states))
cls[rownames(states) == "South Carolina"] <- "red"
cls[rownames(states) == "Tennessee"] <- "blue"
ggslopegraph(states, main = 'Relative Rank of U.S. State Populations, 1790-1870',
yrev = TRUE, col.lines = cls, col.lab = cls)
## Ranking of U.S. Bachelor's Degree fields
data(bachelors)
bachelors[] <- lapply(bachelors, function(x) rank(x))
names(bachelors) <- substring(names(bachelors), 3, 7)
ggslopegraph(bachelors, offset.x = 0, xlim = c(1, 25), col.num = NA, labpos.left = NULL)