library(gggeom)
knitr::opts_chunk$set(comment = "#>", collapse = TRUE)
options(digits = 3)

Motivation

The goal of gggeom is to provide a compact way to represent geometric objects and useful tools to work with them. This package is (or soon will be) used to power ggvis: if you want to create a new layer function, you'll need to be somewhat familiar with this package. But gggeom is low-level and is not tightly tied to ggvis, so you could use it to implement a new graphics system if you wanted.

The objects in gggeom are somewhat similar to the geoms of ggplot2, but gggeom has a much purer take on geometric primitives. For example, ggplot2 has geom_histogram(), which is really a combination of a statistical transformation (stat_bin()) and a bar (geom_bar()). gggeom avoids this muddle, sticking purely to geometric objects. It also provides many more tools for manipulating geometries, independent of the particular plot they will eventually generate. The data structures in gggeom are all built on top of data frames, which means that you can use many existing tools.

All gggeom manipulations should be able to process ~100,000 geometries in less than 0.1 seconds. More geometries than that are unlikely to produce a useful plot, and geometry operations should be very cheap because you may need to string several together to solve a problem. Very large datasets should be summarised first (using e.g. the tools in ggstat).

Geometric primitives

There are only three fundamental geometric primitives needed to draw any graphic: points, paths and polygons.

However, because these primitives are so general, it is hard to define useful operations on them. So gggeom provides a number of additional geometric objects that restrict the properties of points, paths and polygons in useful ways, such as the rects, lines and steps that appear later in this document.

Geometries are described in terms of their position. When rendered, a geometric object will need other properties (like stroke, fill, stroke width, ...) but gggeom concerns itself only with computations that involve position.

A geometry is represented as a data frame, where each row corresponds to a single object. You turn a data frame into a geometry using the appropriate render function:

scatter <- iris %>% render_point(~Sepal.Length, ~Sepal.Width) %>% head()
scatter

The default behaviour of the render function preserves all existing columns so that they can later be mapped to other properties of the geometry. However, this is mostly incidental to gggeom - it only works with the position columns (which all end in _ to avoid clashes with other variables).
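As a rough illustration of the idea of position columns sitting alongside the original data, the base R sketch below adds x_ and y_ columns to a data frame and keeps everything else. The helper as_points() is hypothetical and is not part of gggeom; the real render_point() takes formulas and also sets the geometry classes.

```r
# Hypothetical stand-in for a render function: adds position columns
# (named with a trailing underscore) and keeps all existing columns
as_points <- function(data, x, y) {
  data$x_ <- data[[x]]
  data$y_ <- data[[y]]
  data
}

pts <- as_points(head(iris), "Sepal.Length", "Sepal.Width")
names(pts)
```

Because the original columns survive, a variable such as Species is still available to map to a non-positional property later on.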

All geometries inherit from "geom" and "data.frame". Additional subclassing is based not on the appearance of the geom but on the underlying position data. This means that:

All geometries have a plot() method, implemented with base graphics. This is useful for examples, explanation and debugging, but not for serious data visualisation. Additional arguments (...) are passed on to the underlying base graphics method, so if you're familiar with the graphical parameters, you can tweak the appearance.

plot(scatter)

There are a few render functions that generate existing geometries but with a different parameterisation:

Paths (and polygons, lines and steps)

If each row represents a single object, how are paths, polygons, lines and steps represented? We take advantage of a relatively esoteric R feature - data frame columns can be lists. For example, take a look at the built-in nz data set:

head(nz)
plot(nz)

The x_ and y_ variables are lists of numeric vectors:

nz$x_[[5]]
nz$y_[[5]]
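List columns can be built in plain base R, without gggeom. The sketch below uses made-up coordinates and a hypothetical data frame called paths, with column names chosen to mirror x_ and y_. It is not how nz itself was constructed.

```r
# Hypothetical data: two paths with different numbers of vertices
paths <- data.frame(id = c("a", "b"))
paths$x_ <- list(c(0, 1, 1), c(2, 3, 3, 2))
paths$y_ <- list(c(0, 0, 1), c(0, 0, 1, 1))

nrow(paths)        # one row per path
lengths(paths$x_)  # number of vertices stored in each row
paths$x_[[2]]      # x coordinates of the second path
```

Each row still describes a single object, even though the objects have different numbers of coordinates.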

As well as a plot() method, paths also have a points() method which makes it easier to see exactly where the data lie:

class(nz)
nz %>% subset(island == "Stewart") %>% plot() %>% points()

(Note that plot() invisibly returns the input data to make this sort of chaining easy.)
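The same pattern is easy to reproduce in your own functions. The sketch below uses a hypothetical helper, describe(), that performs a side effect and then hands its input back with invisible(), so the result can be passed to the next step without being printed.

```r
# Hypothetical helper: performs a side effect, then returns its input invisibly
describe <- function(data) {
  cat("rows:", nrow(data), "\n")
  invisible(data)  # returned to the caller, but not auto-printed
}

result <- describe(head(cars, 3))
withVisible(describe(head(cars, 3)))$visible
```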

Converting to primitives

You can convert any geometry to its equivalent primitive by using geometry_pointificate(). For example, imagine we have some rects:

df <- data.frame(x = c(1:3, 3), y = c(1:3, 2))
rects <- render_tile(df, ~x, ~y, width = 0.9, height = 0.9)

rects
plot(rects)

We can convert these to four-point polygons with geometry_pointificate():

rects %>% geometry_pointificate()
rects %>% geometry_pointificate() %>% plot() %>% points()

The rendering looks similar at first glance, but by using the points() command we can see that each rectangle is composed of four points. The main advantage to converting to polygons is that there are a number of transformations that make sense for polygons, but not for rects:

polys <- rects %>% geometry_pointificate(close = TRUE)
# Rotate each polygon 5 degrees clockwise
polys %>% geometry_rotate(5) %>% plot()
# Transform into polar coordinates
polys %>% geometry_warp("polar", tolerance = 0.0001) %>% plot()

These transformations cannot be performed on rects because the result is not a rect; the set of rects is not closed under many useful transformations.
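The closure argument can be checked numerically in base R. This sketch rotates the four corners of a rect with a standard rotation matrix and tests whether the result is still axis-aligned. The corner values, the helper names, and the counter-clockwise convention are all assumptions for illustration and are independent of how geometry_rotate() is implemented.

```r
# Corners of an axis-aligned rect (hypothetical example values)
corners <- cbind(x = c(0, 2, 2, 0), y = c(0, 0, 1, 1))

# Standard 2D rotation matrix, counter-clockwise, angle given in degrees
rotation <- function(degrees) {
  theta <- degrees * pi / 180
  matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)
}

# Rotate each corner (rows are points, so multiply by the transpose)
rotated <- corners %*% t(rotation(30))

# Four corners of an axis-aligned rect share two x values and two y values
is_axis_aligned <- function(m) {
  length(unique(round(m[, 1], 8))) == 2 &&
    length(unique(round(m[, 2], 8))) == 2
}

is_axis_aligned(corners)
is_axis_aligned(rotated)
```

The rotated shape can no longer be described by two x values and two y values, which is why it has to be stored as a general polygon.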

There are also operations that make sense for rects, but not for general polygons. For example, it makes sense to stack rects from the x-axis on up. There's no useful way to stack arbitrary polygons.

rects %>% geometry_stack() %>% plot()
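One way to picture stacking is as a running total of heights within each x position. The base R sketch below uses a hypothetical data frame bars with made-up column names. It illustrates the idea only, and geometry_stack() may work differently.

```r
# Hypothetical rects described by an x position and a height
bars <- data.frame(x = c(1, 2, 3, 3), height = c(1, 2, 3, 2))

# Within each x position, the top of a bar is the running total of heights
bars$y2 <- ave(bars$height, bars$x, FUN = cumsum)
# The bottom of a bar is the top of the bar beneath it (or 0 for the first)
bars$y1 <- bars$y2 - bars$height
bars
```

The two bars at x = 3 end up one on top of the other instead of overlapping.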

Geometric transformations

You've seen a few geometric transformations above. The following table lists all transformations implemented in gggeom in the rows, and the geometries to which they apply in the columns:

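# Find all objects named geometry_<transformation>.<class> in the gggeom namespace
manip <- ls(asNamespace("gggeom"))
manip <- manip[grepl("^geometry_.*?\\.", manip)]
# Split each name into a transformation (column 1) and a geometry class (column 2)
all <- t(simplify2array(strsplit(manip, "\\.")))
all[, 1] <- gsub("geometry_", "", all[, 1])
all[, 2] <- gsub("geom_", "", all[, 2])

# Cross-tabulate transformations against geometries
tbl <- table(all[, 1], all[, 2])
# A method defined for the base "geom" class applies to every geometry
tbl[tbl[, "geom"] == 1, ] <- 1
# Drop the second column of the table
tbl <- tbl[, -2]

# Show "*" where a method exists and leave the cell blank otherwise
tbl[tbl == 0] <- ""
tbl[tbl == 1] <- "*"
noquote(tbl)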

The following sections categorise the transformations into families, and show a few examples. See the documentation for more details of how they operate.

Linear transformations

# Special cases
polys %>% geometry_reflect() %>% plot()
polys %>% geometry_rotate(15) %>% plot()
polys %>% geometry_flip() %>% plot()

# Arbitrary
shear <- matrix(c(1, 0, 0.75, 1), nrow = 2)
polys %>% geometry_transform(shear) %>% plot()
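To see what the shear matrix does to individual coordinates, it can be applied by hand to the corners of a unit square. This sketch assumes the matrix acts on column vectors (x, y). Whether geometry_transform() uses the same convention is an assumption, and the square is made-up example data.

```r
shear <- matrix(c(1, 0, 0.75, 1), nrow = 2)
shear  # filled column by column: first row is 1, 0.75; second row is 0, 1

# Unit square with one corner per column (x in row 1, y in row 2)
square <- rbind(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1))

# Each corner becomes (x + 0.75 * y, y)
sheared <- shear %*% square
sheared
```

Under this convention the y coordinates are unchanged, while the x coordinates shift in proportion to y.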

Add random jitter

scatter_ex %>% plot()
scatter_ex %>% plot() %>% geometry_jitter() %>% points(col = "red")

All geometries are jittered as a whole, i.e. all x coordinates of a geometry are displaced by the same random amount, as are all of its y coordinates. This ensures that (e.g.) rects remain rects:

rects %>% geometry_jitter() %>% plot()
polys %>% geometry_jitter() %>% plot()
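The idea of jittering a geometry as a whole can be sketched in base R by drawing one offset per rect and per axis, then applying it to both edges. The data frame r, its column names, and the jitter range are hypothetical, and this is not gggeom's implementation.

```r
set.seed(1)

# Hypothetical rects stored as edge coordinates, one row per rect
r <- data.frame(x1 = c(1, 2), x2 = c(1.9, 2.9), y1 = c(0, 0), y2 = c(1, 2))

# One random offset per rect and per axis, shared by both edges
dx <- runif(nrow(r), -0.1, 0.1)
dy <- runif(nrow(r), -0.1, 0.1)
jittered <- transform(r, x1 = x1 + dx, x2 = x2 + dx, y1 = y1 + dy, y2 = y2 + dy)

# Widths and heights are unchanged, so each rect keeps its shape
all.equal(jittered$x2 - jittered$x1, r$x2 - r$x1)
all.equal(jittered$y2 - jittered$y1, r$y2 - r$y1)
```

Jittering each coordinate independently would instead change the width and height of every rect.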

Stacking, dodging and (re)scaling

rects %>% geometry_stack() %>% plot()


rstudio/gggeom documentation built on May 28, 2019, 4:35 a.m.