In This Vignette

Finding and Formatting

This vignette explains how to find parts of a pivot table - either one or more data groups (i.e. row/column headings) or one or more cells in the body of the pivot table.

This is often useful to retrieve either a specific value/values, or to change the appearance of specific headings/cells - similar to the conditional formatting capabilities of many off-the-shelf tools.

Many of the examples in this vignette use the setStyling() method to format data groups or cells. This method is described in the Styling vignette.

A couple of the examples in this vignette use the mapStyling() method to format cells based on their value. This method is also described in the Styling vignette.

This is a long vignette. Some readers may prefer to jump to the summary at the end for a more concise description of the functions and frequently used parameters.

Example Pivot Table

The following pivot table is used as the basis of the examples in the rest of this vignette:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$renderPivot()

Finding Headings

The findRowDataGroups() and findColumnDataGroups() functions are used to find data groups (i.e. row and/or column headings) that match specific criteria. The functions return a list of data group objects.

These functions can operate in two different ways, specified by the matchMode argument.

matchMode="simple" is used when matching only one variable, e.g. TrainCategory="Express Passenger".

matchMode="combinations" is used when matching for combinations of variables, e.g. TrainCategory="Express Passenger" and PowerType="DMU", which would return the "DMU" data group underneath "Express Passenger" (but not the "DMU" data group underneath "Ordinary Passenger"). Examples of each follow below.

These functions also accept the following arguments:

Several examples follow below. In each of the examples the data groups that have been found are highlighted in yellow by specifying a different style.

The examples in this section use column data groups, but all are equally applicable to row data groups.

Finding headings when matchMode="simple"

The following examples illustrate how the "simple" matching mode works. "simple" match mode is the default.

From variable name(s) (variableNames)

Find all of the data groups for the "TrainCategory" variable:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(variableNames="TrainCategory")
pt$setStyling(groups=groups, declarations=list("background-color"="#FFFF00"))
pt$renderPivot()

From variable value(s) (variableValues)

Find all of the data groups for the "PowerType" variable with the values "DMU" and "HST":

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(variableValues=list("PowerType"=c("DMU", "HST")))
pt$setStyling(groups=groups, declarations=list("background-color"="#FFFF00"))
pt$renderPivot()

totals

Exclude totals:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(variableNames="TrainCategory", totals="exclude")
pt$setStyling(groups=groups, declarations=list("background-color"="#FFFF00"))
pt$renderPivot()

Find only totals:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(variableNames="TrainCategory", totals="only")
pt$setStyling(groups=groups, declarations=list("background-color"="#FFFF00"))
pt$renderPivot()

includeDescendantGroups

Find all of the data groups for the "TrainCategory" variable with the value "Ordinary Passenger", including the descendant data groups:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(
  variableValues=list("TrainCategory"="Ordinary Passenger"), 
  includeDescendantGroup=TRUE)
pt$setStyling(groups=groups, declarations=list("background-color"="#FFFF00"))
pt$renderPivot()

Selecting a grand total data group

To select the right-most/bottom total data group:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(
  variableValues=list("TrainCategory"="**"), 
  includeDescendantGroup=TRUE)
pt$setStyling(groups=groups, declarations=list("background-color"="#FFFF00"))
pt$renderPivot()

Finding headings when matchMode="combinations"

The following examples illustrate how the "combinations" matching mode works.

The key concept to understand here is that the filtering criteria (i.e. the variableName(s) and variableValues) set for a data group also apply to all descendant data groups. For example, in the example pivot table above, the "DMU" under "Express Passenger" effectively means WHERE ("TrainCategory"="Express Passenger") AND ("PowerType"="DMU").

From variable name(s) (variableNames)

Find all of the data groups that have filter criteria specified for both the "TrainCategory" and "Power Type" variables:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(matchMode="combinations",
                                  variableNames=c("TrainCategory", "PowerType"))
pt$setStyling(groups=groups, declarations=list("background-color"="#00FFFF"))
pt$renderPivot()

In the example above, the first row of headings relates only to the "TrainCategory" variable. The second row of headings relates both to the "PowerType" variable and the "TrainCategory" variable.

From variable value(s) (variableValues)

Find all of the data groups for the "PowerType" variable with the values "DMU" and "HST" for the "TrainCategory" of "Express Passenger":

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(matchMode="combinations",
  variableValues=list("TrainCategory"="Express Passenger", "PowerType"=c("DMU", "HST")))
pt$setStyling(groups=groups, declarations=list("background-color"="#00FFFF"))
pt$renderPivot()

In the above example, the highlighted "DMU" and "HST" data groups are subject to the "Express Passenger" filtering since they are underneath that data group.

The "combinations" match mode effectively AND's the criteria together, i.e. the data groups must match both "TrainCategory"="Express Passenger" AND "PowerType"=("DMU" OR "HST").

The "simple" match mode, by contrast, effectively OR's the criteria together, i.e. the data groups must match either "TrainCategory"="Express Passenger" OR "PowerType"=("DMU" OR "HST"). Changing the match mode back to simple (but otherwise leaving the previous example unchanged):

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(
  variableValues=list("TrainCategory"="Express Passenger", "PowerType"=c("DMU", "HST")))
pt$setStyling(groups=groups, declarations=list("background-color"="#00FFFF"))
pt$renderPivot()

Another example - finding all of the "PowerType" groups under "Express Passenger":

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(matchMode="combinations", variableNames="PowerType",
  variableValues=list("TrainCategory"="Express Passenger"))
pt$setStyling(groups=groups, declarations=list("background-color"="#00FFFF"))
pt$renderPivot()

Selecting a specific sub-total

To select the sub-total data group under "Express Passenger":

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
groups <- pt$findColumnDataGroups(matchMode="combinations", 
  variableValues=list("TrainCategory"="Express Passenger", "PowerType"="**"))
pt$setStyling(groups=groups, declarations=list("background-color"="#00FFFF"))
pt$renderPivot()

Restricting the search

The following example illustrates the use of the rowNumbers, columnNumbers and cells arguments to restrict the data group search:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$addRowDataGroups("Status")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()

# find the data groups that are related to column 2
grps <- pt$findColumnDataGroups(columnNumbers=2)
pt$setStyling(groups=grps, declarations=list("background-color"="yellow"))

# find the DMU data groups that are related to columns 5 to 8 
grps <- pt$findColumnDataGroups(columnNumbers=5:8, variableValues=list("PowerType"="DMU"))
pt$setStyling(groups=grps, declarations=list("background-color"="orange"))

# find a totals data group that is related to the cell at (3, 7) 
cells <- pt$getCell(3, 7)
pt$setStyling(cells=cells, declarations=list("background-color"="pink"))
grps <- pt$findColumnDataGroups(variableValues=list("PowerType"="**"), cells=cells)
pt$setStyling(groups=grps, declarations=list("background-color"="pink"))

# find the data groups at the second level in the hierarchy that are related to row 3
grps <- pt$findRowDataGroups(atLevel=2, rowNumbers=3)
pt$setStyling(groups=grps, declarations=list("background-color"="lightcyan"))

# find the data groups related to rows 5 to 10 with Status "A" or "R"
grps <- pt$findRowDataGroups(rowNumbers=5:10, variableValues=list("Status"=c("A", "R")))
pt$setStyling(groups=grps, declarations=list("background-color"="plum"))

# find the data groups related to row 11:13 or cells in the range (12, 4) to (15, 5)
cells <- pt$getCells(rowNumbers=12:15, columnNumbers=4:5, matchMode="combinations")
pt$setStyling(cells=cells, declarations=list("background-color"="palegreen"))
grps <- pt$findRowDataGroups(rowNumbers=11:13, cells=cells)
pt$setStyling(groups=grps, declarations=list("background-color"="palegreen"))

pt$renderPivot()

Getting Cells

The getCells() function is used to retrieve one or more cells in the body of the pivot table using row/column numbers or row/column data groups. This function has the following parameters:

The row/column number arguments can be specified in two different ways depending on the value of the specifyCellsAsList argument - the default value is TRUE.

The getCells() function returns a list of cell objects.

Getting cells when specifyCellsAsList=TRUE

To get cells when specifyCellsAsList=TRUE (the default) and matchMode="simple" (the default):

Retrieving the 2nd row, 4th column, 5th row-7th column cell and 4th row-8th column cell:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$getCells(specifyCellsAsList=TRUE, 
                     rowNumbers=2, columnNumbers=4, 
                     cellCoordinates=list(c(5, 7), c(4, 8)))
pt$setStyling(cells=cells, declarations=list("background-color"="#FFCC66"))
pt$renderPivot()

Getting cells when specifyCellsAsList=FALSE

To get cells when specifyCellsAsList=FALSE (not the default) and matchMode="simple" (the default):

Retrieving the 2nd row, 4th column, 5th row-7th column cell and 4th row-8th column cell:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$getCells(specifyCellsAsList=FALSE, rowNumbers=c(2, NA, 5, 4), columnNumbers=c(NA, 4, 7, 8))
pt$setStyling(cells=cells, declarations=list("background-color"="#FFCC66"))
pt$renderPivot()

Getting cells when matchMode="combinations"

matchMode="combinations" means that each cell must match the row criteria AND the column criteria. (Contrast this to matchMode="simple" where each cell must match either the row criteria OR the column criteria).

The example below illustrates the difference the matchMode makes:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
# "simple" match mode on rowNumbers=2 and columnNumbers=2:3
# result:  all cells in both rows and columns are matched,
#          i.e. each cell only needs to match one condition
#          (row number = 2) OR (column number = 2 or 3)
cells <- pt$getCells(rowNumbers=2, columnNumbers=2:3, matchMode="simple")
pt$setStyling(cells=cells, declarations=list("background-color"="#FFCC66"))
# "combinations" match mode on rowNumbers=4:5 and columnNumbers=5:7
# result:  does not match entire rows nor entire columns,
#          i.e. each cell must match both conditions
#          (row number = 4 or 5) AND (column number = 5, 6 or 7)
cells <- pt$getCells(rowNumbers=4:5, columnNumbers=5:7, matchMode="combinations")
pt$setStyling(cells=cells, declarations=list("background-color"="#FFCC66"))
pt$renderPivot()

Getting cells from row/column data groups

The groups, rowGroups and columnGroups arguments provide a simple way to retrieve the cells related to one or more data groups.

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$addRowDataGroups("Status")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()

rgrps <- pt$leafRowGroups
cgrps <- pt$leafColumnGroups

# get the cells under the first two column data groups
cells <- pt$getCells(groups=cgrps[1:2])
pt$setStyling(cells=cells, declarations=list("background-color"="yellow"))

# get the cells related to the data groups in rows 2 and 4
cells <- pt$getCells(groups=list(rgrps[[2]], rgrps[[4]]))
pt$setStyling(cells=cells, declarations=list("background-color"="lightblue"))

# get the cells related to the data groups in columns 4 and 5
cells <- pt$getCells(columnGroups=cgrps[4:5])
pt$setStyling(cells=cells, declarations=list("background-color"="lightgreen"))

# get the cells related to the data groups in rows 6 and 8 
cells <- pt$getCells(rowGroups=list(rgrps[[6]], rgrps[[8]]))
pt$setStyling(cells=cells, declarations=list("background-color"="pink"))

# get the cells related to the data groups in rows 11, 13 and 15, columns 7 and 8 
cells <- pt$getCells(rowGroups=list(rgrps[[11]], rgrps[[13]], rgrps[[15]]),
                     columnGroups=list(cgrps[[7]], cgrps[[8]]), 
                     matchMode="combinations")
pt$setStyling(cells=cells, declarations=list("background-color"="orange"))

pt$renderPivot()

Finding Cells

The findCells() function is used to search for cells within the body of the pivot table matching one or more criteria. The function returns a list of cell objects. This function has the following parameters:

If multiple variable names and values are specified, then findCells() searches for cells that match all of the criteria - i.e. the equivalent of the combinations match method described above.

In addition, the following parameters can also be used with pt$findCells():

Broadly speaking, these additional parameters function identically to the parameters of the same name in the pt$getCells() method.

The following parameters control how the cells matching the criteria specified above are returned:

Note that lowN and highN apply to all cell values in the pivot table, e.g. including totals. Use the other arguments described above to exclude the types of cell you are not interested in, e.g. specifying totals="exclude" to stop totals being included.

Several examples of the above are given below.

Finding cells from headings

From heading variable name(s) (variableNames)

Finding cells that reference the "PowerType" variable:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(variableNames="PowerType")
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

All of the cells above reference the "PowerType" variable. For the findCells() function, the variableNames argument is only really used when a pivot table is constructed that has a custom layout.

From heading variable value(s) (variableValues)

Finding cells that reference the "DMU" and "HST" values for the "PowerType" variable:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(variableValues=list("PowerType"=c("DMU", "HST")))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

Finding cells that reference the "DMU" and "HST" values for the "PowerType" variable and reference the "London Midland" value for the "TOC" variable:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(variableValues=list("PowerType"=c("DMU", "HST"), "TOC"="London Midland"))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

totals

Finding only totals cells that reference the "PowerType" variable:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(variableNames="PowerType", totals="only")
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

In the example, probably more total cells have been matched than expected.

To explicitly match only the total columns for the "PowerType" variable, specify two asterixes as the variable value:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(variableValues=list("PowerType"="**"))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

To explicitly match only the sub-total columns for the "PowerType" variable (i.e. excluding the far right TrainCategory total column), use the following:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(variableValues=list("TrainCategory"="!*", "PowerType"="**"))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

To find the grand total cell:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(variableValues=list("TrainCategory"="**", "PowerType"="**", "TOC"="**"))
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

Finding cells by cell value (aka. Conditional Formatting)

The findCells() and getCells() functions can be used to help conditionally format the cells of a pivot table based on the cell values (i.e. calculation values).

Basic conditional formatting

For example, to highlight in red those cells in the basic example pivot table that have a value between 30000 and 50000:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(minValue=30000, maxValue=50000, includeNull=FALSE, includeNA=FALSE)
pt$setStyling(cells=cells, declarations=list("background-color"="#FFC7CE", "color"="#9C0006"))
pt$renderPivot()

Another example: analysing the average arrival delay (in minutes) in the morning peak time, for the top 20 origin stations, broken down by the hour of the day (i.e. 5am, 6am, 7am, etc.). The data for these examples is derived as follows:

# calculate arrival delay information
library(dplyr)
library(lubridate)

stations <- mutate(trainstations, CrsCodeChr=as.character(CrsCode))

topOrigins <- bhmtrains %>%
  mutate(OriginChr=as.character(Origin)) %>%
  filter(Origin != "BHM") %>%
  group_by(OriginChr) %>%
  summarise(TotalTrains = n()) %>%
  ungroup() %>%
  top_n(20, TotalTrains)

trains <- bhmtrains %>%
  mutate(OriginChr=as.character(Origin), DestinationChr=as.character(Destination)) %>%
  inner_join(topOrigins, by=c("OriginChr"="OriginChr")) %>%
  inner_join(stations, by=c("OriginChr"="CrsCodeChr")) %>%
  inner_join(stations, by=c("DestinationChr"="CrsCodeChr")) %>%
  select(TOC, TrainCategory, PowerType, Origin=StationName.x, 
         GbttArrival, ActualArrival, GbttDeparture, ActualDeparture) %>%
  mutate(GbttDateTime=if_else(is.na(GbttArrival), GbttDeparture, GbttArrival),
         GbttHourOfDay=hour(GbttDateTime),
         ArrivalDeltaMins=difftime(ActualArrival, GbttArrival, units="mins"),
         ArrivalDelayMins=ifelse(ArrivalDeltaMins<0, 0, ArrivalDeltaMins)) %>%
  filter(GbttHourOfDay %in% c(5, 6, 7, 8, 9, 10)) %>%
  select(TOC, TrainCategory, PowerType, Origin, GbttHourOfDay, ArrivalDelayMins)

The first few rows in the trains data frame look like:

renderBasicTable(as.matrix(trains[1:8,]), columnNamesAsHeader=TRUE, 
                 columnAlignment=c("left", "left", "left", "left", "right", "right"))

Assume we wish to format the average arrival delay as follows:

There are two approaches we can take - either use mapStyling(), or use findCells() and setStyling(). mapStyling() is described in the Styling vignette.

Using mapStyling() with mapType="range" is the more concise approach:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(trains)
pt$addColumnDataGroups("GbttHourOfDay")
pt$addRowDataGroups("Origin")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()", visible=FALSE)
pt$defineCalculation(calculationName="TotalDelayMins", 
                     summariseExpression="sum(ArrivalDelayMins, na.rm=TRUE)", visible=FALSE)
pt$defineCalculation(calculationName="AvgDelayMins", type="calculation", 
                     basedOn=c("TotalDelayMins", "TotalTrains"),
                     calculationExpression="values$TotalDelayMins/values$TotalTrains",
                     format="%.1f")
pt$evaluatePivot()

# apply the background-color styling
pt$mapStyling(cells=pt$allCells, styleProperty="background-color", 
              valueType="color", mapType="range",
              mappings=list(0, "#C6EFCE", 2, "#FFEB9C", 4, "#FFC7CE"))

# apply the color styling
pt$mapStyling(cells=pt$allCells, styleProperty="color", 
              valueType="color", mapType="range",
              mappings=list(0, "#006100", 2, "#9C5700", 4, "#9C0006"))

pt$renderPivot()

Alternatively, using findCells() and setStyling():

library(pivottabler)
pt <- PivotTable$new()
pt$addData(trains)
pt$addColumnDataGroups("GbttHourOfDay")
pt$addRowDataGroups("Origin")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()", visible=FALSE)
pt$defineCalculation(calculationName="TotalDelayMins", 
                     summariseExpression="sum(ArrivalDelayMins, na.rm=TRUE)", visible=FALSE)
pt$defineCalculation(calculationName="AvgDelayMins", type="calculation", 
                     basedOn=c("TotalDelayMins", "TotalTrains"),
                     calculationExpression="values$TotalDelayMins/values$TotalTrains",
                     format="%.1f")
pt$evaluatePivot()

# apply the green style for an average arrival delay of between 0 and 2 minutes
cells <- pt$findCells(minValue=0, maxValue=2, includeNull=FALSE, includeNA=FALSE)
pt$setStyling(cells=cells, declarations=list("background-color"="#C6EFCE", "color"="#006100"))

# apply the yellow style for an average arrival delay of between 2 and 4 minutes
cells <- pt$findCells(minValue=2, maxValue=4, includeNull=FALSE, includeNA=FALSE)
pt$setStyling(cells=cells, declarations=list("background-color"="#FFEB9C", "color"="#9C5700"))

# apply the red style for an average arrival delay of 4 minutes or greater
cells <- pt$findCells(minValue=4, includeNull=FALSE, includeNA=FALSE)
pt$setStyling(cells=cells, declarations=list("background-color"="#FFC7CE", "color"="#9C0006"))

pt$renderPivot()

Conditional formatting using colour gradients

The examples above uses three sets of colours. It is also possible to apply these colours as a continuous colour gradient, e.g. based on the example above an average arrival delay of 3 minutes would be formatted using a mixed yellow-green colour. Again, this can be done following the same two approaches.

Using mapStyling() with mapType="continuous" is much more concise:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(trains)
pt$addColumnDataGroups("GbttHourOfDay")
pt$addRowDataGroups("Origin")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()", visible=FALSE)
pt$defineCalculation(calculationName="TotalDelayMins", 
                     summariseExpression="sum(ArrivalDelayMins, na.rm=TRUE)", visible=FALSE)
pt$defineCalculation(calculationName="AvgDelayMins", type="calculation", 
                     basedOn=c("TotalDelayMins", "TotalTrains"),
                     calculationExpression="values$TotalDelayMins/values$TotalTrains",
                     format="%.1f")
pt$evaluatePivot()

# apply the background-color styling
pt$mapStyling(cells=pt$allCells, styleProperty="background-color", 
              valueType="color", mapType="continuous",
              mappings=list(0, "#C6EFCE", 2, "#FFEB9C", 4, "#FFC7CE"))

# apply the color styling
pt$mapStyling(cells=pt$allCells, styleProperty="color", 
              valueType="color", mapType="continuous",
              mappings=list(0, "#006100", 2, "#9C5700", 4, "#9C0006"))

pt$renderPivot()

Using findCells() and setStyling() requires some additional helper functions to be defined, to assist with calculating a colour in a colour gradient:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(trains)
pt$addColumnDataGroups("GbttHourOfDay")
pt$addRowDataGroups("Origin")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()", visible=FALSE)
pt$defineCalculation(calculationName="TotalDelayMins", 
                     summariseExpression="sum(ArrivalDelayMins, na.rm=TRUE)", visible=FALSE)
pt$defineCalculation(calculationName="AvgDelayMins", type="calculation", 
                     basedOn=c("TotalDelayMins", "TotalTrains"),
                     calculationExpression="values$TotalDelayMins/values$TotalTrains",
                     format="%.1f")
pt$evaluatePivot()

# colour gradient helper functions
scaleNumber <- function(n1, n2, vMin, vMax, value) {
  if(n1==n2) return(n1)
  v <- value
  if(v < vMin) v <- vMin
  if(v > vMax) v <- vMax
  if(n1<n2) {
    return(n1+((v-vMin)/(vMax-vMin)*(n2-n1)))
  }
  else {
    return(n1-((v-vMin)/(vMax-vMin)*(n1-n2)))
  }
}
scale2Colours <- function(clr1, clr2, vMin, vMax, value) {
  r <- round(scaleNumber(clr1$r, clr2$r, vMin, vMax, value))
  g <- round(scaleNumber(clr1$g, clr2$g, vMin, vMax, value))
  b <- round(scaleNumber(clr1$b, clr2$b, vMin, vMax, value))
  return(paste0("#",
                format(as.hexmode(r), width=2), 
                format(as.hexmode(g), width=2), 
                format(as.hexmode(b), width=2)))
}
scale3Colours <- function(clr1, clr2, clr3, vMin, vMid, vMax, value) {
  if(value <= vMid) return(scale2Colours(clr1, clr2, vMin, vMid, value))
  else return(scale2Colours(clr2, clr3, vMid, vMax, value))
}
hexToClr <- function(hexclr) {
  clr <- list()
  clr$r <- strtoi(paste0("0x", substr(hexclr, 2, 3)))
  clr$g <- strtoi(paste0("0x", substr(hexclr, 4, 5)))
  clr$b <- strtoi(paste0("0x", substr(hexclr, 6, 7)))
  return(clr)
}

# colour constants
textClrGreen <- hexToClr("#006100")
textClrYellow <- hexToClr("#9C5700")
textClrRed <- hexToClr("#9C0006")
backClrGreen <- hexToClr("#C6EFCE")
backClrYellow <- hexToClr("#FFEB9C")
backClrRed <- hexToClr("#FFC7CE")

# specify some conditional formatting, calculating the appropriate text colour 
# and back colour for each cell.
cells <- pt$findCells(includeNull=FALSE, includeNA=FALSE)
formatCell <- function(cell) {
  value <- cell$rawValue
  textClr <- scale3Colours(textClrGreen, textClrYellow, textClrRed, 0.5, 2, 4, value)
  backClr <- scale3Colours(backClrGreen, backClrYellow, backClrRed, 0.5, 2, 4, value)
  pt$setStyling(cells=cell, declarations=list("background-color"=backClr, "color"=textClr))
}
invisible(lapply(cells, formatCell))
pt$renderPivot()

Finding the minimum/maximum values or top/bottom N values

The lowN and highN arguments allow the cells containing the minimum/maximum values or the top/bottom N values to be easily found.

These arguments generally need to be combined with other arguments to find the cells you are interested in as shown in the examples below.

Finding the cell containing the maximum value

By default, the totals are included, so just specifying highN=1 will probably locate the grand total cell:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(highN=1)
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

To exclude the grand total cell, specify totals="exclude":

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(highN=1, totals="exclude")
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

The variable names and values for the cell containing the maximum value can be retrieved as a list, e.g. using the example above:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(highN=1, totals="exclude")
print(cells[[1]]$rowColFilters$filteredValues)

Finding the top 3 cells

Follow the same method as above but specify highN=3:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(highN=3, totals="exclude")
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

Finding the cell containing the maximum value for a specific data group

To highlight the maximum value for a specific data group, then specify the group value when calling pt$findCells(), e.g. to find the maximum value for "Express Passenger":

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(highN=1, 
                      variableValues=list("TrainCategory"="Express Passenger"),
                      totals="exclude")
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

Combinations of values can be specified to select exactly where in the pivot table to find the maximum value, e.g. to find the maximum value for the "Ordinary Passenger" and "DMU" combination:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
cells <- pt$findCells(highN=1, 
                      variableValues=list("TrainCategory"="Ordinary Passenger",
                                          "PowerType"="DMU"),
                      totals="exclude")
pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
pt$renderPivot()

Finding the cell containing the maximum value in each column

To highlight the maximum value for each column, call pt$findCells() specifying the column number:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()
for(c in 1:pt$columnCount) {
  cells <- pt$findCells(highN=1, columnNumbers=c, totals="exclude")
  pt$setStyling(cells=cells, declarations=list("background-color"="#00FF00"))
}
pt$renderPivot()

Finding cells - additional criteria

The example below illustrates using additional criteria (row/column numbers and row/column data groups) to restrict the cells that are searched:

library(pivottabler)
pt <- PivotTable$new()
pt$addData(bhmtrains)
pt$addColumnDataGroups("TrainCategory")
pt$addColumnDataGroups("PowerType")
pt$addRowDataGroups("TOC")
pt$addRowDataGroups("Status")
pt$defineCalculation(calculationName="TotalTrains", summariseExpression="n()")
pt$evaluatePivot()

# get the data groups on each axis
rgrps <- pt$leafRowGroups
cgrps <- pt$leafColumnGroups

# search the first two columns for cells that aren't empty
cells <- pt$findCells(groups=cgrps[1:2], emptyCells="exclude")
pt$setStyling(cells=cells, declarations=list("background-color"="yellow"))

# search the sixth and eighth rows for cells with values greater than 10
cells <- pt$findCells(groups=list(rgrps[[6]], rgrps[[8]]), minValue=10)
pt$setStyling(cells=cells, declarations=list("background-color"="lightgreen"))

# search the cells in rows 2 to 4 in columns 7 to 8 to find non-empty cells
cells <- pt$findCells(rowNumbers=2:4, columnGroups=cgrps[7:8],
                      emptyCells="exclude",
                      rowColumnMatchMode="combinations")
pt$setStyling(cells=cells, declarations=list("background-color"="pink"))

# search the cells in rows 11 to 17 in columns 4, 5 and 7
# highlight non-empty cells in this range with value 10000 or greater
rgrps <- rgrps[11:17]
cgrps <- list(cgrps[[4]], cgrps[[5]], cgrps[[7]])
cells <- pt$findCells(rowGroups=rgrps, columnGroups=cgrps,
                      rowColumnMatchMode="combinations")
pt$setStyling(cells=cells, declarations=list("background-color"="blanchedalmond"))

cells <- pt$findCells(rowGroups=rgrps, columnGroups=cgrps,
                     rowColumnMatchMode="combinations",
                     minValue=10000, emptyCells="exclude")
pt$setStyling(cells=cells, declarations=list("background-color"="orange"))

# render the pivot
pt$renderPivot()

Summary

Further Reading

The full set of vignettes is:

  1. Introduction
  2. Data Groups
  3. Calculations
  4. Regular Layout
  5. Outputs
  6. Latex Output
  7. Styling
  8. Finding and Formatting
  9. Cell Context
  10. Navigating a Pivot Table
  11. Irregular Layout
  12. Performance
  13. Excel Export
  14. Shiny
  15. Appendix: Details
  16. Appendix: Calculations
  17. Appendix: Class Overview


cbailiss/pivottabler documentation built on Oct. 14, 2023, 9:38 a.m.