library(hpcvis)

options(scipen = 3)
knitr::opts_chunk$set(
  cache   = TRUE,
  comment = NA,
  error   = FALSE,
  tidy    = FALSE
)

Introduction

In this vignette, we will introduce the facilities for visualizing hardware counter data, as generated from the pbdPAPI package [[@papipackage]]. Before proceeding, the reader is encouraged to start with the Introducing hpcvis vignette, which is distributed with the hpcvis package [[@hpcvis]].

In the examples, we will be using the papiplot() function to manage plotting. Currently, only pbdPAPI::system.cache() and pbdPAPI::cachebench() generated data is supported. Future versions will contain support for additional data.

Importing data

It is quite possible, or perhaps even likely, that the computing environment in which you wish to analyze performance counter data is very different from the one which will produce it. This is especially true for the pbdPAPI package, which must run in a Linux environment. See the discussion in the Introducting hpcvis vignette for more information.

To facilitate this process, the pbdPAPI package offers the utility papiexport(), which allows one to quickly export the profiling results. For example, say you have the object x defined as:

x <- system.cache(rnorm(1e4))
x
## L1 Cache Misses: 5669
## L2 Cache Misses: 6910
## L3 Cache Misses: 3526 

We can export this easily as follows:

papiexport(x)
## x <-
## structure(list(`L1 cache misses` = 5669, `L2 cache misses` = 6910, 
##     `L3 cache misses` = 3526), .Names = c("L1 cache misses", 
## "L2 cache misses", "L3 cache misses"), class = c("papi_output", 
## "papi_cache"), call = "rnorm(10000)")

We can even export this directly to a file if we prefer:

papiexport(x, "/path/to/myfile")

And in fact, any number of objects can be passed to papiexport(). So if you have objects x, y, and z:

# to the R console
papiexport(x, y, z)
# to a file
papiexport(x, y, z, "/path/to/myfile")

Plotting Counter Data

Before proceeding, let us load some sample data. We will use some data generated from the system.cache() function, profiling cache misses of R's random normal generator.

library(hpcvis)
file <- system.file("testdata/cache_misses.Rda", package="hpcvis")
load(file)

This loads three objects named cm_1e4, cm_5e4, and cm_1e5. These show the measured cache misses (levels 1-3) of the random normal generator for R, executed on a fourth generation Intel Core i5 processor. The subscripts denote the length of the output vector, i.e., the total number of random normal numbers generated. So cm_1e4 contains the cache misses of the execution of the command rnorm(1e4).

Basic Usage

Let us being by displaying a plot of the length-1000 vector by invoking the papiplot() function:

papiplot(cm_1e4)

If we pass multiple object, say all 3 of the aforementioned, then papiplot() will automatically handle plotting the three simultaneously on the same axis:

papiplot(cm_1e4, cm_5e4, cm_1e5)

Titles and labels

If we so desire, we can easily control what names are displayed in the plots. One example is the title of the plot. By default, a title for the plot is provided based on the input type. We can change it by passing a string, say title="My Plot". Or if we wish to have no title at all, we can pass title=NULL:

papiplot(cm_1e4, cm_5e4, cm_1e5, title=NULL)

Similarly, we can control the displayed names of the operations. So for example, we can replace the explict names (which are captured automatically) of rnorm(1e4), etc, by say "small", "medium", and "large":

opnames <- c("small", "medium", "large")
papiplot(cm_1e4, cm_5e4, cm_1e5, opnames=opnames)

Or we can remove the operation names entirely (note, this is probably a bad idea unless you are only plotting one operation):

papiplot(cm_1e4, opnames=NULL, title="Cache Misses for rnorm(1e4)")

Finally, if we so desire, we can place the numeric value of each bar at the top of the bar:

papiplot(cm_1e4, cm_5e4, cm_1e5, bar.label=TRUE)

This is perhaps not useful and not generally recommended, but some journals prefer this style, and so we include the option.

Group Options

If we prefer, we can color the bars by group by simply adding color=TRUE to our papiplot() command:

papiplot(cm_1e4, cm_5e4, cm_1e5, color=TRUE) 

Coloring bars is perhaps more useful with another option. We can change how we split the plot, either by the given operation (the default) or by the cache level:

papiplot(cm_1e4, cm_5e4, cm_1e5, color=TRUE, facet.by="level")

Here the label.angle and opnames options can be handy:

papiplot(cm_1e4, cm_5e4, cm_1e5, color=TRUE, facet.by="level", label.angle=15, opnames=opnames)

Final Note

The return of papiplot() is a ggplot2 compatible plot, and so all of the options and tools available for such plots (which are numerous) apply.

Benchmark Plotters

Later versions of pbdPAPI include some "benchmarker" utilities, which automatically handle the multiple evaluation of functions. For example, it has the function cachebench(), which will generate multiple iterations (by default 10) of each provided function, and track the cache misses as measured by system.cache().

We again by loading some example data. The data measures cache misses from the same processes as in the example above, but with 10 iterations for each operation.

file <- system.file("testdata/cachebench.Rda", package="hpcvis")
load(file)

The same function papiplot() applied to this object will produce a set of box-and-whiskers plots, displaying the variation in function execution:

papiplot(rnorm_cachebench)

As before, the optional function arguments apply:

papiplot(rnorm_cachebench, facet.by="operation", label.angle=15)

Legal

© 2016 Drew Schmidt.

Permission is granted to make and distribute verbatim copies of this vignette and its source provided the copyright notice and this permission notice are preserved on all copies.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. The findings and conclusions in this article have not been formally disseminated by the U.S. Department of Health \& Human Services nor by the U.S. Department of Energy, and should not be construed to represent any determination or policy of University, Agency, Administration and National Laboratory.

This manual may be incorrect or out-of-date. The authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

References



RBigData/hpcvis documentation built on May 8, 2019, 4:54 a.m.