---
title: "evalwaterfallr"
author: "JC"
date: "2017-02-22"
output:
  rmarkdown::html_vignette:
    keep_md: true
vignette: >
  %\VignetteIndexEntry{evalwaterfallr}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
This document describes how to create order-independent permutations and waterfall graphics using evalwaterfallr. There are four functions in the package. Three of them are intended to be used directly: `waterfallPrep()`, `addwaterfallPrep()`, and `waterfallPlot()`. The `wParamPermute()` function is called by `waterfallPrep()` and could be used on its own, but that scenario is not expected to occur often.
Permutations of multiplicative parameters can be completed with straightforward maths, but they are error prone and time consuming in spreadsheet applications. The time and risk increase with the number of parameters: a three-parameter equation requires six permutations, but a four-parameter equation requires 24. This package automates the permutation and creates tables that can be used to create the waterfall plot (within the package) or exported for other uses. In addition, it extends the methods to additive parameters to ensure consistent reporting for additive and multiplicative parameters.
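The growth in the number of orderings is simply the factorial of the parameter count. The following base R sketch tabulates it for 3 to 10 parameters (10 being the stated upper limit of `waterfallPrep()`); the object name `perm_counts` is made up for illustration.

```r
# Number of orderings for n parameters is n!
n_params <- 3:10
perm_counts <- data.frame(
  parameters   = n_params,
  permutations = factorial(n_params)
)
perm_counts
```

Three parameters give 6 orderings and four give 24, as stated above; ten parameters give 3,628,800.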
For motivation and relevant readings, please see the package Readme.
`waterfallPrep()` takes the multiplicative parameters (up to 10) as a dataframe, along with the key values of Gross Reported, NTG Reported, and NTG Evaluated, where NTG stands for net-to-gross. Note: these terms are from the field of energy efficiency evaluation, but could be extended to other applications. Here are the conceptualizations of the key values:

- Gross Reported (`gross.report`): The gross ex ante savings. The function defaults to 100.
- NTG Reported (`NTG.report`): The fraction of the ex ante gross savings that is predicted to occur due to program influence. The function defaults to 1.
- NTG Evaluated (`NTG.eval`): The fraction of the gross savings that are found by evaluation to occur due to program influence. The function defaults to 1, or no assumed losses or gains.

All permutations of the order of multiplicative adjustments are determined by calling `wParamPermute()` inside of `waterfallPrep()`. It is called separately for the net and hybrid permutations. For the net permutation, the multiplicative parameters and the NTG realization rate (the ratio `NTG.eval/NTG.report`) are permuted. For the hybrid permutation, the Reported NTG, the multiplicative parameters, and the NTG realization rate are permuted.
The data included in the package, `rawparamdf`, has four impact parameters for an imaginary lighting program. They are named "ISR", "deltaWatts", "HOU", and "IE" to abbreviate In Service Rate, Delta Watts, Hours of Use, and Interactive Effects, respectively. These names are used throughout the code, but they could be any character string.
```r
rawparamdf <- data.frame( # lighting example
  params = c("ISR","deltaWatts","HOU","IE"),
  value = c(0.5, 0.7, 1.2, 1.5),
  stringsAsFactors = FALSE
)
rawparamdf
#>       params value
#> 1        ISR   0.5
#> 2 deltaWatts   0.7
#> 3        HOU   1.2
#> 4         IE   1.5
```
By default, `waterfallPrep()` only requires the parameter table; it uses the defaults for the key values and assumes all output tables are desired. The defaults are `gross.report = 100`, `NTG.report = 1`, and `NTG.eval = 1`.
```r
waterfallPrep(rawparamdf) # minimal call
#> $`No Permutatation`
#>        variable given total base increase decrease
#> 1 Ex Ante Gross 100.0   100   NA       NA       NA
#> 2           ISR   0.5    NA   50        0       50
#> 3    deltaWatts   0.7    NA   35        0       15
#> 4           HOU   1.2    NA   35        7        0
#> 5            IE   1.5    NA   42       21        0
#> 6 Ex Post Gross    NA    63   NA       NA       NA
#> 7   Ex Post NTG   1.0    NA   63        0        0
#> 8   Ex Post Net    NA    63   NA       NA       NA
#>
#> $`Gross Waterfall`
#>        variable given total     base increase decrease
#> 1 Ex Ante Gross 100.0   100       NA       NA       NA
#> 2           ISR   0.5    NA 42.20833  0.00000 57.79167
#> 3    deltaWatts   0.7    NA 12.08333  0.00000 30.12500
#> 4           HOU   1.2    NA 12.08333 15.70833  0.00000
#> 5            IE   1.5    NA 27.79167 35.20833  0.00000
#> 6 Ex Post Gross    NA    63       NA       NA       NA
#> 7   Ex Post NTG   1.0    NA 63.00000  0.00000  0.00000
#> 8   Ex Post Net    NA    63       NA       NA       NA
#>
#> $`Net Waterfall`
#>        variable given total      base increase decrease
#> 1 Ex Ante Gross 100.0   100        NA       NA       NA
#> 2   Ex Ante NTG   1.0    NA 100.00000  0.00000  0.00000
#> 3   Ex Ante Net    NA   100        NA       NA       NA
#> 4           ISR   0.5    NA  42.20833  0.00000 57.79167
#> 5    deltaWatts   0.7    NA  12.08333  0.00000 30.12500
#> 6           HOU   1.2    NA  12.08333 15.70833  0.00000
#> 7            IE   1.5    NA  27.79167 35.20833  0.00000
#> 8        RR NTG   1.0    NA  63.00000  0.00000  0.00000
#> 9   Ex Post Net    NA    63        NA       NA       NA
#>
#> $`Hybrid Waterfall`
#>        variable given total      base increase decrease       calc
#> 1 Ex Ante Gross 100.0   100        NA       NA       NA         NA
#> 2   Ex Ante NTG   1.0    NA 100.00000  0.00000  0.00000  0.0000000
#> 3           ISR   0.5    NA  42.20833  0.00000 57.79167 -0.5779167
#> 4    deltaWatts   0.7    NA  12.08333  0.00000 30.12500 -0.3012500
#> 5           HOU   1.2    NA  12.08333 15.70833  0.00000  0.1570833
#> 6            IE   1.5    NA  27.79167 35.20833  0.00000  0.3520833
#> 7        RR NTG   1.0    NA  63.00000  0.00000  0.00000  0.0000000
#> 8   Ex Post Net    NA    63        NA       NA       NA         NA
# this is equivalent to
# not run
# waterfallPrep(rawparamdf,
#               gross.report = 100, NTG.report = 1, NTG.eval = 1, # defaults
#               altparamnames = NULL, # default
#               output = "all") # default
```
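To see why the order of adjustment matters, the "No Permutation" table can be reproduced with a running product in base R. This is a minimal sketch rather than the package's own code; the names `level_after`, `steps`, and their reversed counterparts are made up for illustration.

```r
# Parameter values from rawparamdf, applied in the order given
values <- c(ISR = 0.5, deltaWatts = 0.7, HOU = 1.2, IE = 1.5)
gross_report <- 100

# Level reached after each adjustment, then the change credited to each step
level_after <- gross_report * cumprod(values)
steps <- diff(c(gross_report, level_after))
steps        # about -50, -15, 7, 21, as in the "No Permutation" table

# Applying the same parameters in reverse order ends at the same total (63),
# but credits each parameter with a different change
rev_level_after <- gross_report * cumprod(rev(values))
rev_steps <- diff(c(gross_report, rev_level_after))
rev_steps    # about 50, 30, -54, -63 for IE, HOU, deltaWatts, ISR
```

In the given order IE is credited with an increase of 21; in the reverse order it is credited with 50. Averaging over every ordering, as the permuted tables do, removes this dependence on the reporting order.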
Alternatively, we can store one of the output tables, here: "gross".
```r
gross_tab <- waterfallPrep(rawparamdf, output="gross")
gross_tab
#>        variable given total     base increase decrease
#> 1 Ex Ante Gross 100.0   100       NA       NA       NA
#> 2           ISR   0.5    NA 42.20833  0.00000 57.79167
#> 3    deltaWatts   0.7    NA 12.08333  0.00000 30.12500
#> 4           HOU   1.2    NA 12.08333 15.70833  0.00000
#> 5            IE   1.5    NA 27.79167 35.20833  0.00000
#> 6 Ex Post Gross    NA    63       NA       NA       NA
#> 7   Ex Post NTG   1.0    NA 63.00000  0.00000  0.00000
#> 8   Ex Post Net    NA    63       NA       NA       NA
```
Usually, Gross Reported and NTG values are known, so they should be adjusted from the defaults. The parameter names can also be changed.
Here, let's assume that the Gross Reported (`gross.report`) value is 200 and the NTG fractions (`NTG.report` and `NTG.eval`) are 0.8 and 0.6, respectively. We want to rename our parameters and only get the net permutation table. Note that `altparamnames` must have the same number of elements as `df` has rows (one name per parameter). The function will stop with the message "alternate parameter name vector length not the same as parameter length" if they do not match.
```r
# assume gross.report is 200, NTG.report = 0.8, and NTG.eval=0.6
net_tab <- waterfallPrep(rawparamdf, 200, .8, .6,
                         altparamnames=c("Installation\nRates","delta\nWatts",
                                         "Hours\nof Use","Interactive\nEffects"),
                         output="net")
net_tab
#>               variable given total      base increase decrease
#> 1        Ex Ante Gross 200.0 200.0        NA       NA       NA
#> 2          Ex Ante NTG   0.8    NA 160.00000  0.00000 40.00000
#> 3          Ex Ante Net    NA 160.0        NA       NA       NA
#> 4  Installation\nRates   0.5    NA  79.53000  0.00000 80.47000
#> 5         delta\nWatts   0.7    NA  37.26000  0.00000 42.27000
#> 6        Hours\nof Use   1.2    NA  37.26000 22.31333  0.00000
#> 7 Interactive\nEffects   1.5    NA  59.57333 50.26333  0.00000
#> 8               RR NTG   0.6    NA  75.60000  0.00000 34.23667
#> 9          Ex Post Net    NA  75.6        NA       NA       NA
# this table can be exported for use in other programs or passed to the
# waterfallPlot() function directly, as shown below
# not run
# write.csv(net_tab, file="/path/to/dir/net_tab.csv")
```
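The totals in the net table can be checked by hand. The base R sketch below recomputes the Ex Ante Net and Ex Post Net rows from the inputs, using the NTG realization rate defined earlier as `NTG.eval/NTG.report`. The variable names are illustrative and are not part of the package.

```r
# Inputs used in the call above
gross_report <- 200
ntg_report   <- 0.8
ntg_eval     <- 0.6
params <- c(ISR = 0.5, deltaWatts = 0.7, HOU = 1.2, IE = 1.5)

# Ex ante net savings: reported gross times reported NTG
ex_ante_net <- gross_report * ntg_report

# NTG realization rate: evaluated NTG relative to reported NTG
rr_ntg <- ntg_eval / ntg_report

# Ex post net savings: ex ante net, adjusted by every parameter and by rr_ntg
ex_post_net <- ex_ante_net * prod(params) * rr_ntg

c(ex_ante_net = ex_ante_net, rr_ntg = rr_ntg, ex_post_net = ex_post_net)
```

This gives 160 for Ex Ante Net, 0.75 for the realization rate, and 75.6 for Ex Post Net, matching the `total` column. The permutation only changes how the difference between 160 and 75.6 is split among the individual rows.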
The output of `waterfallPrep()` can be used directly by `waterfallPlot()` as shown below, or used in another software product, like Excel, to create desired graphs.
`addwaterfallPrep()` creates the permutations for additive, rather than multiplicative, parameters. Its output can be used directly by `waterfallPlot()` as shown below, or used in another software product, like Excel, to create desired graphs. The additive parameters have no permutation for the gross table, but are permuted with the NTG realization rate for the net table and with the NTG reported and the NTG realization rate for the hybrid table.
```r
# assume gross.report is 100, NTG.report = 0.8, and NTG.eval=0.6
# assume our parameters are:
addrawparamdf <- data.frame( # excel example
  params = c("A","B","C"),
  value = c(-30, 20, -40),
  stringsAsFactors = FALSE
)
add_tab <- addwaterfallPrep(addrawparamdf, 100, .8, .6,
                            output="all")
```
We can see the gross, net, and hybrid permutation tables:
```r
add_tab[[2]] # gross
#>        variable given total base increase decrease
#> 1 Ex Ante Gross 100.0   100   NA       NA       NA
#> 2             A -30.0    NA   70        0       30
#> 3             B  20.0    NA   70       20        0
#> 4             C -40.0    NA   50        0       40
#> 5 Ex Post Gross    NA    50   NA       NA       NA
#> 6   Ex Post NTG   0.6    NA   30        0       20
#> 7   Ex Post Net    NA    30   NA       NA       NA
add_tab[[3]] # net
#>        variable given total base increase decrease
#> 1 Ex Ante Gross 100.0   100   NA       NA       NA
#> 2   Ex Ante NTG   0.8    NA   80        0       20
#> 3   Ex Ante Net    NA    80   NA       NA       NA
#> 4             A -30.0    NA   59        0       21
#> 5             B  20.0    NA   59       14        0
#> 6             C -40.0    NA   45        0       28
#> 7        RR NTG   0.6    NA   30        0       15
#> 8   Ex Post Net    NA    30   NA       NA       NA
add_tab[[4]] # hybrid
#>        variable given total     base increase decrease
#> 1 Ex Ante Gross 100.0   100       NA       NA       NA
#> 2   Ex Ante NTG   0.8    NA 86.66667  0.00000 13.33333
#> 4             A -30.0    NA 62.91667  0.00000 23.75000
#> 5             B  20.0    NA 62.91667 15.83333  0.00000
#> 6             C -40.0    NA 47.08333  0.00000 31.66667
#> 7        RR NTG   0.6    NA 30.00000  0.00000 17.08333
#> 8   Ex Post Net    NA    30       NA       NA       NA
```
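The step sizes in the additive net table are consistent with a simple reading of the permutation: an additive parameter is scaled by the reported NTG when it is applied before the NTG adjustment and by the evaluated NTG when it is applied after, and the two cases are equally likely. The base R sketch below checks that reading against the table. Whether `addwaterfallPrep()` computes the values this way internally is an assumption, and all names are illustrative.

```r
add_params   <- c(A = -30, B = 20, C = -40)
gross_report <- 100
ntg_report   <- 0.8
ntg_eval     <- 0.6

ex_ante_net <- gross_report * ntg_report                    # 80
ex_post_net <- (gross_report + sum(add_params)) * ntg_eval  # 30

# Additive steps: scaled by ntg_report before the NTG adjustment and by
# ntg_eval after it, so on average by the mean of the two (0.7)
net_steps <- add_params * mean(c(ntg_report, ntg_eval))

# NTG step: acts on the gross level reached so far; averaged over all
# orderings, half of the additive total has been applied at that point
rr_step <- (ntg_eval - ntg_report) * (gross_report + sum(add_params) / 2)

net_steps
rr_step
ex_ante_net + sum(net_steps) + rr_step   # should equal ex_post_net
```

The steps come out at about -21, 14, and -28 for A, B, and C, and -15 for the NTG realization rate, which matches the `increase` and `decrease` columns of the net table.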
`waterfallPlot()` is the waterfall plotting function. It is inspired by the code James Kierstead developed in his post on waterfall plots with UK emissions data. `waterfallPlot()` assumes the input dataframe is already in order and creates fill categories based on the values in the table. Intermediate totals are plotted here, which is not possible with James' `waterfall()`. For data that are more categorical or do not require intermediate totals, check out his very useful waterfall gist.
We can see the multiplicative net permutation table from the example above.

```r
waterfallPlot(net_tab)
```

What about all of the additive tables?

Gross:

```r
waterfallPlot(add_tab[[2]])
```

Net:

```r
waterfallPlot(add_tab[[3]])
```

Hybrid:

```r
waterfallPlot(add_tab[[4]])
```
Alternatively, we can send a table that we may have from outside this package.
```r
library(ggplot2)
library(scales)
library(dplyr)
# let's assume that we have rrdf and just want to plot it:
rrdf <- data.frame( # made up example
  variable = c("Start","Factor 1","Factor 2","Factor 3","End"),
  total = c(100, rep(NA, 3), 75),
  base = c(NA, 75, 50, 50, NA),
  increase = c(NA, 0, 0, 25, NA),
  decrease = c(NA, 25, 25, 0, NA))
waterfallPlot(rrdf)
```
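When a table comes from outside the package, it can help to confirm that its columns are mutually consistent before plotting. The following base R sketch is a hypothetical check, not something `waterfallPlot()` is known to do. It assumes a table with one opening total, one closing total, and no intermediate totals, laid out like `rrdf`.

```r
rrdf <- data.frame(
  variable = c("Start", "Factor 1", "Factor 2", "Factor 3", "End"),
  total    = c(100, rep(NA, 3), 75),
  base     = c(NA, 75, 50, 50, NA),
  increase = c(NA, 0, 0, 25, NA),
  decrease = c(NA, 25, 25, 0, NA)
)

# Rows without a total are the adjustment steps
step_rows <- which(is.na(rrdf$total))
change <- rrdf$increase[step_rows] - rrdf$decrease[step_rows]

# Running level before and after each step
level_after  <- rrdf$total[1] + cumsum(change)
level_before <- c(rrdf$total[1], head(level_after, -1))

# Each floating bar should rest on the lower of the two levels it connects
base_ok <- isTRUE(all.equal(rrdf$base[step_rows],
                            pmin(level_before, level_after)))

# The last level should equal the closing total
total_ok <- isTRUE(all.equal(tail(level_after, 1), rrdf$total[nrow(rrdf)]))

c(base_ok = base_ok, total_ok = total_ok)
```

Both checks pass for `rrdf`: the running level goes 100, 75, 50, 75, and the `base` column holds the lower edge of each bar.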
It is straightforward to change the colors and the labels. However, if `xfactors` is supplied, it must have the same number of elements as the supplied data frame has rows. If it does not, the default values will be used, but the plot will still render.
```r
library(dplyr)
# With another color palette. Note that totals stay grey.
```
When specific fonts or other design issues are preferred, it may be best to modify the `waterfallPlot()` function to meet those needs rather than try to append ggplot2 calls.
`wParamPermute()` does the underlying permutation. It is called by `waterfallPrep()` to permute the factors passed in `df`. It returns a dataframe with the same parameter names and the average permuted value.
For the call to `waterfallPrep()` above, this is what was sent to and returned within the function. Note: had `altparamnames` been defined, it would have been sent rather than `rawparamdf[,1]`.
```r
wParamPermute(rawparamdf[,1],rawparamdf[,2])
#> Warning: doParallel may make this function faster for large order
#> permutations if it is installed.
#>   param.names     avg.xx
#> 1         ISR -0.5779167
#> 2  deltaWatts   -0.30125
#> 3         HOU  0.1570833
#> 4          IE  0.3520833
```
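The averages above can be reproduced in base R by walking through every ordering of the multipliers and crediting each one with the change it causes at its position. The sketch below is an independent illustration of that averaging idea, not the source of `wParamPermute()`; the helpers `all_orderings()` and `avg_step()` are hypothetical names.

```r
# Hypothetical helper: every ordering of a vector, built recursively
all_orderings <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    for (rest in all_orderings(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], rest)
    }
  }
  out
}

# Hypothetical helper: average change credited to each multiplier,
# as a fraction of the starting value, across all orderings
avg_step <- function(values) {
  orderings <- all_orderings(seq_along(values))
  credit <- numeric(length(values))
  for (ord in orderings) {
    running <- 1  # product of the multipliers applied so far
    for (idx in ord) {
      credit[idx] <- credit[idx] + running * (values[idx] - 1)
      running <- running * values[idx]
    }
  }
  setNames(credit / length(orderings), names(values))
}

values <- c(ISR = 0.5, deltaWatts = 0.7, HOU = 1.2, IE = 1.5)
avg <- avg_step(values)
round(avg, 7)
sum(avg) + 1  # recovers prod(values), 0.63 here, up to floating-point error
```

Rounded to seven places, this yields -0.5779167, -0.30125, 0.1570833, and 0.3520833, the same values as `avg.xx`. Multiplying them by a `gross.report` of 100 gives the `increase` and `decrease` columns of the "Gross Waterfall" table shown earlier.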
The number of intermediate values generated is the factorial of the number of parameters. For 3 factors, there are 6 permutations (3!=6); for 5 factors, there are 120 (5!=120); and for 7 factors, there are 5,040 (7!=5,040).
This function can take significant time for large order permutations, even if the doParallel package is installed. A warning message is always presented that suggests doParallel will make it faster for large permutations. Looking at the system time for arbitrary calls is informative.
```r
# three
system.time(wParamPermute(c("ten","five","four"),
                          c(10,5,4)))
#> Warning: doParallel may make this function faster for large order
#> permutations if it is installed.
#>    user  system elapsed
#>   0.010   0.000   0.011
# five
system.time(wParamPermute(c("ten","five","four","six","ten"),
                          c(10,5,4,6,10)))
#>    user  system elapsed
#>   0.084   0.058   0.278
# seven
system.time(wParamPermute(c("ten","five","four","six","ten","two","six"),
                          c(10,5,4,6,10,2,6)))
#>    user  system elapsed
#>   1.732   0.020   1.756
```