knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
library(pensionviewr)
library(tidyverse)
library(kableExtra)
library(reasontheme)

pensionviewr

The goal of pensionviewr is to simplify the process of gathering and visualizing public pension plan data from the Reason pension database. This repo contains the functions of the pensionviewr package, which once installed locally, provides helpful functions for creating and exporting graphics made in ggplot in the style used by the Reason Pension Integrity Project team.

Create Token:

To use devtools you'd need to authenticate yourself by creating Personal Access Tokens (PAT):

Installing pensionviewr

pensionviewr is not on CRAN, so you will have to install it directly from Github using devtools.

If you do not have the devtools package installed, you will have to run the first line in the code below as well.

install.packages('devtools')
devtools::install_github("ReasonFoundation/pensionviewr")

Using the functions:

A basic explanation and summary here:

planList()

  1. planList(): returns a stripped down list of the pension plans in the database along with their state and the internal databse id.

Example of how it is used in a standard workflow:

pl <- planList()
pl %>% 
  head() %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

pullData()

  1. pullData(): pulls the data for a specified plan from the Reason pension database. pullData has two arguments: pullData(pl, plan_name)

  2. pl: A dataframe containing the list of plan names, states, and ids in the form produced by the planList() function.

  3. plan_name: A string enclosed in quotation marks containing a plan name as it is listed in the Reason pension database.

Example of how it is used in a standard workflow:

The next step would be to load the data for the specific plan of interest. Let's use the Vermont State Retirement System as an example. Let's first see what plans in Vermont are available:

VT <- pl %>% filter(state == 'Vermont')
VT %>% 
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

The full plan name we are interested in is there listed as "Vermont State Retirement System". We can pull the data for it now:

vtsrs_data <- pullData(pl, plan_name = "Vermont State Retirement System")
vtsrs_data %>% 
  head() %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

pullStateData()

  1. pullStateData(): pulls all state-level pension data in wide format from the Reason pension database. pullStateData has a single argument: pullStateData(fy)

  2. fy: Starting fiscal year for the data pulled from the Reason pension database.

This could be the initial step for either making state-level pension analysis or then filter pulled data for a specific state plan. Example of how it is used in a standard workflow:

state.data <- pullStateData(2001)
New_Mexico.data <- state.data %>% filter(state == 'New Mexico')
NMERB.data <- state.data %>% filter(display_name == "New Mexico Educational Retirement Board")
state.data <- state.data %>% filter(year == 2019)#filter for 2019 for display
state.data %>% 
  head() %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

pullSourceData()

  1. pullSourceData(): pulls pension data for a specific plan, along with data_source_name column in wide format from the Reason pension database. pullSourceData(pl,plan_name, fy) has 3 arguments:

  2. pl: A dataframe containing the list of plan names, states, and ids in the form produced by the planList() function.

  3. plan_name: A string enclosed in quotation marks containing a plan name as it is listed in the Reason pension database.
  4. fy: Starting fiscal year for the data pulled from the Reason pension database.

Example how this function is used:

NMERB.source <- pullSourceData(pl, "New Mexico Educational Retirement Board", 2001)
NMERB.source %>% 
  head() %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

filterData()

  1. filterData(): filters existing data (data.frame/data.table format) keeping & renaming set of commonly used variables in pension analysis. filterData(Data, fy, source = FALSE) has 4 arguments:

  2. Data: A data table already pulled with pullData, pullStateData, or other ways.

  3. fy: Starting fiscal year for the data pulled from the Reason pension database.
  4. employee: Character designating type of employees covered (e.g. "teacher", "state", "local", "state and local", "police and fire").
  5. source: Set to FALSE. It should be set to source = TRUE if you filter data pulled with pullSourceData function.

Example of the workflow around the filtering function:

state.data <- pullStateData(2001)
filtered <- filterData(state.data, 2010)
filtered %>% 
  head() %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

masterView()

  1. masterView(): allows to view list of mapped & unmapped columns in pension database per data source. masterView(source = NULL, expand = FALSE) has 2 arguments:

  2. source: main data sources: "Reason", "Public Plans Database", "Census".

  3. expand: Set to FALSE (default) to see mapped variables. Change to TRUE if you want to see all unmapped variables.

Example of how to use this function:

view.reason <- masterView("Reason", FALSE)
view.reason %>% 
  head() %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

loadData

  1. loadData: loads the data for a specified plan from an Excel file. loadData has one argument:

loadData(file_name)

data_from_file <- loadData('data/NorthCarolina_PensionDatabase_TSERS.xlsx')

selectedData()

  1. selectedData(): selects the only the variables used in historical analyses. selectedData has one argument, wide_data, that is required:

selectedData(wide_data)

Back to the Kansas Public Employees' example. That is a lot of variables. The selectedData() function selects only a handful of needed variables:

df <- selectedData(vtsrs_data)
df %>% 
  head() %>%
  kable() %>%
  kable_styling(full_width = FALSE, position = "left")

glPlot()

  1. glPlot(): creates the 'Gain/Loss' plot using a CSV file as an input. glPlot has two arguments:

glPlot(filename, ylab_unit)

Example of how it is used in a standard workflow:

filename <- "data/GainLoss_data.csv" glPlot(filename)

linePlot()

  1. linePlot(): creates a line chart comparing up to 5 variables, such as ADEC vs. Actual contributions. linePlot() has 22 arguments, with data being required:

linePlot(data, title, caption, grid, interactive, Ytitle, treasury, inv.returns, ticks, font, yaxisMin, yaxisMax, yaxisSeq, yaxisScale, format, str, labelY, lab1, lab2, lab3, lab4, lab5)

Example of how it is used in a standard workflow:

graph <- linePlot(data = PERSI.data, title = "", inv.returns = TRUE, treasury = FALSE,
                  interactive = TRUE, Ytitle = NULL,
                  font = "Calibri", yaxisMin = -21, yaxisMax = 21, yaxisSeq = 3,
                  yaxisScale = 100, format = "%", str = 60,
                  labelY = "", 
                  lab1 = "Market Valued Returns (Actual)", 
                  lab2 = "Actuarially Valued Investment Return (Smoothed by Plan)", 
                  lab3 = "Assumed Rate of Return", 
                  lab4 = "10-Year Geometric Rolling Average")

areaPlot()

  1. areaPlot(): creates Mountain of Debt or S&P500 chart. areaPlot has seven arguments, with data being required:

areaPlot(data, title, caption, grid, ticks, sp500, font)

Example of how it is used in a standard workflow:

debt <- areaPlot(PERSI.debt, caption = F, grid = F, ticks = F, sp500 = F, font = "Calibri")

barPlot()

  1. barPlot(): creates bar charts for Compound Gain/Loss, Net Amortization, Stochastic, and ADEC vs. ERCs. barPlot() has 24 arguments, with data not being required for G/L or Stochastic charts:

barplot(data, net.amo, contributions, compound.gl, gl.url, stochastic.alpha, Ymax, Ymin, Ybreak, yaxisScale, interactive, font, lab1, lab2, lab3, lab4, lab5, lab6, lab7, lab8, lab9)

Example of how it is used in a standard workflow:

barPlot <- function(
  data = NULL, 
  net.amo = FALSE, 
  contributions = FALSE, 
  compound.gl = TRUE,
  gl.url = "https://raw.githubusercontent.com/ReasonFoundation/GraphicsR/master/PERSI_GL2.csv",
  str = 25,
  stochastic = FALSE, 
  stochastic.url = "https://raw.githubusercontent.com/ReasonFoundation/databaseR/master/files/perctest2.csv",
  stochastic.alpha = 0.40,
  Ymax = 25,
  Ymin = 0,
  Ybreak = 5,
  yaxisScale = 1/1000000000, 
  interactive = FALSE,
  font = "Calibri",
  lab1 = "Investment Experience", 
  lab2 = "Benefit Changes",
  lab3 = "Changes to Actuarial Methods & Assumptions",
  lab4 = "Deviations from Demographic Assumptions",
  lab5 = "Negative Amortization",
  lab6 = "Gains From Pay Increases Not Given",
  lab7 = "Net Change to Unfunded Liability",
  lab8 = NULL,
  lab9 = NULL)

savePlot()

  1. savePlot(): adds a source and save ggplot chart. savePlot takes five arguments: savePlot(plot = myplot, source = "The source for my data", save_filepath = "filename_that_my_plot_should_be_saved_to.png", width_pixels = 648, height_pixels = 384.48)

  2. plot: The variable name of the plot you have created that you want to format and save

  3. source: The text you want to come after the text 'Source:' in the bottom left hand side of your side
  4. save_filepath: Exact filepath that you want the plot to be saved to
  5. width_pixels: Width in pixels that you want to save your chart to - defaults to 648
  6. height_pixels: Height in pixels that you want to save your chart to - defaults to 384.48
savePlot(debt_plot, source = "Source: KPERS", save_filepath = "output/test.png")

The BBC has created a wonderful data journalism cookbook for R graphics located here: https://bbc.github.io/rcookbook/



ReasonFoundation/pensionviewr documentation built on June 20, 2021, 9:12 p.m.