knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(dplyr)
library(implan)

Overview

The implan package streamlines several steps otherwise done in Excel:

  1. Prepare: Allocate spending by Implan sector using crosswalks and validate using implan::check functions.
  2. Input: Write Excel sheets for Implan import using implan::input()
  3. Output: Pull Implan output (csv) into a table using implan::output()

Elevator Pitch

R code can be easily scaled up/down or ported to a new project without the manual restructuring needed in Excel. It also facilitates automated checks/summaries. You can see a production example for B4W-19-01: implan-input, contributions

Reference

There is an Office 365 group with useful files (O365 > Implan) including master sectoring schemes (i.e., category to sector crosswalks). I've also included template files from the Excel approach in this package (details in the last section).

1. Prepare Implan Sector Allocation

Allocating spending to Implan sectors requires two crosswalk tables ( item_to_category, category_to_sector). The starting point is a table of total spending estimates by activity-type-item:

library(dplyr)
library(implan)

data(spending) # spending by activity-type-item
spending

Item to Category

Spending must first be reallocated from the item level (e.g., from survey data) to the category level (e.g., "Food - Groceries", etc.) using an item-to-category crosswalk:

data(item_to_category)
head(item_to_category, 2)

Items in the example data mostly have a 1-to-1 correspondence with categories, but several need to be split into multiple categories. The crosswalk is specified by three dimensions (activity_group, type, item) across which share must sum to 100%. We can confirm this using check_share_sums():

# check the share variable - should print "TRUE"
check_share_sums(df = item_to_category, sharevar = share, activity_group, type, item)

Allocating is a simple matter of joining the spending and crosswalk tables and then multiplying:

spend_category <- spending %>%
    left_join(item_to_category, by = c("activity_group", "type", "item")) %>%
    mutate(spend = spend * share) %>%
    select(-share) # no longer needed

head(spend_category, 2)

We can use check_spend_sums() to confirm that expenditures still sum to the correct quantities after we modify the data:

# check the spending allocation - should print "TRUE"
check_spend_sums(df_old = spending, df_new = spend_category, spendvar = spend, 
                 activity_group, type, item)

Category to Sector

In the same way, allocating to sectors uses a category-to-sector crosswalk (also referred to as an implan sectoring scheme). Note that dated versions of sectoring schemes are stored in O365 > Implan > Sectoring Schemes) and these can be used directly (or with minor modifications) for any given project.

data(category_to_sector546)
head(category_to_sector546, 2)

check_share_sums(category_to_sector546, sharevar = share, category)
spend_sector <- spend_category %>%
    left_join(category_to_sector546, by = "category") %>%
    mutate(spend = spend * share)

check_spend_sums(df_old = spend_category, df_new = spend_sector, spendvar = spend, category)

2. Input Write to Excel

Writing to Excel (for Implan import) can be done with the input() function. This function combines two steps that are worth distinguishing:

ind <- input_prep_ind(spend_sector, activity_name = "Ind", event_year = 2019)
comm <- input_prep_comm(spend_sector, activity_name = "Comm", event_year = 2019)

comm$header

head(comm$dat, 2)
input_write(comm, "tmp.xlsx")
input_write(ind, "tmp.xlsx")

openxlsx::getSheetNames("tmp.xlsx")

It's convenient to wrap the prep/write steps into a single function for production. To be on the safe side, we can also run check_implan_sums() to ensure the Excel data look correct. Note that for Implan import, you'll first need to open the .xlsx file and save to the legacy .xls format which Implan requires.

input(spend_sector, "tmp2.xlsx", 2019)
check_implan_sums(spend_sector, "tmp2.xlsx") # should print TRUE

openxlsx::getSheetNames("tmp2.xlsx")

The input() function also allows grouping at an arbitrary number of dimensions, which get written to separate Excel sheets. For example, we could split the results by act and type:

input(spend_sector, "tmp3.xlsx", 2019, act, type)
check_implan_sums(spend_sector, "tmp3.xlsx", act, type)

openxlsx::getSheetNames("tmp3.xlsx")

In practice we might want to delineate one of these dimensions with separate Excel files. That's easily accomplished using a for loop:

for (i in c("equip", "trip")) {
    dat <- filter(spend_sector, type == i)
    filename <- paste0("tmp-", i, ".xlsx")
    input(dat, filename, 2019, act)
    check_implan_sums(dat, filename, act) %>% print()
}

openxlsx::getSheetNames("tmp-trip.xlsx")

3. Output Read from Implan

From Implan, you'll need to save output results into csv files, where each Implan activity should have its own folder of results. Two example activities are included in this package to demonstrate:

output_dir <- system.file("extdata", "output", "region1", package = "implan")
list.files(output_dir)

Typically, you'll have five sets of results per activity:

hunt_dir <- file.path(output_dir, "hunt")
list.files(hunt_dir)

output()

We can use the output() function to pull the data, but I'll break out its two subprocesses for illustration:

dat <- output_read_csv(hunt_dir)
names(dat)
output_combine(dat)

In practice we'll usually want to pull multiple activities. This is easily accomplished with output() because it will parse out the directories and store their names in a "dimension" variable:

list.files(output_dir)

df <- output(output_dir)
glimpse(df)

The output() function also arbitrarily scales up to multiple dimensions. The example data actually includes two dimensions (region by activity) which we can use to demonstrate:

output_dir <- system.file("extdata", "output", package = "implan")
list.files(output_dir)

df <- output(output_dir)
count(df, dimension)

It's probably more useful to have the dimensions represented by distinct variables:

library(tidyr)
df <- separate(df, dimension, c("region", "act"))
count(df, region, act)

Excel Approach

The Excel approach combines steps 1 & 2 into an Excel workbook. I included a basic example (for wildlife watching) in this package:

filepath <- system.file(
  "extdata", "templates", "fhwar-implan-import.xls", package = "implan"
)
file.copy(filepath, "tmp-input.xls") # copy to your working directory

Step 3 can be accomplished by copy/pasting implan results into an Excel sheet and running embedded macros:

filepath <- system.file(
  "extdata", "templates", "implan-output.xlsm", package = "implan"
)
file.copy(filepath, "tmp-output.xlsm")


southwick-associates/implan documentation built on Feb. 28, 2020, 10:45 a.m.