knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) library(dplyr) library(implan)
The implan package streamlines several steps otherwise done in Excel:
implan::check
functions.implan::input()
implan::output()
R code can be easily scaled up/down or ported to a new project without the manual restructuring needed in Excel. It also facilitates automated checks/summaries. You can see a production example for B4W-19-01: implan-input, contributions
There is an Office 365 group with useful files (O365 > Implan) including master sectoring schemes (i.e., category to sector crosswalks). I've also included template files from the Excel approach in this package (details in the last section).
Allocating spending to Implan sectors requires two crosswalk tables ( item_to_category
, category_to_sector
). The starting point is a table of total spending estimates by activity-type-item:
library(dplyr) library(implan) data(spending) # spending by activity-type-item spending
Spending must first be reallocated from the item level (e.g., from survey data) to the category level (e.g., "Food - Groceries", etc.) using an item-to-category crosswalk:
data(item_to_category) head(item_to_category, 2)
Items in the example data mostly have a 1-to-1 correspondence with categories, but several need to be split into multiple categories. The crosswalk is specified by three dimensions (activity_group, type, item) across which share must sum to 100%. We can confirm this using check_share_sums()
:
# check the share variable - should print "TRUE" check_share_sums(df = item_to_category, sharevar = share, activity_group, type, item)
Allocating is a simple matter of joining the spending and crosswalk tables and then multiplying:
spend_category <- spending %>% left_join(item_to_category, by = c("activity_group", "type", "item")) %>% mutate(spend = spend * share) %>% select(-share) # no longer needed head(spend_category, 2)
We can use check_spend_sums()
to confirm that expenditures still sum to the correct quantities after we modify the data:
# check the spending allocation - should print "TRUE" check_spend_sums(df_old = spending, df_new = spend_category, spendvar = spend, activity_group, type, item)
In the same way, allocating to sectors uses a category-to-sector crosswalk (also referred to as an implan sectoring scheme). Note that dated versions of sectoring schemes are stored in O365 > Implan > Sectoring Schemes) and these can be used directly (or with minor modifications) for any given project.
data(category_to_sector546) head(category_to_sector546, 2) check_share_sums(category_to_sector546, sharevar = share, category)
spend_sector <- spend_category %>% left_join(category_to_sector546, by = "category") %>% mutate(spend = spend * share) check_spend_sums(df_old = spend_category, df_new = spend_sector, spendvar = spend, category)
Writing to Excel (for Implan import) can be done with the input()
function. This function combines two steps that are worth distinguishing:
input_prep()
converts spending by sector to the two tables (header, data) needed in each Excel sheetind <- input_prep_ind(spend_sector, activity_name = "Ind", event_year = 2019) comm <- input_prep_comm(spend_sector, activity_name = "Comm", event_year = 2019) comm$header head(comm$dat, 2)
input_write()
adds the prepared data to an Excel fileinput_write(comm, "tmp.xlsx") input_write(ind, "tmp.xlsx") openxlsx::getSheetNames("tmp.xlsx")
It's convenient to wrap the prep/write steps into a single function for production. To be on the safe side, we can also run check_implan_sums()
to ensure the Excel data look correct. Note that for Implan import, you'll first need to open the .xlsx
file and save to the legacy .xls
format which Implan requires.
input(spend_sector, "tmp2.xlsx", 2019) check_implan_sums(spend_sector, "tmp2.xlsx") # should print TRUE openxlsx::getSheetNames("tmp2.xlsx")
The input()
function also allows grouping at an arbitrary number of dimensions, which get written to separate Excel sheets. For example, we could split the results by act
and type
:
input(spend_sector, "tmp3.xlsx", 2019, act, type) check_implan_sums(spend_sector, "tmp3.xlsx", act, type) openxlsx::getSheetNames("tmp3.xlsx")
In practice we might want to delineate one of these dimensions with separate Excel files. That's easily accomplished using a for loop:
for (i in c("equip", "trip")) { dat <- filter(spend_sector, type == i) filename <- paste0("tmp-", i, ".xlsx") input(dat, filename, 2019, act) check_implan_sums(dat, filename, act) %>% print() } openxlsx::getSheetNames("tmp-trip.xlsx")
From Implan, you'll need to save output results into csv files, where each Implan activity should have its own folder of results. Two example activities are included in this package to demonstrate:
output_dir <- system.file("extdata", "output", "region1", package = "implan") list.files(output_dir)
Typically, you'll have five sets of results per activity:
hunt_dir <- file.path(output_dir, "hunt") list.files(hunt_dir)
We can use the output()
function to pull the data, but I'll break out its two subprocesses for illustration:
output_read_csv()
pulls all csv files for an activity into an R list, with one data frame per input file. The names of the list correspond to a title row that Implan includes in the output csv filesdat <- output_read_csv(hunt_dir) names(dat)
output_combine()
aggregates the tax results and appends to the overall summaryoutput_combine(dat)
In practice we'll usually want to pull multiple activities. This is easily accomplished with output()
because it will parse out the directories and store their names in a "dimension" variable:
list.files(output_dir) df <- output(output_dir) glimpse(df)
The output()
function also arbitrarily scales up to multiple dimensions. The example data actually includes two dimensions (region by activity) which we can use to demonstrate:
output_dir <- system.file("extdata", "output", package = "implan") list.files(output_dir) df <- output(output_dir) count(df, dimension)
It's probably more useful to have the dimensions represented by distinct variables:
library(tidyr) df <- separate(df, dimension, c("region", "act")) count(df, region, act)
The Excel approach combines steps 1 & 2 into an Excel workbook. I included a basic example (for wildlife watching) in this package:
filepath <- system.file( "extdata", "templates", "fhwar-implan-import.xls", package = "implan" ) file.copy(filepath, "tmp-input.xls") # copy to your working directory
Step 3 can be accomplished by copy/pasting implan results into an Excel sheet and running embedded macros:
filepath <- system.file( "extdata", "templates", "implan-output.xlsm", package = "implan" ) file.copy(filepath, "tmp-output.xlsm")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.