RateParser

RateParser is an R package to parse files written using the Open Water Rate Specification (OWRS) and use them to calculate water bills.

Installation

To install the latest version from GitHub, simply run the following from an R console:

if (!require("devtools"))
  install.packages("devtools")
devtools::install_github("California-Data-Collaborative/RateParser")

Getting Started

This section demonstrates how to apply RateParser to calculate water bills given a dataframe of publicly available billing data from the City of Santa Monica.

First we load the RateParser package and read in the example OWRS file. The example OWRS file for the city of Santa Monica can be downloaded directly from this link (right-click, Save as...) or can be found in the examples directory if this repository is downloaded or cloned.

library(RateParser)

# read in example OWRS file
owrs_file <- read_owrs_file("examples/smc-2016-03-01.owrs")

# view residential single-family rates
owrs_file$rate_structure$RESIDENTIAL_SINGLE
## $tier_starts
## [1]   0  15  41 149
## 
## $tier_prices
## [1]  2.87  4.29  6.44 10.07
## 
## $commodity_charge
## [1] "Tiered"
## 
## $bill
## [1] "commodity_charge"
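
One way to read this output is to line the two vectors up side by side. The sketch below retypes the values shown above rather than reading the file, and the column names in the table are made up for illustration.

```r
# Values copied from the RESIDENTIAL_SINGLE output above
rates <- list(
  tier_starts = c(0, 15, 41, 149),
  tier_prices = c(2.87, 4.29, 6.44, 10.07)
)

# One row per tier; the column names here are illustrative only
tiers <- data.frame(
  tier      = seq_along(rates$tier_starts),
  starts_at = rates$tier_starts,
  price     = rates$tier_prices
)
print(tiers)
```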

The public data does not include every column needed to calculate water bills, so we assign default values for the missing ones.

santamonica$meter_size <- '5/8"' 
santamonica$water_type <- 'POTABLE'

Our dataframe currently contains an "OTHER" class (a byproduct of the author's inability to properly classify some of the rate codes). Our sample OWRS file, on the other hand, contains no rate information for the "OTHER" customer class, so we need to filter those rows out.
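
A quick way to spot such classes is to compare the classes in the data with the names in the rate structure. The sketch below uses small stand-in objects (`rate_structure` and `billing` are hypothetical names) so that it runs on its own, and it assumes that the entries of `owrs_file$rate_structure` are named by customer class, as the `RESIDENTIAL_SINGLE` entry above suggests.

```r
# Stand-in objects so the snippet runs on its own; with real data these
# would be owrs_file$rate_structure and the billing dataframe.
rate_structure <- list(RESIDENTIAL_SINGLE = list(), COMMERCIAL = list())
billing <- data.frame(
  cust_class = c("COMMERCIAL", "OTHER", "RESIDENTIAL_SINGLE"),
  stringsAsFactors = FALSE
)

# Classes present in the data but absent from the rate structure
missing_classes <- setdiff(unique(billing$cust_class), names(rate_structure))
missing_classes
```

Any class returned here has no rates to apply and would need to be filtered out or reclassified before billing.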

Finally we can pass our dataframe and our OWRS file as inputs into the calculate_bill function.

library(dplyr, warn.conflicts = FALSE)

# filter out "OTHER" class
filtered <- dplyr::tbl_df(santamonica) %>% dplyr::filter(cust_class != "OTHER")

# calculate water bills
calced <- calculate_bill(filtered, owrs_file)

This results in a number of additional columns being appended to the original dataframe. Note that the first 9 columns are the original input columns, while the rest have been added by calculate_bill.

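In particular, the glimpse output below suggests that X1 through X4 hold the usage falling in each tier and XR1 through XR4 hold the corresponding charge for that tier. These columns appear as NA for rows whose customer class has fewer tiers. commodity_charge is the sum of the tier charges, and bill is the final amount.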

glimpse(calced)
## Observations: 217,256
## Variables: 21
## $ cust_id          (int) 25886, 74585, 57854, 58210, 25002, 68393, 257...
## $ usage_ccf        (dbl) 388, 9, 73, 22, 11, 84, 670, 170, 0, 102, 107...
## $ usage_month      (int) 3, 1, 1, 1, 1, 3, 3, 3, 3, 3, 3, 3, 3, 1, 4, ...
## $ usage_year       (int) 2014, 2015, 2015, 2015, 2015, 2015, 2015, 201...
## $ cust_class       (chr) "COMMERCIAL", "COMMERCIAL", "COMMERCIAL", "CO...
## $ usage_date       (chr) "2014-03-01", "2015-01-01", "2015-01-01", "20...
## $ rate_code        (chr) "WANR5", "WANR2", "WANR3", "WANR1", "WANR2", ...
## $ meter_size       (chr) "5/8\"", "5/8\"", "5/8\"", "5/8\"", "5/8\"", ...
## $ water_type       (chr) "POTABLE", "POTABLE", "POTABLE", "POTABLE", "...
## $ tier_starts      (chr) "0\n211", "0\n211", "0\n211", "0\n211", "0\n2...
## $ tier_prices      (chr) "4.07\n10.03", "4.07\n10.03", "4.07\n10.03", ...
## $ X1               (dbl) 210, 9, 73, 22, 11, 84, 210, 170, 0, 102, 210...
## $ X2               (dbl) 178, 0, 0, 0, 0, 0, 460, 0, 0, 0, 865, 0, 0, ...
## $ XR1              (dbl) 854.70, 36.63, 297.11, 89.54, 44.77, 341.88, ...
## $ XR2              (dbl) 1785.34, 0.00, 0.00, 0.00, 0.00, 0.00, 4613.8...
## $ commodity_charge (dbl) 2640.04, 36.63, 297.11, 89.54, 44.77, 341.88,...
## $ bill             (dbl) 2640.04, 36.63, 297.11, 89.54, 44.77, 341.88,...
## $ X3               (dbl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
## $ X4               (dbl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
## $ XR3              (dbl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
## $ XR4              (dbl) NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N...
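
The tier columns for the first row can be reproduced by hand, which is a useful sanity check. The sketch below is not RateParser's implementation; `split_usage` is a hypothetical helper, and the boundary convention (a tier start of 211 means the 211th unit is the first one billed in that tier) is inferred from the output above rather than taken from the OWRS specification.

```r
# Hypothetical helper, not part of RateParser: split usage across tiers.
# Assumes that a tier start s (after the first) marks the s-th unit as the
# first unit billed in that tier, as the output above suggests.
split_usage <- function(usage, tier_starts) {
  lower <- c(0, tier_starts[-1] - 1)    # units consumed before each tier begins
  upper <- c(tier_starts[-1] - 1, Inf)  # last unit belonging to each tier
  pmax(0, pmin(usage, upper) - lower)
}

# Tier definition and usage taken from the first row of the output
tier_starts <- c(0, 211)
tier_prices <- c(4.07, 10.03)

units   <- split_usage(388, tier_starts)  # compare with X1 and X2
charges <- units * tier_prices            # compare with XR1 and XR2
total   <- sum(charges)                   # compare with commodity_charge

print(units)
print(charges)
print(total)
```

Under these assumptions the 388 units split into 210 and 178, matching X1 and X2, and the charges sum to the commodity_charge shown for that row.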


California-Data-Collaborative/RateParser documentation built on May 6, 2019, 9:27 a.m.