knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%",
  cache = TRUE
)

postal r emo::ji("mailbox")

Project Status: Active - The project has reached a stable, usable state and is being actively developed. Travis build status Coverage status AppVeyor Build Status CRAN_Status_Badge rstudio mirror downloads

Want an estimate of the price of sending a package somewhere via the US Postal Service? Need to get the USPS shipping zone between two zip codes?

Well, this is a r emo::ji("package") for your r emo::ji("package")s. postal provides a tidy interface to the USPS domestic zone calc and post calc APIs.


im_not_an_owl


Installation

From CRAN:

install.packages("postal")

The development version:

# install.packages("devtools")
devtools::install_github("aedobbyn/postal")

Postage Price Calculator

The single postage calculation function, fetch_mail, works for flat-rate envelopes and boxes (the kind you pick up at the post office and wrestle with until they fold into a box shape) as well as for packages, which vary by their weight and dimensions.

Currently only destinations in the US are supported.

Usage

Specify a 5-digit origin zip and destination zip, along with the date and time you're going to be shipping ("today" and "now" are allowed). Other specifics are optional.

library(postal)

USPS offers many colorful options to handle all your shipping needs, which are included in the arguments to fetch_mail. So to answer the burning question...what if we wanted to ship live animals from Wyoming to Philly by ground on July 2 at 2:30pm in a nonrectangular package??

fluffy <- fetch_mail(origin_zip = "88201", 
                   destination_zip = "19109", 
                   shipping_date = "2018-07-02", 
                   shipping_time = "14:30", 
                   live_animals = TRUE,
                   ground_transportation_needed = TRUE,
                   pounds = 42,
                   ounces = 3,
                   length = 12,
                   width = 10,
                   height = 7,
                   girth = 5,
                   shape = "nonrectangular",
                   verbose = FALSE)

When will it get there and how much will it cost?

fluffy %>% 
  dplyr::pull(delivery_day)

fluffy %>% 
  dplyr::pull(retail_price)

Finally, the important questions have been answered.


General case

For a more usual case, we'll send a 15lb package from Portland, Maine to Portland, Oregon. The response shows all shipping options along with their prices, dimensions, and delivery dates.

(mail <- fetch_mail(origin_zip = "04101",
         destination_zip = "97211",
         shipping_date = "today",
         shipping_time = "now",
         pounds = 15,
         type = "package",
         shape = "rectangular",
         show_details = TRUE)) %>% 
  dplyr::slice(1:3)

mail %>% 
  dplyr::slice(1:3) %>% 
  knitr::kable()

The web interface should display the same results:

post_calc

fetch_mail is a good option if you want to display data in the way USPS does. If you want to compute on prices and dates, you can tidy the dataframe by sending it into scrub_mail.

scrub_mail replaces "Not available"s and empty strings with NAs, changes prices to numeric, splits delivery day into a date and time of day (we infer year by the current year and use the 24hr clock), and computes the delivery duration in days.

mail %>% 
  scrub_mail() %>% 
  dplyr::slice(1:3) %>% 
  dplyr::select(
    delivery_date, delivery_by_time,
    delivery_duration, retail_price, 
    click_n_ship_price, dplyr::everything()
  )


Multiple inputs and error handling

These functions work on a single origin and single destination, but multiple can be mapped into a tidy dataframe. Important parts of the request (origin_zip, destination_zip, shipping_date, and shipping_time) are included in the result, making it easier to distinguish different inputs from one another.

By default we try the API 3 times before giving up. You can modify that by changing n_tries. If after n_tries we still have an error (here, "foo" and "bar" are not good zips), a "no_success" row is returned so that we don't error out on the first failure.

origins <- c("11238", "foo", "60647", "80222")
destinations <- c("98109", "94707", "bar", "04123")

purrr::map2_dfr(
  origins, destinations,
  fetch_mail,
  type = "box",
  n_tries = 3,
  verbose = FALSE
)

Similarly, if a response is received but no mail services are found, a dataframe with missing values is returned.

fetch_mail(origin_zip = "04101",
         destination_zip = "97211",
         shipping_date = "3018-07-04",  # way in the future!
         type = "package",
         show_details = TRUE)

This approach makes takes care of much of the try-catching you might have to implement, with the aim of making it easier to request a lot of data in one go.



Zones

Zones! A zone is a representation of distance between the origin and the destination zip codes. Zones are used in determining postage rates and delivery times.

Sometimes you just need to know the shipping zone between your origin and destination. Or maybe between all origins and all destinations for some app you're building.

That doesn't sound so bad, but there are 99999^2 or 9,999,800,001 possible 5-digit origin-destination zip combinations in the US. The USPS Zone Calc tool narrows down that space a bit by trimming zips to their first 3 digits. Every 5 digit zip's information is defined by its 3-digit prefix, except for 5-digit exceptions, which are noted.

Usage

fetch_zones_three_digit lets you find the zone corresponding to a 3-digit origin zip prefix and one or many 3-digit destination zip prefixes.

fetch_zones_three_digit(origin_zip = "123", 
            destination_zip = "581")

If no destination is supplied, all valid desination zips and zones are returned for the origin.

fetch_zones_three_digit(origin_zip = "321")


Multiple zips

You can provide a vector of zips and map them nicely into a long dataframe. Here we ask for all destination zips for these three origin zips.

If an origin zip is supplied that is not in use, it is messaged and included in the output with NAs in the other columns. For example, the origin "001" is not a valid 3-digit zip prefix.

origin_zips <- c("001", "271", "828")

origin_zips %>% 
  purrr::map_dfr(fetch_zones_three_digit)

Similarly, map over both origin and destination zips and end up at a dataframe. verbose gives you a play-by-play if you want it. (More on auto-prepending leading 0s to input zips in the On Digits section below.)

dest_zips <- c("867", "53", "09")

purrr::map2_dfr(origin_zips, dest_zips, 
                fetch_zones_three_digit,
                verbose = TRUE)



Ranges and other features

The USPS zone calc web interface displays zones only as they pertain to destination zip code ranges:

zone_calc


If you prefer the range representation, you can set as_range = TRUE. Instead of a dest_zip column, you'll get a marker of the beginning of and end of the range in dest_zip_start and dest_zip_end.

fetch_zones_three_digit("42", "42",
            as_range = TRUE)


Details

You can optionally display other details about the zips, zones, and type of postage the zone designation applies to.

fetch_zones_three_digit(origin_zip = "404",
            show_details = TRUE)  

Definitions of these details can be found in zone_detail_definitions.

zone_detail_definitions %>% 
  knitr::kable() 


On Digits

The API endpoint used in fetch_zones_three_digit accepts exactly 3 digits for the origin zip; it mostly returns 3 digit destination zips, but also some 5 digit exceptions. For that reason,

For example, when a 5-digit destination is supplied and exact_destination is FALSE, we include results for the destination 962 as well as for the exact one supplied, 96240.

fetch_zones_three_digit(origin_zip = "12358132134558", 
            destination_zip = "96240",
            exact_destination = FALSE)     

When exact_destination is TRUE, we filter only to 96240, which is a 5 digit exception as its zone is different from its 3-digit prefix's.

fetch_zones_three_digit(origin_zip = "12358132134558", 
            destination_zip = "96240",
            exact_destination = TRUE)  


I just want to supply 5 digits

fetch_zones_three_digit should cover most 5 digit cases and supply the most information when show_details is TRUE. But if you just want to use the equivalent of the "Get Zone for ZIP Code Pair" tab, you can use fetch_zones_five_digit.

fetch_zones_five_digit("31415", "92653")

Details given when show_details = TRUE in fetch_zones_five_digit are slightly different than they are for fetch_zones_three_digit (see Details).


All of the data

If you want the most up-to-date zip-zone mappings, fetch_all allows you to use the 3 digit endpoint to fetch all possible origins and, optionally, write them to a CSV as you go.

By default we use every possible origin from "000" to "999"; as of now "000" through "004" are all not in use along with a smattering of others like "404" and "867" -- but who knows, they might be used in the future.

fetch_all(all_possible_origins,
          sleep_time = 0.5,   # How long to sleep in between requests, on average
          write_to = "path/to/my/file.csv")

If there's a network error when grabbing a zip, we back off and try a few times and finally write "no_success" (rather than NAs which indicate that the origin zip is not in use) in the destination zip columns.

What that looks like in the event we switch on the internet between asking for origin "456" and origin "789":

tibble::tribble(
  ~"origin_zip", ~"dest_zip", ~"zone",
  "123", "no_success", "no_success",
  "456", "no_success", "no_success",
  "789", "005", "7",
  "789", "006", "8",
  "789", "007", "8",
  "789", "008", "8",
  "789", "009", "8",
  "789", "010", "7",
  "...", "...", "..."
)

The entire set is also made available from a read-only MySQL database, which you can connect to with these creds:

host: knotsql.cimbccxns4ka.us-east-2.rds.amazonaws.com
port: 3306
database: master
user: public
password: password


Or some of it, for free

Free as in even less effort than the free as in beer stuff up there.

The zips_zones_sample dataset included in this package contains a random sample of 1,000,000 rows of all the 3 digit origin-destination pairs. Load it with:

data(zips_zones_sample)

It's what you'd get by running fetch_all(show_details = TRUE), waiting a while, and then taking a sample.

zips_zones_sample

The sample is about a quarter of the total number of rows between all origin prefixes and all destination prefixes, plus the 5 digit exceptions (~4m rows). See it put to use in the vignette.


That's it! Bug reports and PRs welcome! r emo::ji("mailbox_with_mail")

catch_mailman



aedobbyn/postal documentation built on May 21, 2019, 6:18 a.m.