📫
Want an estimate of the price of sending a package somewhere via the US Postal Service? Need to get the USPS shipping zone between two zip codes?
Well, this is a 📦 for your 📦s. postal
provides a tidy interface to the USPS domestic zone calc and post calc APIs.
From CRAN:
install.packages("postal")
The development version:
# install.packages("devtools")
devtools::install_github("aedobbyn/postal")
The single postage calculation function, fetch_mail
, works for flat-rate envelopes and boxes (the kind you pick up at the post office and wrestle with until they fold into a box shape) as well as for packages, which vary by their weight and dimensions.
Currently only destinations in the US are supported.
Specify a 5-digit origin zip and destination zip, along with the date and time you're going to be shipping ("today"
and "now"
are allowed). Other specifics are optional.
library(postal)
USPS offers many colorful options to handle all your shipping needs, which are included in the arguments to fetch_mail
. So to answer the burning question... what if we wanted to ship live animals from Roswell, New Mexico to Philly by ground on July 2 at 2:30pm in a nonrectangular package??
fluffy <- fetch_mail(origin_zip = "88201",
                     destination_zip = "19109",
                     shipping_date = "2018-07-02",
                     shipping_time = "14:30",
                     live_animals = TRUE,
                     ground_transportation_needed = TRUE,
                     pounds = 42,
                     ounces = 3,
                     length = 12,
                     width = 10,
                     height = 7,
                     girth = 5,
                     shape = "nonrectangular",
                     verbose = FALSE)
When will it get there and how much will it cost?
fluffy %>% dplyr::pull(delivery_day)
fluffy %>% dplyr::pull(retail_price)
Finally, the important questions have been answered.
For a more usual case, we'll send a 15lb package from Portland, Maine to Portland, Oregon. The response shows all shipping options along with their prices, dimensions, and delivery dates.
(mail <- fetch_mail(origin_zip = "04101",
                    destination_zip = "97211",
                    shipping_date = "today",
                    shipping_time = "now",
                    pounds = 15,
                    type = "package",
                    shape = "rectangular",
                    show_details = TRUE)) %>%
  dplyr::slice(1:3)
mail %>%
  dplyr::slice(1:3) %>%
  knitr::kable()
The web interface should display the same results:
fetch_mail
is a good option if you want to display data in the way USPS does. If you want to compute on prices and dates, you can tidy the dataframe by sending it into scrub_mail
.
scrub_mail
replaces "Not available"
s and empty strings with NA
s, changes prices to numeric, splits delivery day into a date and time of day (we infer year by the current year and use the 24hr clock), and computes the delivery duration in days.
mail %>%
  scrub_mail() %>%
  dplyr::slice(1:3) %>%
  dplyr::select(
    delivery_date, delivery_by_time, delivery_duration,
    retail_price, click_n_ship_price, dplyr::everything()
  )
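To give a feel for the kind of tidying described above, here is a minimal base R sketch. It is not how scrub_mail is implemented: the function name tidy_like_scrub_mail, the sample values, and the ISO-formatted date strings are all made up for illustration.

```r
# Made-up values loosely shaped like a postage result (not real USPS output)
raw <- data.frame(
  shipping_date = c("2018-07-02", "2018-07-02", "2018-07-02"),
  delivery_date = c("2018-07-05", "", "2018-07-03"),
  retail_price  = c("$47.10", "Not available", "$112.40"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, for illustration only
tidy_like_scrub_mail <- function(df) {
  # Treat "Not available" and empty strings as missing
  df[] <- lapply(df, function(col) {
    col[col %in% c("Not available", "")] <- NA
    col
  })
  # Strip currency symbols and commas, then convert prices to numeric
  df$retail_price <- as.numeric(gsub("[$,]", "", df$retail_price))
  # Delivery duration in days between shipping and delivery
  df$delivery_duration <- as.numeric(
    as.Date(df$delivery_date) - as.Date(df$shipping_date)
  )
  df
}

tidy <- tidy_like_scrub_mail(raw)
tidy
```

Once prices are numeric and missing values are real NAs, things like `min(tidy$retail_price, na.rm = TRUE)` just work.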
These functions work on a single origin and a single destination, but multiple origin-destination pairs can be mapped into a tidy dataframe. Important parts of the request (origin_zip
, destination_zip
, shipping_date
, and shipping_time
) are included in the result, making it easier to distinguish different inputs from one another.
By default we try the API 3 times before giving up. You can modify that by changing n_tries
. If after n_tries
we still have an error (here, "foo"
and "bar"
are not good zips), a "no_success"
row is returned so that we don't error out on the first failure.
origins <- c("11238", "foo", "60647", "80222")
destinations <- c("98109", "94707", "bar", "04123")

purrr::map2_dfr(
  origins, destinations,
  fetch_mail,
  type = "box",
  n_tries = 3,
  verbose = FALSE
)
Similarly, if a response is received but no mail services are found, a dataframe with missing values is returned.
fetch_mail(origin_zip = "04101",
           destination_zip = "97211",
           shipping_date = "3018-07-04", # way in the future!
           type = "package",
           show_details = TRUE)
This approach takes care of much of the try-catching you might otherwise have to implement, with the aim of making it easier to request a lot of data in one go.
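For a sense of the try-catching this saves you, here is a rough base R sketch of a retry-then-fall-back pattern. The helper try_n_times and the stand-in fake_request are hypothetical names invented for this example; they are not part of postal and do not touch the network.

```r
# Hypothetical helper: call f() up to n_tries times, then return a fallback
try_n_times <- function(f, n_tries = 3, fallback = "no_success") {
  for (i in seq_len(n_tries)) {
    result <- tryCatch(f(), error = function(e) NULL)
    if (!is.null(result)) return(result)
  }
  fallback
}

# Stand-in for an API request that errors on malformed zips
fake_request <- function(zip) {
  if (!grepl("^[0-9]{5}$", zip)) stop("bad zip: ", zip)
  paste("ok:", zip)
}

zips <- c("11238", "foo", "60647")
results <- vapply(
  zips,
  function(z) try_n_times(function() fake_request(z)),
  character(1)
)
results
```

The bad zip ends up as "no_success" rather than stopping the whole loop, which is the behavior you want when mapping over many inputs.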
Zones! A zone is a representation of distance between the origin and the destination zip codes. Zones are used in determining postage rates and delivery times.
Sometimes you just need to know the shipping zone between your origin and destination. Or maybe between all origins and all destinations for some app you're building.
That doesn't sound so bad, but there are 99999^2
or 9,999,800,001 possible 5-digit origin-destination zip combinations in the US. The USPS Zone Calc tool narrows down that space a bit by trimming zips to their first 3 digits. Every 5 digit zip's information is defined by its 3-digit prefix, except for 5-digit exceptions, which are noted.
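As a quick sanity check on those numbers, here is the arithmetic in R. The 3-digit count assumes prefixes run from "000" through "999"; the variable names are just for this example.

```r
five_digit_pairs  <- 99999^2  # the figure quoted above: 9,999,800,001
three_digit_pairs <- 1000^2   # assumes prefixes "000" through "999"

format(five_digit_pairs, big.mark = ",", scientific = FALSE)
format(three_digit_pairs, big.mark = ",", scientific = FALSE)

# Roughly how many times smaller the 3-digit space is
five_digit_pairs / three_digit_pairs
```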
fetch_zones_three_digit
lets you find the zone corresponding to a 3-digit origin zip prefix and one or many 3-digit destination zip prefixes.
fetch_zones_three_digit(origin_zip = "123", destination_zip = "581")
If no destination is supplied, all valid destination zips and zones are returned for the origin.
fetch_zones_three_digit(origin_zip = "321")
You can provide a vector of zips and map them nicely into a long dataframe. Here we ask for all destination zips for these three origin zips.
If an origin zip is supplied that is not in use, a message says so and the zip is included in the output with NA
s in the other columns. For example, the origin "001"
is not a valid 3-digit zip prefix.
origin_zips <- c("001", "271", "828")

origin_zips %>%
  purrr::map_dfr(fetch_zones_three_digit)
Similarly, map over both origin and destination zips and end up at a dataframe. verbose
gives you a play-by-play if you want it. (More on auto-prepending leading 0s to input zips in the On Digits section below.)
dest_zips <- c("867", "53", "09")

purrr::map2_dfr(origin_zips, dest_zips,
                fetch_zones_three_digit,
                verbose = TRUE)
The USPS zone calc web interface displays zones only as they pertain to destination zip code ranges:
If you prefer the range representation, you can set as_range = TRUE
. Instead of a dest_zip
column, you'll get a marker of the beginning of and end of the range in dest_zip_start
and dest_zip_end
.
fetch_zones_three_digit("42", "42", as_range = TRUE)
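If you're curious how one row per destination relates to the range representation, here is a minimal base R sketch. It is not postal's implementation: it assumes destinations are already sorted and simply collapses consecutive rows that share a zone, and the data frame is made up for illustration.

```r
# Made-up destinations and zones for a single origin
dest <- data.frame(
  dest_zip = c("005", "006", "007", "008", "009", "010"),
  zone     = c("7", "8", "8", "8", "8", "7"),
  stringsAsFactors = FALSE
)

# Find runs of identical zones and the row where each run starts and ends
runs   <- rle(dest$zone)
ends   <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1

ranges <- data.frame(
  dest_zip_start = dest$dest_zip[starts],
  dest_zip_end   = dest$dest_zip[ends],
  zone           = runs$values,
  stringsAsFactors = FALSE
)
ranges
```

Six rows become three here: "005" on its own, "006" through "009", and "010" on its own.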
You can optionally display other details about the zips, zones, and type of postage the zone designation applies to.
fetch_zones_three_digit(origin_zip = "404", show_details = TRUE)
Definitions of these details can be found in zone_detail_definitions
.
zone_detail_definitions %>% knitr::kable()
The API endpoint used in fetch_zones_three_digit
accepts exactly 3 digits for the origin zip; it mostly returns 3-digit destination zips, but also some 5-digit exceptions. For that reason, input zips with fewer than 3 digits are padded with leading zeros (for example,
"8"
becomes "008"
). When a destination zip has more than 3 digits, the exact_destination
flag determines whether we return results for that destination's 3-digit prefix or filter to only the exact 5-digit destination. For example, when a 5-digit destination is supplied and exact_destination
is FALSE
, we include results for the destination 962
as well as for the exact one supplied, 96240
.
fetch_zones_three_digit(origin_zip = "12358132134558", destination_zip = "96240", exact_destination = FALSE)
When exact_destination
is TRUE
, we filter only to 96240
, which is a 5 digit exception as its zone is different from its 3-digit prefix's.
fetch_zones_three_digit(origin_zip = "12358132134558", destination_zip = "96240", exact_destination = TRUE)
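The digit handling can be pictured with a couple of small base R helpers. These are hypothetical (pad_prefix and three_digit_prefix are names invented here, not postal functions) and only sketch the padding and prefixing ideas, not the package's messages or warnings.

```r
# Hypothetical helper: left-pad short zips with zeros up to `width` digits
pad_prefix <- function(zip, width = 3) {
  paste0(strrep("0", pmax(0, width - nchar(zip))), zip)
}

# Hypothetical helper: the 3-digit prefix a longer zip falls under
three_digit_prefix <- function(zip) {
  substr(pad_prefix(zip), 1, 3)
}

pad_prefix(c("8", "42", "123"))  # "008" "042" "123"
three_digit_prefix("96240")      # "962"
```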
fetch_zones_three_digit
should cover most 5 digit cases and supply the most information when show_details
is TRUE
.
But if you just want to use the equivalent of the "Get Zone for ZIP Code Pair" tab, you can use fetch_zones_five_digit
.
fetch_zones_five_digit("31415", "92653")
Details given when show_details = TRUE
in fetch_zones_five_digit
are slightly different than they are for fetch_zones_three_digit
(see Details).
If you want the most up-to-date zip-zone mappings, fetch_all
allows you to use the 3 digit endpoint to fetch all possible origins and, optionally, write them to a CSV as you go.
By default we use every possible origin from "000"
to "999"
; as of now "000"
through "004"
are all not in use along with a smattering of others like "404"
and "867"
-- but who knows, they might be used in the future.
fetch_all(all_possible_origins,
          sleep_time = 0.5, # How long to sleep in between requests, on average
          write_to = "path/to/my/file.csv")
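The general fetch, append, sleep pattern looks something like the base R sketch below. This is an assumption-laden illustration rather than fetch_all's actual code: fetch_and_append and fake_fetch are hypothetical names, the fake fetcher returns made-up rows instead of calling the API, and the uniform jitter is just one way to get a sleep that averages sleep_time.

```r
# Hypothetical helper: fetch each origin, append its rows to a CSV, then pause
fetch_and_append <- function(origins, fetch_one, path, sleep_time = 0) {
  for (i in seq_along(origins)) {
    rows <- fetch_one(origins[i])
    first <- i == 1
    # Write the header only once, then append
    write.table(rows, path, sep = ",", row.names = FALSE,
                col.names = first, append = !first)
    # Assumed jitter: uniform on [0, 2 * sleep_time], so sleep_time on average
    Sys.sleep(runif(1, 0, 2 * sleep_time))
  }
  invisible(path)
}

# Stand-in for a real request; returns made-up rows
fake_fetch <- function(origin) {
  data.frame(origin_zip = origin,
             dest_zip = c("005", "006"),
             zone = c("7", "8"),
             stringsAsFactors = FALSE)
}

out <- tempfile(fileext = ".csv")
fetch_and_append(c("123", "456"), fake_fetch, out, sleep_time = 0)

# Read zips back as character so leading zeros survive
written <- read.csv(out, colClasses = "character")
written
```

Writing as you go means a crash halfway through doesn't cost you the origins already fetched.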
If there's a network error when grabbing a zip, we back off and try a few times and finally write "no_success"
(rather than NA
s which indicate that the origin zip is not in use) in the destination zip columns.
What that looks like in the event we switch on the internet between asking for origin "456"
and origin "789"
:
tibble::tribble(
  ~"origin_zip", ~"dest_zip", ~"zone",
  "123", "no_success", "no_success",
  "456", "no_success", "no_success",
  "789", "005", "7",
  "789", "006", "8",
  "789", "007", "8",
  "789", "008", "8",
  "789", "009", "8",
  "789", "010", "7",
  "...", "...", "..."
)
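Because "no_success" and NA mean different things, it's easy to pull out the origins worth retrying afterwards. A small base R sketch on made-up rows (the "001" row is added here to show the NA case; the variable names are illustrative):

```r
# Made-up results mixing network failures, real rows, and an unused origin
results <- data.frame(
  origin_zip = c("123", "456", "789", "789", "001"),
  dest_zip   = c("no_success", "no_success", "005", "006", NA),
  zone       = c("no_success", "no_success", "7", "8", NA),
  stringsAsFactors = FALSE
)

# Origins that failed on the network and are worth another attempt
to_retry <- unique(results$origin_zip[results$dest_zip %in% "no_success"])

# Origins that came back fine but are simply not in use
not_in_use <- unique(results$origin_zip[is.na(results$dest_zip)])

to_retry    # "123" "456"
not_in_use  # "001"
```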
The entire set is also made available from a read-only MySQL database, which you can connect to with these creds:
host: knotsql.cimbccxns4ka.us-east-2.rds.amazonaws.com
port: 3306
database: master
user: public
password: password
Free as in even less effort than the free as in beer stuff up there.
The zips_zones_sample
dataset included in this package contains a random sample of 1,000,000 rows of all the 3 digit origin-destination pairs. Load it with:
data(zips_zones_sample)
It's what you'd get by running fetch_all(show_details = TRUE)
, waiting a while, and then taking a sample.
zips_zones_sample
The sample is about a quarter of the total number of rows between all origin prefixes and all destination prefixes, plus the 5 digit exceptions (~4m rows). See it put to use in the vignette.
That's it! Bug reports and PRs welcome! 📬