Geocode an address vector using the Google Maps API.

Description

geocode_url uses the Google Maps API to estimate latitude and longitude coordinates for a character vector of physical addresses. Optionally, you may use a (paid) Google for Work/Premium API key to sign the request with the HMAC-SHA1 algorithm. For smaller batch requests, this function can also access Google's "standard API" (see this page to obtain a free API key).

Usage

geocode_url(address, auth = "standard_api", privkey = NULL,
  clientid = NULL, clean = FALSE, verbose = FALSE, add_date = "none",
  messages = FALSE, dryrun = FALSE)

Arguments

address

A 1xN vector of address(es) with "url-safe" characters. Enabling the "clean" parameter calls the address_cleaner function, which strips or replaces common characters that are incompatible with the Maps API. Notes:

  • Addresses should be in raw form, not URL encoded (e.g., of the form: 123 Main Street, Somewhere, NY 12345, USA).

  • Specifying the country is optional but recommended.

auth

character string; one of: "standard_api" (the default) or "work". Although you may specify an empty string for this parameter (see the examples below), we recommend that users obtain a (free) standard Google API key. Authentication via the "work" method requires the client ID and private API key associated with your (paid) Google for Work/Premium account.

privkey

character string; your Google API key (whether of the "work" or "standard_api" variety).

clientid

character string; your Google for Work/Premium Account client ID (generally, these are of the form 'gme-[company]'). This parameter should not be set when authenticating through the standard API.

clean

logical; when TRUE, applies address_cleaner to the address vector prior to URL encoding.

verbose

logical; when TRUE, displays additional output in the returns from Google.

add_date

character string; one of: "none" (the default), "today", or "fuzzy". When set to "today", a column with today's calendar date is added to the returned data frame. When set to "fuzzy" a random positive number of days between 1 and 30 is added to this date column. "Fuzzy" date values can be useful to avoid sending large batches of geocode requests on the same day if your scripts recertify/retry geocode estimations after a fixed period of time.
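
The "fuzzy" option can be pictured with a small base R sketch. It only illustrates the idea described above and is not the package's actual implementation; the names n_addresses, offsets, and fuzzy_dates are hypothetical.

```r
# Illustrative only: mimic the "fuzzy" date idea in base R.
n_addresses <- 5                 # hypothetical batch size
today <- Sys.Date()

# Draw one random offset between 1 and 30 days per address.
offsets <- sample(1:30, n_addresses, replace = TRUE)

# Adding an integer to a Date yields a Date shifted by that many days.
fuzzy_dates <- today + offsets

print(data.frame(offset_days = offsets, geocode_date = fuzzy_dates))
```

Because the offsets differ per row, a script that re-geocodes records after a fixed period would spread the follow-up requests over roughly a month instead of sending them all on one day.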

messages

logical; when TRUE, displays warning and error messages generated by the API calls within the pull_geo_data function (e.g., connection errors or malformed signatures).

dryrun

logical; when TRUE, aborts the script prior to the pull_geo_data URL call, returning the URL to be encoded. This can be useful for debugging addresses that yield non-conformant JSON returns.
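
When debugging, it can help to see how a raw address turns into a request URL. The base R sketch below approximates this using the public Google Geocoding API endpoint. The exact URL that geocode_url builds is not shown on this page, so treat the layout as an assumption; api_key is a placeholder and request_url is a hypothetical name.

```r
# Illustrative only: approximate the kind of request URL a dry run exposes.
address <- "123 Main Street, Somewhere, NY 12345, USA"
api_key <- "YOUR_API_KEY"  # placeholder, not a real key

# Percent-encode the raw address, including reserved characters like commas.
encoded <- utils::URLencode(address, reserved = TRUE)

# Assumed layout: public Geocoding API endpoint plus address and key parameters.
request_url <- paste0(
  "https://maps.googleapis.com/maps/api/geocode/json",
  "?address=", encoded,
  "&key=", api_key
)
print(request_url)
```

This is also why addresses should be passed in raw form: encoding an already-encoded string would escape the percent signs a second time.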

Value

geocode_url returns a data frame with (numeric) lat/long coordinates and four additional parameters from the response object (see this page for additional information):

  • formatted_address: The formatted address Google used to estimate the geocoordinates.

  • location_type: An estimate of the response object's coordinate accuracy. Currently, possible response values are:

    • ROOFTOP: indicates that the return is accurate to the level of a precise street address.

    • RANGE_INTERPOLATED: indicates that the result reflects an approximation (usually on a road) interpolated between two precise points (such as intersections). Interpolated results are generally returned when rooftop geocodes are unavailable for a street address.

    • GEOMETRIC_CENTER: indicates that the result is the geometric center of a result such as a polyline (for example, a street) or polygon (region).

    • APPROXIMATE: indicates that the result is approximate.

  • status: The geocode status of a response object. Currently, possible response values are:

    • OK: indicates that no errors occurred; the address was successfully parsed and at least one geocode was returned.

    • ZERO_RESULTS: indicates that the geocode was successful but returned no results. This may occur if the geocoder was passed a non-existent address.

    • OVER_QUERY_LIMIT: indicates that you are over your quota.

    • REQUEST_DENIED: indicates that your request was denied.

    • INVALID_REQUEST: indicates that some part of the query (address, URL components, etc.) is missing.

    • UNKNOWN_ERROR: indicates that the request could not be processed due to a server error. The request may succeed if you try again.

    • INVALID_SIGNATURE: This response is generated by error-handling within the placement package. For Google for Work requests, this error (usually) indicates that the signature associated with the geocode request was invalid.

    • CONNECTION_ERROR: This status is generated by the package's internal error-handling, and suggests a connection error occurred while fetching the URL(s) (e.g., due to intermittent internet connectivity or problems with the Google Maps servers).

  • error_message: Any error messages associated with the API call (e.g. connection timeouts, signature errors, etc.).

  • locations: character; the user supplied values in the address parameter (this is returned for matching/verification).

  • input_url: character; the full URL associated with the response object.

  • address: character; the user supplied physical address (prior to Google's formatting).
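
The status and location_type fields lend themselves to simple post-processing. The sketch below uses a mock data frame with made-up values rather than a real API response. The coordinate column names lat and lng are assumptions, and the choice of which statuses count as retryable is a judgment call, not something the package prescribes.

```r
# Illustrative only: a mock result set using the fields described above.
# The coordinate column names `lat` and `lng` are assumptions, and all
# values are made up for the example.
coordset <- data.frame(
  lat = c(40.7484, NA, 37.4220, NA),
  lng = c(-73.9857, NA, -122.0841, NA),
  location_type = c("ROOFTOP", NA, "APPROXIMATE", NA),
  status = c("OK", "ZERO_RESULTS", "OK", "CONNECTION_ERROR"),
  stringsAsFactors = FALSE
)

# Statuses that may succeed on a later attempt (an assumed grouping).
retryable <- c("OVER_QUERY_LIMIT", "UNKNOWN_ERROR", "CONNECTION_ERROR")

# Keep only successful, street-level results.
precise <- c("ROOFTOP", "RANGE_INTERPOLATED")
usable <- coordset[coordset$status == "OK" &
                   coordset$location_type %in% precise, ]

# Rows worth sending again later.
to_retry <- coordset[coordset$status %in% retryable, ]

print(usable)
print(to_retry)
```

Rows with ZERO_RESULTS are left out of both sets here, since resubmitting a non-existent address unchanged is unlikely to help.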

Examples

# Get coordinates for the Empire State Building and Google
address <- c("350 5th Ave, New York, NY 10118, USA",
             "1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA")

coordset <- geocode_url(address, auth = "standard_api", privkey = "",
                        clean = TRUE, add_date = "today", verbose = TRUE)

# View the returns
print(coordset[ , 1:5])