View source: R/address_cleaner.R
Performs character transformations on a vector of addresses in order to build "web-safe" URLs for the Google API.
address_cleaner(address, verbose = TRUE)
address
A raw 1xN vector of UTF-8 encoded addresses. Note: these addresses should be in raw form, not URL encoded (e.g., of the form: 123 Main Street, Somewhere, NY 12345 USA). The country is optional but recommended.
verbose
Displays additional progress output.
This function strips character values from a vector of addresses (e.g., a vector of the form: address, city, state, postal code, country) that may inhibit successful geocoding with the Google Maps API. Specifically, address_cleaner:
Replaces non-breaking spaces with " "
Removes ASCII control characters (001-031 and 177)
Trims runs of spaces and spaces which begin/end a string
Converts special addressing characters, such as ordinals
Removes single/double quotes and asterisks
Strips latin1 characters
Removes leading, trailing, and repeated commas
Removes various permutations of the "c/o" flag
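A few of the steps listed above can be approximated in base R. The sketch below is only an illustration under stated assumptions: clean_sketch is a hypothetical helper, not the implementation of address_cleaner, and it covers only the non-breaking space, control character, quote/asterisk, comma, and space steps.

```r
# Hypothetical helper: a rough base-R approximation of some of the listed
# steps. This is NOT the package's implementation of address_cleaner.
clean_sketch <- function(address) {
  x <- gsub("\u00a0", " ", address, fixed = TRUE)  # non-breaking space -> space
  x <- gsub("[[:cntrl:]]", "", x)                  # drop control characters
  x <- gsub("[\"'*]", "", x)                       # drop quotes and asterisks
  x <- gsub(" *,( *,)+", ",", x)                   # collapse repeated commas
  x <- gsub("^[ ,]+|[ ,]+$", "", x)                # trim leading/trailing commas and spaces
  x <- gsub(" {2,}", " ", x)                       # collapse runs of spaces
  x
}

# Made-up input for illustration only.
input <- c("  123  Main St,, Springfield ,",
           "\u00a0\"42\" Elm* St\t, Town")
clean_sketch(input)
```

The order of the substitutions matters in this sketch: commas are collapsed and trimmed before runs of spaces are squeezed, so that spaces left behind by removed characters are cleaned up last.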
Note: Hyphenated addresses and ZIP codes can cause issues with the Maps API. Therefore, prior to applying this function and attempting to geocode a location, we recommend:
Deleting the second half of a compound US postal code, e.g. gsub("(?<=\\d)-.*", "", "12345-1234", perl = TRUE)
Replacing hyphenated street numbers with a space followed by a pound sign, e.g. gsub("(?<=\\d)-(?=\\d)", " #", "1234-3332 West 100th", perl = TRUE)
Both of these transformations, of course, presuppose that your postal codes and street numbers exist in separate columns. Similarly, you may want to recode any "CA" country code fields to "Canada" to avoid inaccurate geocoding within the state of California (this is more likely to occur when a Canadian address has non-standard features, such as 'c/o' or 'attn' fields).
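As a minimal sketch of this pre-processing, assuming the address parts live in separate columns of a data frame, the recommendations could be applied as follows. The data frame addr, its column names, and its values are hypothetical and chosen only for illustration.

```r
# Hypothetical data frame; the column names and values are assumptions.
addr <- data.frame(
  street  = c("1234-3332 West 100th", "77 King St W"),
  city    = c("Somewhere", "Toronto"),
  region  = c("NY", "ON"),
  postal  = c("12345-1234", "M5K 1A1"),
  country = c("USA", "CA"),
  stringsAsFactors = FALSE
)

# Keep only the first part of a compound US postal code.
addr$postal <- gsub("(?<=\\d)-.*", "", addr$postal, perl = TRUE)

# Replace a hyphen between digits in the street number with " #".
addr$street <- gsub("(?<=\\d)-(?=\\d)", " #", addr$street, perl = TRUE)

# Spell out the Canadian country code so it is not read as California.
addr$country[addr$country == "CA"] <- "Canada"

# Assemble one address string per row, ready to pass to address_cleaner().
full_address <- paste(addr$street, addr$city, addr$region,
                      addr$postal, addr$country, sep = ", ")
```

The lookbehind and lookahead assertions require perl = TRUE, and the backslash in \\d must be doubled inside an R string literal.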
address_cleaner returns a character vector of addresses of the same length as the input.
# Define an incompatible vector of addresses
address <- c(" 350 Fifth Ave \u00bd, New York, NY 10118, USA ",
             " \u00ba 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA")
# View the return:
address_cleaner(address)
* Replacing non-breaking spaces
* Removing control characters
* Removing leading/trailing spaces, and runs of spaces
* Transliterating latin1 characters
* Converting special address markers
* Removing all remaining non-ASCII characters
* Remove single/double quotes and asterisks
* Removing leading, trailing, and repeated commas
* Removing various c/o string patterns
[1] "350 Fifth Ave 1/2, New York, NY 10118, USA"
[2] "o 1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA"