parse_country: Parse country names to standardized form


View source: R/parse_country.R

Description

parse_country parses irregular country names to the ISO 3166-1 Alpha-2 code or other standardized code or name format.

Usage

parse_country(
  x,
  to = "iso2c",
  how = c("regex", "google"),
  language = c("en", "de"),
  factor = is.factor(x)
)

Arguments

x

A character or factor vector of country names to standardize

to

Format to which to convert. Defaults to "iso2c"; see codes for more options.

how

How to parse; defaults to "regex". "google" uses the Google Maps geocoding API. See "Details" for more information.

language

If how = "regex", the language from which to parse country names. Currently accepts "en" (default) and "de". Ignored if how = "google".

factor

If TRUE, returns a factor instead of a character vector. If not supplied, defaults to is.factor(x), so factor input yields factor output.
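The factor = is.factor(x) default relies on R evaluating default arguments lazily, inside the function. A minimal base-R sketch of that pattern follows; standardize() is a hypothetical stand-in and not part of passport.

```r
# Hypothetical function illustrating a default that depends on another argument.
# The default is.factor(x) is evaluated lazily, when `factor` is first used.
standardize <- function(x, factor = is.factor(x)) {
  out <- toupper(as.character(x))  # placeholder "standardization"
  if (factor) base::factor(out) else out
}

standardize(c("us", "de"))                         # character in, character out
standardize(factor(c("us", "de")))                 # factor in, factor out
standardize(factor(c("us", "de")), factor = FALSE) # override the default
```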

Details

parse_country tries to parse a character or factor vector of country names to a standardized form: by default, ISO 3166-1 Alpha-2 codes.

When how = "regex" (default), parse_country uses regular expressions to match irregular forms.
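As a rough illustration of regex-based matching, the base-R sketch below maps a few irregular spellings to codes. The pattern table and match_country() are hypothetical and far smaller than anything the package would use; this is not its actual implementation.

```r
# Hypothetical, deliberately tiny pattern table: names are ISO 3166-1 Alpha-2
# codes, values are case-insensitive regular expressions.
patterns <- c(
  US = "^u\\.?s\\.?(a\\.?)?$|united states",
  DE = "germany|deutschland",
  JP = "^japan$"
)

# Return the code of the first pattern that matches each input, or NA.
match_country <- function(x, patterns) {
  x <- as.character(x)
  out <- rep(NA_character_, length(x))
  for (code in names(patterns)) {
    hit <- is.na(out) & grepl(patterns[[code]], x, ignore.case = TRUE, perl = TRUE)
    out[hit] <- code
  }
  failed <- is.na(out) & !is.na(x)
  if (any(failed)) {
    # Fail gracefully: keep NA in the result and warn about unmatched inputs.
    warning("Could not parse: ", paste(unique(x[failed]), collapse = ", "))
  }
  out
}

match_country(c("United States", "USA", "U.S.", "us", "Deutschland"), patterns)
```

Unmatched inputs stay NA and trigger a warning, mirroring the behaviour described for how = "regex".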

If regular expressions are insufficient, how = "google" will use the Google Maps geocoding API instead, which permits a much broader range of input formats and languages. The API allows 2500 calls per day, and should thus be called judiciously. parse_country will make one call per unique input. For more calls, see options that allow passing an API key like ggmap::geocode() with output = "all" or googleway::google_geocode().
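Making one call per unique input can be sketched in base R as follows. slow_lookup() is a hypothetical stand-in that counts calls rather than contacting any API, and lookup_unique() is an assumed helper name, not a passport function.

```r
# Hypothetical stand-in for a rate-limited geocoding request; it counts calls
# instead of contacting any API.
n_calls <- 0
slow_lookup <- function(name) {
  n_calls <<- n_calls + 1
  toupper(substr(name, 1, 2))  # placeholder result, not a real country code
}

# Look up each distinct non-missing value once, then expand back to the
# length and order of the input.
lookup_unique <- function(x, lookup) {
  x <- as.character(x)
  keys <- unique(x[!is.na(x)])
  results <- vapply(keys, lookup, character(1), USE.NAMES = FALSE)
  results[match(x, keys)]
}

lookup_unique(c("france", "japan", "france", "france", NA), slow_lookup)
n_calls  # counts one call per distinct non-missing input
```

With a daily quota, deduplicating first means a long vector with many repeated names costs only as many calls as it has distinct values.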

Note that because the API is so flexible, it may return unexpected results rather than fail, e.g. parse_country("foo", how = "google") returns "CH", whereas how = "regex" fails gracefully with NA and a warning.

Value

A character vector or factor of ISO 3166-1 Alpha-2 country codes, or of the other codes or names specified by to. Warns of any parsing failure.

Examples

parse_country(c("United States", "USA", "U.S.", "us", "United States of America"))

## Not run: 
# Unicode support for parsing accented or non-Latin scripts
parse_country(c("\u65e5\u672c", "Japon", "\u0698\u0627\u067e\u0646"), how = "google")
#> [1] "JP" "JP" "JP"

# Parse distinct place names via geocoding APIs
parse_country(c("1600 Pennsylvania Ave, DC", "Eiffel Tower"), how = "google")
#> [1] "US" "FR"

## End(Not run)

passport documentation built on Nov. 8, 2020, 4:28 p.m.