knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
options(tibble.print_min = 4L, tibble.print_max = 4L)
set.seed(42)
library(DT)
library(tidygeocoder)
library(gt)
library(dplyr)
The supported geocoding services are shown in the table below. The method argument is used to select the geocoding service in tidygeocoder functions such as geo() and reverse_geo(). The usage rate limitations are listed for the free tier of the service when applicable, and many services have faster rates available with paid plans.
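As a minimal sketch of how the method argument picks a service, the calls below set up a forward and a reverse lookup against two different services. The address and coordinates are arbitrary illustrations. The no_query = TRUE argument is set so that tidygeocoder only reports the query it would send without contacting the service; remove it to run a real query.

```r
library(tidygeocoder)

# Forward geocoding with Nominatim ("osm"); no_query = TRUE sends nothing
forward <- geo(address = "Tokyo, Japan", method = "osm", no_query = TRUE)

# Reverse geocoding with ArcGIS; only the method value selects the service
reverse <- reverse_geo(lat = 35.68, long = 139.77, method = "arcgis", no_query = TRUE)
```

Switching services is then a matter of changing the method value, subject to each service's own requirements such as API keys.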
Also note that there are many other considerations when selecting a geocoding service, such as whether the service uses open source data with permissive licensing, how the service uses or stores your data, and whether there are restrictions on how you can use the data it provides. Refer to each service's documentation for details.
library(dplyr)

check_mark <- "\U2705" # unicode character for heavy white check mark

geocoder_summary_table <- tidygeocoder::api_info_reference %>%
  mutate(
    service = paste0('[', method_display_name, '](', site_url, ')'),
    batch_geocoding = ifelse(method %in% names(tidygeocoder:::batch_func_map), check_mark, ''),
    api_key_required = ifelse(method %in% tidygeocoder::api_key_reference[['method']], check_mark, ''),
    api_documentation = paste0('[docs](', api_documentation_url, ')')
  ) %>%
  left_join(tidygeocoder::min_time_reference %>% select(method, description), by = 'method') %>%
  select(service, method, api_key_required, batch_geocoding, usage_limitations = description, api_documentation) %>%
  mutate(across(method, function(x) stringr::str_c('`', x, '`'))) %>% # format method column
  tidyr::replace_na(list(usage_limitations = ''))

# Format column names
colnames(geocoder_summary_table) <- colnames(geocoder_summary_table) %>%
  stringr::str_replace_all('_', ' ') %>%
  stringr::str_to_title() %>%
  stringr::str_replace_all('Api', 'API')

geocoder_summary_table %>%
  knitr::kable()
Highlights:
Due diligence must be exercised when geocoding sensitive data, because tidygeocoder uses third-party web services to perform geocoding. In healthcare settings, sending patient or study subject address data to a third-party geocoding service risks violating the privacy rules of Institutional Review Boards (IRBs) and HIPAA.
Further details on possible risk are described here. Refer to the documentation on your selected geocoding service (see links above) for information on how your data will be utilized and stored.
Some options you could consider if the privacy of your data is a concern:
- Geocodio offers a HIPAA-compliant endpoint, which is selected with the api_options = list(geocodio_hipaa = TRUE) parameter.
- Nominatim (method = "osm") can be installed and hosted locally so that data does not leave your local network. See the Nominatim website for installation instructions. You can use a locally hosted Nominatim service with tidygeocoder by specifying its address with the api_url parameter.

See the geo() or reverse_geo() documentation pages for more documentation on the parameters mentioned above.
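For the self-hosted Nominatim option listed above, a call might look like the sketch below. The host, port, and /search path are assumptions about a hypothetical local installation, so substitute the address of your own server. The no_query = TRUE argument is kept so the example only displays the query instead of sending it.

```r
library(tidygeocoder)

# Hypothetical address of a locally hosted Nominatim server (assumption)
local_nominatim <- "http://localhost:8080/search"

result <- geo(
  address = "1600 Pennsylvania Ave NW, Washington, DC",
  method = "osm",
  api_url = local_nominatim,
  no_query = TRUE # drop this to query the local server for real
)
```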
- […] types parameter to be set if limit > 1. See #104.
- […] limit parameter (#106).
- The ArcGIS service has an outFields parameter, which specifies which fields are to be returned. As of tidygeocoder v1.0.6 this is set to * (all fields). To return only default fields, use the following parameter in your query: custom_query = list(outFields=''). See #177 for more details.
- An API key can be passed to the ArcGIS service with the custom_query parameter:

tidygeocoder::geo(address = "New York, USA", method = "arcgis", custom_query = list(token = "<API_KEY>"))
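To illustrate the outFields override mentioned above, the sketch below builds a query that asks for default fields only. The address is an arbitrary example, full_results = TRUE is included on the assumption that you want to inspect which fields come back, and no_query = TRUE keeps the call from contacting the service.

```r
library(tidygeocoder)

default_fields <- geo(
  address = "New York, USA",
  method = "arcgis",
  full_results = TRUE,                 # return every field the service sends back
  custom_query = list(outFields = ''), # request default fields instead of "*"
  no_query = TRUE                      # remove to send the query
)
```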
The api_parameter_reference maps the API parameters for each geocoding service to a common set of "generic" parameters. The generic_name below is the generic parameter name, while the api_name is the parameter name for the specified geocoding service (method). Refer to ?api_parameter_reference for more details.
api_parameter_reference %>%
  mutate(across(c(method, generic_name, api_name), as.factor)) %>%
  datatable(
    filter = 'top',
    rownames = FALSE,
    options = list(
      lengthMenu = c(5, 10, 15, 20, nrow(.)),
      pageLength = 10,
      autoWidth = TRUE
    )
  )
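The same mapping can be queried directly rather than browsed. As a small sketch, the snippet below lists what each service calls the generic limit parameter; the choice of "limit" is only an example, and any value of generic_name could be used.

```r
library(tidygeocoder)
library(dplyr)

# Service-specific names of the generic "limit" parameter
limit_names <- api_parameter_reference %>%
  filter(generic_name == "limit") %>%
  select(method, api_name)

limit_names
```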
API keys are retrieved from environment variables. The name of the environment variable used for each service is stored in the api_key_reference dataset. See ?api_key_reference.
api_key_reference %>%
  gt() %>%
  opt_table_outline() %>%
  opt_table_lines() %>%
  tab_options(column_labels.font.weight = 'bold')
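A minimal sketch of setting a key for the current R session follows. It assumes the dataset's columns are named method and env_var, and it uses Geocodio purely as an example; "<API_KEY>" is a placeholder rather than a working key.

```r
library(tidygeocoder)

# Look up the environment variable that tidygeocoder reads for Geocodio
key_var <- api_key_reference$env_var[api_key_reference$method == "geocodio"]

# Set it for the current session; "<API_KEY>" is a placeholder, not a real key
do.call(Sys.setenv, setNames(list("<API_KEY>"), key_var))

# Confirm the variable is now visible to R
nzchar(Sys.getenv(key_var))
```

To keep a key across sessions, the same variable can be placed in an .Renviron file instead of being set in code.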
The minimum time (in seconds) required per query to comply with the usage limitations policies of each geocoding service is stored in the min_time_reference dataset. See ?min_time_reference.
min_time_reference %>%
  gt() %>%
  opt_table_outline() %>%
  opt_table_lines() %>%
  tab_options(column_labels.font.weight = 'bold')
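If you want to query more slowly than the default, the min_time argument of geo() can be set explicitly. The sketch below assumes the dataset has method and min_time columns, doubles the default for Nominatim as an arbitrary example, and uses no_query = TRUE so nothing is sent.

```r
library(tidygeocoder)

# Default minimum seconds per query for Nominatim ("osm")
osm_min_time <- min_time_reference$min_time[min_time_reference$method == "osm"]

# Slow queries to twice the default; going below the default may violate the usage policy
slow <- geo(
  address = c("Paris, France", "Berlin, Germany"),
  method = "osm",
  min_time = 2 * osm_min_time,
  no_query = TRUE # remove to send the queries
)
```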
Links to the usage policies for each geocoding service:
cat(tidygeocoder:::get_api_usage_bullets(), sep = '\n')
The maximum number of inputs (geographic coordinates or addresses) per batch query for each geocoding service is stored in the batch_limit_reference dataset. See ?batch_limit_reference.
batch_limit_reference %>%
  gt() %>%
  fmt_number(columns = 'batch_limit', decimals = 0) %>%
  opt_table_outline() %>%
  opt_table_lines() %>%
  tab_options(column_labels.font.weight = 'bold')
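When an input set may exceed a service's limit, one possible approach is to split it into chunks beforehand. The sketch below assumes the dataset has method and batch_limit columns, uses made-up addresses, and caps the chunk size at 10 only so that the split is visible with a small example.

```r
library(tidygeocoder)

# Maximum inputs per batch query for the US Census service
census_limit <- batch_limit_reference$batch_limit[batch_limit_reference$method == "census"]

# Hypothetical input addresses (illustration only)
addresses <- paste(seq_len(25), "Main St, Springfield, IL")

# Cap the chunk size at 10 for illustration; in practice use census_limit directly
chunk_size <- min(census_limit, 10)

# Group consecutive addresses into chunks of at most chunk_size
chunks <- split(addresses, ceiling(seq_along(addresses) / chunk_size))

lengths(chunks)
```

Each chunk could then be passed to geo() with mode = "batch" as a separate query.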