read_dkan: DKAN Datastore API Function

Description

This function provides an interface to the datastore API of any DKAN-based data portal (such as the California Open Data Portal at data.ca.gov), so that queries can be run and data downloaded programmatically and in real time.
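To make concrete what a datastore query might look like, the sketch below assembles a request URL by hand. It assumes the DKAN 7.x-style endpoint `/api/action/datastore/search.json` with `resource_id`, `filters[...]`, and `fields` parameters. The helper `build_dkan_url` is hypothetical and is not taken from the source of read_dkan, which may construct its requests differently.

```r
# Hypothetical helper (not part of dkanTools). The endpoint path and the
# parameter names are assumptions based on the DKAN 7.x datastore API.
build_dkan_url <- function(base_URL, resource_id, filters = list(), fields = NULL) {
  params <- paste0("resource_id=", resource_id)
  for (field in names(filters)) {
    # Percent-encode each value on its own, then join the values with commas
    encoded <- vapply(filters[[field]], utils::URLencode, character(1),
                      reserved = TRUE)
    params <- c(params,
                paste0("filters[", field, "]=", paste(encoded, collapse = ",")))
  }
  if (!is.null(fields)) {
    params <- c(params, paste0("fields=", paste(fields, collapse = ",")))
  }
  paste0(base_URL, "/api/action/datastore/search.json?",
         paste(params, collapse = "&"))
}

url <- build_dkan_url(
  base_URL    = "https://data.ca.gov",
  resource_id = "a731c980-9477-4ec7-bcfc-6d0cce00306c",
  filters     = list(PWSID = "CA3010037", Stage_Invoked = "Stage 1"),
  fields      = c("Supplier_Name", "PWSID")
)
print(url)
```

Under this assumed format, multiple values for one filter are separated by commas, which would be one plausible reason why a filter value that itself contains a comma can fail.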

Usage

read_dkan(base_URL = "https://data.ca.gov", resource_id, filter_fields = NA,
  filter_values = list(NA), fields = NA, query = NA, sort_field = NA,
  sort_direction = NA)

Arguments

base_URL

The base URL for the data portal (defaults to: https://data.ca.gov)

resource_id

An alphanumeric code representing a particular data resource (e.g., a731c980-9477-4ec7-bcfc-6d0cce00306c). It can be found on the Data API page for the resource (e.g., https://data.ca.gov/node/1801/api).

filter_fields

The name(s) of the field(s) to filter on. To filter on multiple fields, combine the field names with the c() function (e.g., c('field1', 'field2')).

filter_values

A list in which each element corresponds, in order, to an entry in filter_fields. Each element can hold multiple items. For example, list(c('Element1_Item1', 'Element1_Item2'), c('Element2_Item1', 'Element2_Item2')) is a 2-element list with 2 items per element; the first element applies to the first field given in filter_fields and the second element to the second field. Note that a filter value containing a comma may cause the filter to fail; in that case, the query argument may need to be used instead.
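The pairing between filter_fields and filter_values can be checked before a call is made. The following is a minimal sketch; `pair_filters` and `values_with_commas` are hypothetical helpers written for illustration and are not part of dkanTools.

```r
# Hypothetical helpers (not part of dkanTools).

# Name each element of filter_values after the field it applies to,
# failing early if the two arguments do not line up one-to-one.
pair_filters <- function(filter_fields, filter_values) {
  stopifnot(is.character(filter_fields), is.list(filter_values))
  if (length(filter_fields) != length(filter_values)) {
    stop("filter_values needs exactly one list element per filter field")
  }
  stats::setNames(filter_values, filter_fields)
}

# Return the filter values that contain a comma, since those may not
# work as filters.
values_with_commas <- function(filter_values) {
  values <- unlist(filter_values, use.names = FALSE)
  values[grepl(",", values, fixed = TRUE)]
}

filters <- pair_filters(
  filter_fields = c("field1", "field2"),
  filter_values = list(c("Element1_Item1", "Element1_Item2"),
                       c("Element2_Item1", "Element2_Item2"))
)
str(filters)

flagged <- values_with_commas(list("American Canyon, City of", "CA3010037"))
print(flagged)
```

Any value flagged by the second helper is a candidate for the query argument rather than a filter.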

fields

This specifies the fields (i.e., columns) of the dataset to return. If left blank, all of the dataset's fields will be returned.

query

A full-text search across all fields (i.e., this returns all records where the query text is found in any field).

sort_field

The field used to sort (i.e., order) the records returned by the query, in either ascending or descending order.

sort_direction

The method (i.e., direction) for sorting the given sort_field, either ascending or descending. Enter asc (i.e., ascending) or desc (i.e., descending).
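Since only asc and desc are meaningful here, the value can be validated before it is passed along. This is a small sketch using base R's match.arg; `check_sort_direction` is a hypothetical helper, and whether read_dkan performs a similar check internally is not stated in this documentation.

```r
# Hypothetical helper (not part of dkanTools): lower-cases the input and
# stops with an error unless it matches "asc" or "desc".
check_sort_direction <- function(sort_direction) {
  match.arg(tolower(sort_direction), choices = c("asc", "desc"))
}

direction <- check_sort_direction("ASC")
print(direction)
```

Note that match.arg also accepts unambiguous abbreviations, so an input such as "d" resolves to "desc".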

Value

This function returns a data frame containing all records for a given resource on a DKAN portal that match the criteria passed as arguments (filter_fields, filter_values, and/or query), limited to the columns named in fields if that argument is supplied.

Examples

# This filters for records where PWSID is 'CA3010037' and Stage_Invoked is 'Stage 1', and returns only 4 fields ('Supplier_Name', 'PWSID', '2013_Production_Reported', and 'Stage_Invoked').
# It assigns the results to a data frame called dkan_data.
dkan_data <- read_dkan(resource_id = 'a731c980-9477-4ec7-bcfc-6d0cce00306c', filter_fields = c('PWSID', 'Stage_Invoked'), filter_values = list(c('CA3010037'), c('Stage 1')), fields = c('Supplier_Name', 'PWSID', '2013_Production_Reported', 'Stage_Invoked'))

# This returns records from the dataset specified by resource ID *a731c980-9477-4ec7-bcfc-6d0cce00306c* where the text *American Canyon, City of* is found within any field,
# and the records that are returned are sorted on the Reporting_Month field in ascending order.
dkan_data <- read_dkan(resource_id = 'a731c980-9477-4ec7-bcfc-6d0cce00306c', query = 'American Canyon, City of', sort_field = 'Reporting_Month', sort_direction = 'asc')

# This is an example of accessing a data portal other than the California Open Data Portal (in this case, the Oakland Data Catalog, from OpenOakland)
dkan_data <- read_dkan(base_URL = 'http://data.openoakland.org', resource_id = 'aca3da67-a4e2-46a0-8727-1657fcdc0e1d', filter_fields = 'street', filter_values = list(c('HENRY', 'FILBERT', 'MYRTLE')))

CAWaterBoardDataCenter/dkanTools documentation built on May 8, 2019, 11:54 p.m.