get_data-deprecated: Get data from Statistics Netherlands (CBS)

get_data-deprecatedR Documentation

Get data from Statistics Netherlands (CBS)

Description

This method is deprecated in favor of cbs_get_data()

Usage

get_data(
  id,
  ...,
  recode = TRUE,
  use_column_title = recode,
  dir = tempdir(),
  base_url = getOption("cbsodataR.base_url", BASE_URL)
)

Arguments

id

Identifier of table, can be found in cbs_get_datasets()

...

optional filter statements, see details.

recode

recodes all codes in the code columns with their Title as found in the metadata

use_column_title

not used.

dir

Directory where the table should be downloaded. Defaults to temporary directory

base_url

optionally specify a different server. Useful for third party data services implementing the same protocol, see details.

Details

To reduce the download time, optionaly the data can be filtered on category values: for large tables (> 100k records) this is a wise thing to do.

The filter is specified with (see examples below):

  • ⁠<column_name> = <values>⁠ in which ⁠<values>⁠ is a character vector. Rows with values that are not part of the character vector are not returned. Note that the values have to be values from the ⁠$Key⁠ column of the corresponding meta data. These may contain trailing spaces...

  • ⁠<column_name> = has_substring(x)⁠ in which x is a character vector. Rows with values that do not have a substring that is in x are not returned. Useful substrings are "JJ", "KW", "MM" for Periods (years, quarters, months) and "PV", "CR" and "GM" for Regions (provinces, corops, municipalities).

  • ⁠<column_name> = eq(<values>) | has_substring(x)⁠, which combines the two statements above.

By default the columns will be converted to their type (typed=TRUE). CBS uses multiple types of missing (unknown, surpressed, not measured, missing): users wanting all these nuances can use typed=FALSE which results in character columns.

Value

data.frame with the requested data. Note that a csv copy of the data is stored in dir.

Specify different server

Besides the official CBS data, there are also third party and preview dataservices implementing the same protocol. The base_url parameter allows to specify a different server. The base_url can either be specified explicitly or set globally with with options(cbsodataR.base_url = "http://example.com"). Some further tweaking may be necessary for third party services, a download url is constructed using: either with:

  • ⁠<base_url>/<BULK>/<id>/...⁠ for data

  • ⁠<base_url>/<API>/<id>/?$format=json⁠ for metadata

Default values for BASEURL, BULK and API are set in the package options, but can be changed with:

options(
 cbsodataR.base_url = "https://opendata.cbs.nl",
 cbsodataR.BULK = "ODataFeed/odata",
 cbsodataR.API = "ODataAPI/odata"
)

which are the default values set in the package.

Copyright use

The content of CBS opendata is subject to Creative Commons Attribution (CC BY 4.0). This means that the re-use of the content is permitted, provided Statistics Netherlands is cited as the source. For more information see: https://www.cbs.nl/en-gb/about-us/website/copyright

Note

All data are downloaded using cbs_download_table()

See Also

cbs_get_meta(), cbs_download_data()

Other data retrieval: cbs_add_date_column(), cbs_add_label_columns(), cbs_add_unit_column(), cbs_download_data(), cbs_extract_table_id(), cbs_get_data_from_link()

Other query: eq(), has_substring()

Examples

## Not run: 
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = "2000MM03"     # March 2000
            , CPI     = "000000"       # Category code for total 
            )

# useful substrings:
## Periods: "JJ": years, "KW": quarters, "MM", months
## Regions: "NL", "PV": provinces, "GM": municipalities
  
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = has_substring("JJ")     # all years
            , CPI     = "000000"       # Category code for total 
            )

cbs_get_data( id      = "7196ENG"      # table id
            , Periods = c("2000MM03","2001MM12")     # March 2000 and Dec 2001
            , CPI     = "000000"       # Category code for total 
            )

# combine either this
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = has_substring("JJ") | "2000MM01" # all years and Jan 2001
            , CPI     = "000000"       # Category code for total 
            )

# or this: note the "eq" function
cbs_get_data( id      = "7196ENG"      # table id
            , Periods = eq("2000MM01") | has_substring("JJ") # Jan 2000 and all years
            , CPI     = "000000"       # Category code for total 
            )

## End(Not run)

edwindj/cbsodataR documentation built on Oct. 17, 2024, 3:35 a.m.