cbs4_get_data: Get data from CBS

cbs4_get_data    R Documentation

Get data from CBS

Description

Get the data of a CBS opendata table by its table id. The data is returned in so-called wide format: each Measure has its own column.

Usage

cbs4_get_data(
  id,
  catalog = "CBS",
  ...,
  query = NULL,
  name_measure_columns = TRUE,
  show_progress = interactive() && !verbose,
  download_dir = file.path(tempdir(), id),
  verbose = getOption("cbsodata4.verbose", FALSE),
  sep = ",",
  as.data.table = FALSE,
  base_url = getOption("cbsodata4.base_url", BASEURL4)
)

Arguments

id

Identifier of the Opendata table. Can be retrieved with cbs4_get_datasets().

catalog

Catalog in which the dataset is to be found.

...

Optional selections on the data, passed through to cbs4_download(). See Examples.

query

Optional query in OData 4 syntax. If supplied, it overrides any selection specified in ...

name_measure_columns

Logical. If TRUE, the Title of each measure is used as the name of its column.

show_progress

If TRUE, shows the progress of the data download. Cannot be used together with verbose.

download_dir

Directory in which the data and metadata are downloaded. By default this is a temporary directory, but it can be set manually.

verbose

if TRUE prints the steps taken to retrieve the data.

sep

separator to be used to download the data.

as.data.table

logical, should the result be of type data.table?

base_url

Alternative URL of a service that implements the same protocol.

Details

The returned data.frame() has the following columns:

  • For each dimension a separate column with category identifiers. Category labels can be added with cbs4_add_label_columns() or found in cbs4_get_metadata(). Date columns can be added with cbs4_add_date_column().

  • For each Measure / Topic a separate column with values. Units can be found in cbs4_get_metadata() (MeasureCodes).

For long format instead of wide format, see cbs4_get_observations(), which returns one Measure column and one Value column.
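
To make the difference between the two layouts concrete, here is a minimal sketch in base R. The values and the measure codes M1 and M2 are made up for illustration and no CBS table is queried; the pivot with reshape() only mimics the shape of the result and is not how the package builds it.

```r
# Toy long-format data: one Measure column and one Value column
# (hypothetical codes and values, not real CBS data)
obs <- data.frame(
  Perioden = c("2019MM12", "2019MM12", "2020MM01", "2020MM01"),
  Measure  = c("M1", "M2", "M1", "M2"),
  Value    = c(10, 20, 11, 21),
  stringsAsFactors = FALSE
)

# Pivot to wide format: one row per Perioden, one column per Measure
wide <- reshape(obs, idvar = "Perioden", timevar = "Measure",
                direction = "wide")

# reshape() names the new columns "Value.M1", "Value.M2"; drop the prefix
names(wide) <- sub("^Value\\.", "", names(wide))
rownames(wide) <- NULL

print(wide)
```

The long table has four rows (one per Perioden and Measure combination); the wide table has two rows with the columns Perioden, M1 and M2.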

Value

A data.frame() or data.table() object. See Details.

See Also

cbs4_get_metadata()

Other data-download: cbs4_download(), cbs4_get_observations()

Examples

if (interactive()){

  # filter on Perioden (see meta$PeriodenCodes)
  cbs4_get_data("84287NED"
               , Perioden = "2019MM12" # december 2019
               )

  # filter on multiple Perioden (see meta$PeriodenCodes)
  cbs4_get_data("84287NED"
               , Perioden = c("2019MM12", "2020MM01") # december 2019, january 2020
               )

  # to filter on an additional dimension, just add another filter to the call

  # filter on Perioden and BedrijfstakkenBranchesSBI2008
  cbs4_get_data("84287NED"
               , Perioden = "2019MM12" # december 2019
               , BedrijfstakkenBranchesSBI2008 = "T001081"
               )


  # filter on Perioden with contains
  cbs4_get_data("84287NED"
                , Perioden = contains("2020")
                , BedrijfstakkenBranchesSBI2008 = "T001081"
  )

  # filter on Perioden with multiple contains
  cbs4_get_data("84287NED"
                , Perioden = contains(c("2019MM1", "2020"))
                , BedrijfstakkenBranchesSBI2008 = "T001081"
  )

  # filter on Perioden with contains or equal to "2019MM12"
  cbs4_get_data("84287NED"
                , Perioden = contains("2020") | "2019MM12"
                , BedrijfstakkenBranchesSBI2008 = "T001081"
  )

  # This all works on observations too
  cbs4_get_observations( id        = "80784ned"     # table id
                       , Perioden  = "2019JJ00"     # Year 2019
                       , Geslacht  = "1100"         # code for total gender
                       , RegioS    = contains("PV") # provinces
                       , Measure   = "M003371_2"    # topic selection
                       )

  # supply your own odata 4 query
  cbs4_get_data("84287NED", query = "$filter=Perioden eq '2019MM12'")

  # an odata 4 query will overrule other filter statements
  cbs4_get_data("84287NED"
               , Perioden = "2018MM12"
               , query = "$filter=Perioden eq '2019MM12'"
               )

  # With the query argument, an OData 4 expression with other (filter) functions can be used
  cbs4_get_observations(
    id     = "80784ned"    # table id
    ,query = paste0(       # odata4 query
       "$skip=4",          # skip the first 4 rows of the filtered result
       "&$top=20",         # then slice the first 20 rows of the filtered result
       "&$select=Measure,Geslacht,Perioden,RegioS,Value", # omit the Id and ValueAttribute fields
       "&$filter=endswith(Measure,'_1')") # keep only Measures ending in '_1'
    )
    )

}
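
The examples above show that a selection such as Perioden = "2019MM12" and a hand-written query "$filter=Perioden eq '2019MM12'" can express the same request. As a rough illustration of that correspondence, the sketch below builds an OData 4 filter string from named selections. build_filter() is a hypothetical helper written for this page, not a function of the package, and the rule it applies (values of one dimension joined with "or", dimensions joined with "and") is an assumption about a plausible translation, not the package's internal implementation.

```r
# Hypothetical helper (not part of the package): turn named selections
# into an OData 4 $filter string that can be passed as `query`.
build_filter <- function(...) {
  sel <- list(...)
  clauses <- vapply(names(sel), function(nm) {
    # one "eq" comparison per requested code
    parts <- sprintf("%s eq '%s'", nm, sel[[nm]])
    # several codes for the same dimension are alternatives
    if (length(parts) > 1) {
      paste0("(", paste(parts, collapse = " or "), ")")
    } else {
      parts
    }
  }, character(1))
  # different dimensions must all match
  paste0("$filter=", paste(clauses, collapse = " and "))
}

q <- build_filter(
  Perioden = c("2019MM12", "2020MM01"),
  BedrijfstakkenBranchesSBI2008 = "T001081"
)
print(q)
```

The resulting string is "$filter=(Perioden eq '2019MM12' or Perioden eq '2020MM01') and BedrijfstakkenBranchesSBI2008 eq 'T001081'", which could then be supplied as cbs4_get_data("84287NED", query = q).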

statistiekcbs/cbsccb documentation built on April 8, 2022, 2:38 a.m.