source("R/setup.R")$value
if (!sc_key_exists()) knitr::opts_chunk$set(eval = FALSE) show_json <- function(x) { paste( "<details>", "<summary>Show json request</summary>", "", "```r", "x$json", "```", "", x$json$.__enclos_env__$private$json_content %>% paste("```json\n", . , "\n```") %>% knitr::asis_output(), "", "</details>", sep = "\n" ) %>% knitr::asis_output() }
The function sc_table_custom()
allows you to define requests against the /table
endpoint programmatically.
This can be useful to automate the generation of /table
request rather than relying on the GUI to do so.
The function accepts the four arguments.
MEASURE
, STAT_FUNCTION
or COUNT
)FIELD
or VALUESET
)The first part of this Article will showcase how custom tables can be created with a database about tourism. This database will also be used in most other examples of this article.
First, we want to just send the database id to sc_table_custom()
.
This will request only the mandatory fields and default measures for that database.
In case of the tourism database, a table with one single row is returned.
database <- "str:database:detouextregsai" x <- sc_table_custom(database) x$tabulate()
show_json(x)
We see that r format(x$tabulate()[[2]], big.mark = " ")
nights were spent in Austrian tourism establishments in the month of r x$tabulate()[[1]]
.
Now we want to add a classification to the table. This can be done by getting the database schema and showing all classification fields.
tourism <- sc_schema_db(database) (fields <- sc_schema_flatten(tourism, "FIELD"))
If we want to add "Country of origin" we need to include the fourth entry of the id
column in our request.
x <- sc_table_custom(tourism, dimensions = fields$id[4]) x$tabulate()
show_json(x)
Alternatively, we could also pass the schema object for "country of origin".
origin <- tourism$`Other Classifications`$`Country of origin` x <- sc_table_custom(tourism, dimensions = origin)
The dimensions
parameter in sc_schema_custom()
accepts vectors of field ids.
Therefore, we can add the communes easily.
x <- sc_table_custom(tourism, dimensions = fields$id[c(2, 4)]) x$tabulate()
show_json(x)
Currently, the table only returns the default measure for the database which is the number of nights spent. We can add a second measure by again using the database schema and passing a measure id
(measures <- sc_schema_flatten(tourism, "MEASURE"))
We can add both measures to the request by using measures$id
.
Just like the dimensions
parameter, the measures
parameters accepts vectors of resource ids.
x <- sc_table_custom(tourism, measures = measures$id, dimensions = fields$id[c(2, 4)]) x$tabulate()
show_json(x)
We can see in the GUI that "Country of origin" is a hierarchical classification. If we look at the table above, only the top level of the hierarchy (Austria, Germany, other) is used. This can be changed by providing the the value-set that corresponds to the more granular classification of "country of origin"
knitr::include_graphics("img/hierarchical_classification.png")
The different value-sets for "country of origin" can be compared by browsing the database schema.
(valuesets <- tourism$`Other Classifications`$`Country of origin`)
We can see that the two levels of the hierarchy are represented by the two value-sets. The value-set "Herkunftsland" uses 3 classification elements and represents the top level of the hierarchy (Austria, Germany, Other). The value-set "Country of origin" uses 87 (10+8+69) classification elements and is the bottom level of the hierarchy. For classifications with more levels of hierarchies, more value-sets will be present.
We will now use the id for the first value-set in the dimensions
parameter of sc_table_custom
.
x <- sc_table_custom( db = tourism, measures = measures$id, dimensions = valuesets$`Country of origin` ) x$tabulate()
show_json(x)
It is possible to use a mixture of value-sets and fields in the dimensions
parameter.
Instead of Measures and Value-sets, it is also possible to provide counts
in the measure
parameter of sc_table_custom()
.
population <- sc_schema_db("debevstand") (count <- population$`Datensätze/Records`$`F-BEVSTAND`) x <- sc_table_custom(population, count) x$tabulate()
show_json(x)
Data can be filtered on the server side by using the recodes
parameter
of sc_table_custom()
.
This might be more complicated than filtering the data in R but offers some
important advantages.
?sc_rate_limits
),
STATcube also
limits the amount of cells that can be fetched per user.
Filtering data can be useful to preserve this quota.As an example for filtering data, we can request a table from the tourism
database and only select some countries for Country of origin
.
origin <- tourism$`Other Classifications`$`Country of origin`$`Country of origin` month <- tourism$`Mandatory fields`$`Season/Tourism Month`$`Season/Tourism Month` x <- sc_table_custom( db = tourism, measures = measures$id, dimensions = list(month, origin), recodes = sc_recode(origin, list(origin$`Italy <29>`, origin$`Germany <12>`)) ) x$tabulate()
show_json(x)
This table only contains two countries rather than
r length(valuesets[["Country of origin"]])-4
so the amount of cells in the
table is also 40 times less compared to a table that would omit this filter.
Other options from the recodes specification are also available via sc_recode()
.
It is possible to group items and specify recodes for several classifications.
x <- sc_table_custom( db = tourism, measures = measures$id, dimensions = list(month, origin), recodes = c( sc_recode(origin, list( list(origin$`Germany <12>`, origin$`Netherlands <25>`), list(origin$`Italy <29>`, origin$`France (incl.Monaco) <14>`) )), sc_recode(month, list( month$Nov.99, month$Feb.00, month$Apr.09, month$`Jan. 22` )) ) ) x$tabulate()
show_json(x)
This table contains data for two country-groups and two months. In this case, the cell values for Germany and the Netherlands are just added to calculate the entries for Arrivals and Nights spent. However, in other cases STATcube might decide it is more appropriate to use weighted means or other more complicated aggregation methods.
The total
parameter in sc_recode()
can be used to request totals for
classifications. As an example, let's look at the tourism activity in the
capital cities of Austria
destination <- tourism$`Other Classifications`$`Tourism commune [ABO]`$ `Regionale Gliederung (Ebene +1)` x <- sc_table_custom( tourism, measures = measures$id, dimensions = list(month, destination), recodes = c( sc_recode(destination, total = TRUE, list( destination$Wien, destination$`Stadt Salzburg`, destination$Linz)), sc_recode(month, total = FALSE, list(month$Nov.99, month$Apr.09)) ) ) as.data.frame(x)
show_json(x)
We see that there are two rows in the table where Tourism commune is set to "Total". The corresponding values represent the sum of all Arrivals or Nights spent in either of these three cities during that month.
To use a recode that includes several hierarchy levels, the corresponding
FIELD
should be used as the first parameter of sc_recode()
.
For example, a recode with countries and federal states from the "Country of origin"
classification can be defined as follows.
origin1 <- tourism$`Other Classifications`$`Country of origin` origin2 <- origin1$`Country of origin` origin3 <- origin1$`Herkunftsland (Ebene +1)` x <- sc_table_custom( tourism, measures$id, origin1, recodes = sc_recode(origin1, list( origin2$`Vienna <01>`, origin3$Germany, list(origin2$`Bavaria (beg.05/03) <80>`, origin3$`other countries`)) ) ) x$tabulate()
show_json(x)
Since custom tables can become quite complicated, sc_table_custom()
performs
type-checks before sending the request to the API.
If inconsistencies are detected, warnings will be generated.
See ?sc_table_custom
for a comprehensive list of the performed checks.
sc_table_custom(tourism, measures = tourism, dry_run = TRUE)
Advanced example
sc_table_custom("A", measures = "B", dimensions = "C", recodes = sc_recode("D", "E"), dry_run = TRUE)
If dry_run
is set to FALSE
(the default), STATcubeR will send the request
to the API even if inconsistencies are detected.
This will likely lead to an error of the form "expected json but got html".
If you get spurious warnings or have suggestions on how these type-checks might be improved, please issue a feature request to the [STATcubeR bug tracker].
sc_schema()
.
But in case you are not, the r ticle("sc_schema")
contains more information
on how to get metadata from the API.r ticle("sc_data")
showcases different ways to extract data and metadata
from the return value of sc_table_custom()
.Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.