knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE,
  message = FALSE
)
library(knitr)
library(kableExtra)
library(magrittr)
#library(httptest)

#root <- klassR:::GetBaseUrl()
#set_redactor(function (response) {
#    response %>%
#        gsub_response(root, "", fixed=TRUE)
#})

#set_requester(function (request) {
#    request %>%
#        gsub_request(root, "", fixed=TRUE)
#})

#start_vignette("klassR-vignette")

1. Introduction

Do you have a Norwegian data set with Standard Industrial Classification codes and want to find out what they mean? Or data with Norwegian municipality numbers but no names? Or perhaps you want to convert English standard occupation names into Nynorsk for a figure? These are tasks that the R package klassR can help you with.

Statistics Norway's KLASS is a central database of classifications and code lists. An API makes it easy to fetch these standards in different computing environments. klassR provides an easy interface to fetch and apply these in R.

For Statistics Norway employees, the package is installed on most of our platforms. For others, it can be installed from CRAN with:

install.packages("klassR")

CRAN is R's central repository for thousands of useful packages. More information on the requirements for klassR can be found on CRAN.

To use the functions in klassR, the package must be loaded each time a new R session is started. This can be done using:

library(klassR)

2. Search for a classification

To fetch a classification from KLASS you need the unique classification number. This can be found in the URL of the KLASS website or you can search for it in R using one of the following functions.

List all classifications

The function ListKlass will fetch a list of all classifications. It returns the classification name (klass_name), number (klass_nr) and the classification family it belongs to (klass_family). The classification type (klass_type) is also shown which indicates whether it is a classification or code list.

ListKlass()
all <- ListKlass()
row.names(all) <- NULL
knitr::kable(head(all))
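
Once the list is held in a data frame, the classification number can be looked up by name with ordinary subsetting. The following is a minimal sketch that runs offline: it uses a small hand-made data frame with the column names described above in place of a live call to ListKlass(). The names and families in it are invented for illustration, and find_klass_nr is a hypothetical helper, not part of klassR.

```r
# Toy stand-in for the output of ListKlass(); the rows are hand-made.
klass_list <- data.frame(
  klass_name   = c("Standard Industrial Classification",
                   "Classification of municipalities",
                   "Classification of regions"),
  klass_nr     = c("6", "131", "106"),
  klass_family = c("Industry", "Region", "Region"),
  klass_type   = "classification",
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of klassR): case-insensitive match on the name.
find_klass_nr <- function(klass_list, pattern) {
  hits <- grepl(pattern, klass_list$klass_name, ignore.case = TRUE)
  klass_list[hits, c("klass_name", "klass_nr")]
}

found <- find_klass_nr(klass_list, "municipal")
print(found)
```

With a live connection, the same helper could be given the result of ListKlass() directly.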

Code lists are classifications that are used for national and internal (Statistics Norway) publications. These can be included in the list using the codelists parameter:

ListKlass(codelists = TRUE)
ck <- ListKlass(codelists = TRUE)
row.names(ck) <- NULL
kable(head(ck), align = "l") %>%
  kable_styling(full_width = TRUE)

Search for a classification using a keyword

You can also search for a classification by a keyword using the SearchKlass function. The first parameter here is the query to search for.

SearchKlass(query = "ARENA")
kable(SearchKlass(query = "ARENA"))

Again, to include code lists in the search, this should be specified:

SearchKlass(query = "ARENA", codelists = TRUE)
kable(SearchKlass(query = "ARENA", codelists = TRUE))

Sometimes a classification or code list will appear several times. This is because it occurs several times in the database, in different languages.
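
If only one row per classification is wanted, the repeats can be dropped by classification number. This is a small offline sketch, assuming the search result has the same klass_name and klass_nr columns as the output of ListKlass; the data frame below is hand-made and its names are placeholders.

```r
# Toy stand-in for a search result in which one classification appears twice,
# once per language. The names are placeholders.
hits <- data.frame(
  klass_name = c("Example classification (English title)",
                 "Example classification (Norwegian title)",
                 "Another classification"),
  klass_nr   = c("6", "6", "131"),
  stringsAsFactors = FALSE
)

# Keep the first row for each classification number.
unique_hits <- hits[!duplicated(hits$klass_nr), ]
print(unique_hits)
```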

3. Fetch a classification

To fetch a complete classification, use the GetKlass function together with the unique identifier. For example, to fetch the Standard Industrial Classifications (KLASS number 6) we run:

industry <- GetKlass(6)
head(industry)
kable(head(GetKlass(6)))

Level

Classifications are often organised in a hierarchical way. In the example above, the Standard Industrial Classifications have different values for level. To fetch a specific level, use the output_level parameter. For example, to fetch only the top-level Standard Industrial Classification codes we use:

industry <- GetKlass(6, output_level = 1)
head(industry)
kable(head(GetKlass(6, output_level = 1)))

Language

In the above examples we have seen that the names are returned in Norwegian (Bokmål). However, many of the classifications in KLASS are available in multiple languages. The output language can be specified as Bokmål ("nb"), Nynorsk ("nn") or English ("en") using the language parameter. Note: not all three languages are available for all classifications.

industry <- GetKlass(6, output_level = 1, language = "en")
head(industry)
kable(head(GetKlass(6, output_level = 1, language = "en")))

Output format

The standard output style is 'long', where all levels of the classification are listed one below the other. An alternative format can be chosen using the parameter output_style = "wide". This gives only one row per detailed classification code, with the codes and names of the higher (broader) levels given as variables.

industry <- GetKlass(6, output_style = "wide", language = "en")
head(industry, 2)
kable(head(GetKlass(6, output_style = "wide", language = "en"),2))

Notes

Some classifications have additional notes that can be fetched with the classification. These can be included in the data using the option notes = TRUE.

industry <- GetKlass(6, notes = TRUE)
head(industry, 2)
kable(head(GetKlass(6, notes = TRUE), 2))

4. Applying a classification

If you have a data set and want to apply a classification to a variable, you can do this with ApplyKlass. For example, it can be used to get the names for a variable that is stored as codes.

There is a built-in test data set in klassR called klassdata. It contains fictitious persons with sex, education level, municipality number, industry classification for workplace, and occupation.

data(klassdata)
head(klassdata)
data(klassdata)
kable(head(klassdata))

We can use ApplyKlass to create a variable for the municipality names (classification number 131) for the persons based on the codes. We specify the vector of codes as the first parameter followed by the unique classification number.

klassdata$kommune_names <- ApplyKlass(klassdata$kommune, 
                                      klass = 131)
head(klassdata)
klassdata$kommune_names <- ApplyKlass(klassdata$kommune, 
                                      klass = 131,
                                      date="2016-01-01")
kable(head(klassdata))

Again, the language and output_level can be specified.
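
Conceptually, applying a classification is a lookup of each code in a code list. The sketch below illustrates that idea offline with base R's match(); it is an illustration only, not klassR's implementation. The code list is hand-made, and the column names code and name are assumptions about the shape of a fetched classification. It also shows one way to spot codes that have no match, which come back as NA.

```r
# Hand-made code list; columns `code` and `name` are assumed for illustration.
code_list <- data.frame(
  code = c("0301", "1103", "1201"),
  name = c("Oslo", "Stavanger", "Bergen"),
  stringsAsFactors = FALSE
)

# Codes are kept as character so that leading zeros ("0301") are preserved.
kommune <- c("1201", "0301", "9999", "0301")

# Look up the position of each code in the code list, then pick the name.
idx <- match(kommune, code_list$code)
kommune_names <- code_list$name[idx]

# Codes without a match give NA; list them for inspection.
unmatched <- unique(kommune[is.na(idx)])

print(kommune_names)
print(unmatched)
```

Unmatched codes often point to a code list from the wrong date or to codes that were read in as numbers and lost their leading zeros.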

5. Working with dates

Classifications will often change over time. The KLASS database takes this into account, and older versions of a classification can be fetched using the date parameter.

Specify a specific date

Fetching or using a classification as it was at a specific point in time can be done using the date parameter, giving the date on which the required version of the classification applies. The date should be a string in the form "yyyy-mm-dd", for example "2022-05-27" for 27 May 2022.
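
If the date is held as a Date object or in another text format, it can be converted to this form with base R before it is passed on. A minimal sketch follows; as_klass_date is a hypothetical helper name, not part of klassR.

```r
# Hypothetical helper (not part of klassR): format a date as "yyyy-mm-dd".
as_klass_date <- function(x) format(as.Date(x), "%Y-%m-%d")

# From a date written as day.month.year
d <- as.Date("27.05.2022", format = "%d.%m.%Y")
date_string <- as_klass_date(d)
print(date_string)

# A string that does not parse as a valid "yyyy-mm-dd" date gives NA,
# which can be used as a simple check before making a request.
is_valid   <- !is.na(as.Date("2022-05-27", format = "%Y-%m-%d"))
is_invalid <- is.na(as.Date("2022-13-01", format = "%Y-%m-%d"))
```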

There have been many changes to the regions in Norway (classification number 106) over the past few years. We can see this by fetching the classification at different times:

GetKlass(106, date = "2019-01-01")
kable(GetKlass(106, date = "2019-01-01"))
GetKlass(106, date = "2020-01-01")
kable(GetKlass(106, date = "2020-01-01"))

Time intervals

Sometimes it may be useful to fetch all codes over a period of time. We can do this by specifying two dates as a vector in the date parameter.

The following code fetches Norwegian regional codes valid between 1 January 2019 and 1 January 2020. There are 26 different codes that show both old and newer names.

GetKlass(106, date = c("2019-01-01", "2020-01-01"))
kable(GetKlass(106, date = c("2018-01-01", "2020-01-01")))

Changes in time

To fetch only the changes in a time period rather than all codes, we can specify correspond = TRUE along with the time interval we are interested in.

GetKlass(106, 
         date = c("2020-01-01", "2019-01-01"), 
         correspond = TRUE)
kable(GetKlass(106, date = c("2020-01-01", "2019-01-01"), 
                    correspond = TRUE))

The table returned shows the correspondence between codes and/or names over the time interval specified. The sourceCode and sourceName refer to the original code and name. The targetCode and targetName refer to the newer code and name. Notice that there is not a simple 1:1 correspondence between all of the regions. Here the municipality number would be needed to map the changes more accurately.
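
To see where a correspondence is not 1:1, the rows per source code and per target code can be counted. The sketch below runs offline on a hand-made table that uses the four column names described above; the codes and names in it are invented and do not reflect the real regional changes.

```r
# Hand-made correspondence table with invented codes and names.
changes <- data.frame(
  sourceCode = c("A1", "A2", "A3", "A3"),
  sourceName = c("Old region 1", "Old region 2", "Old region 3", "Old region 3"),
  targetCode = c("B1", "B1", "B2", "B3"),
  targetName = c("New region 1", "New region 1", "New region 2", "New region 3"),
  stringsAsFactors = FALSE
)

# Source codes that map to more than one target (a split).
targets_per_source <- table(changes$sourceCode)
split_codes <- names(targets_per_source)[as.vector(targets_per_source > 1)]

# Target codes that receive more than one source (a merge).
sources_per_target <- table(changes$targetCode)
merged_codes <- names(sources_per_target)[as.vector(sources_per_target > 1)]

print(split_codes)
print(merged_codes)
```

Split codes are the ones that cannot be updated from the code alone and need extra information, such as the municipality number mentioned above.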

Future classification

Classifications that become valid in the future are also included in KLASS. They can be fetched by specifying a future date. A message will be shown to indicate that this is a future classification. No additional parameters need to be specified.

6. Correspondence tables

In addition to small changes over time, some classifications will change completely, and a correspondence table is then defined within the KLASS database. These can be fetched or applied using the GetKlass and ApplyKlass functions together with the correspond parameter, which should be set to the unique number of the classification to convert to.

To fetch a correspondence table between municipality codes (131) and greater regional codes (106) we can run:

GetKlass(131, correspond = 106, date = "2020-01-01")
tt <- GetKlass(106, correspond = 131, date = "2020-01-01")
navn <- names(tt)
tt <- tt[, c(3,4,1,2)]
names(tt) <- navn
kable(head(tt))

We can apply this correspondence between municipality and region in our example data set using ApplyKlass.

klassdata$region <- ApplyKlass(klassdata$kommune, 
                               klass = 131,
                               correspond = 106,
                               date = "2016-01-01")
klassdata
tt <- GetKlass(106, correspond = 131, date = "2016-01-01")
navn <- names(tt)
tt <- tt[, c(3,4,1,2)]
names(tt) <- navn

m <- match(klassdata$kommune, tt$sourceCode)

klassdata$region <- tt$targetName[m] 
kable(head(klassdata))

7. Variants

It is also possible to fetch a variant of a classification. You need to provide both the classification number and the variant number.

GetKlass(klass = 6, variant = 1616, date = "2021-01-02")
kable(head(GetKlass(klass = 6, variant = 1616, date = "2021-01-02")))


statisticsnorway/klassR documentation built on Feb. 4, 2024, 4:09 a.m.