knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE,
  message = FALSE
)
library(knitr)
library(kableExtra)
library(magrittr)
#library(httptest)
#root <- klassR:::GetBaseUrl()
#set_redactor(function (response) {
#  response %>%
#    gsub_response(root, "", fixed=TRUE)
#})
#set_requester(function (request) {
#  request %>%
#    gsub_request(root, "", fixed=TRUE)
#})
#start_vignette("klassR-vignette")
Do you have a Norwegian data set with Standard Industrial Classification codes and want to find out what they mean? Or data with Norwegian municipality numbers but no names? Or perhaps you want to convert English standard occupation names into Nynorsk for a figure. These are tasks that the R package klassR can help you with.
Statistics Norway's KLASS is a central database of classifications and code lists. An API makes it easy to fetch these standards in different computing environments. klassR provides an easy interface to fetch and apply these in R.
For Statistics Norway employees, the package is installed on most of our platforms. For others, it can be installed from CRAN with:
install.packages("klassR")
CRAN is R's central repository for thousands of useful packages. More information on the requirements for klassR can be found on CRAN.
To use the functions in klassR, the package must be loaded each time a new R session is started. This can be done using:
library(klassR)
To fetch a classification from KLASS you need the unique classification number. This can be found in the URL of the KLASS website or you can search for it in R using one of the following functions.
The function ListKlass will fetch a list of all classifications. It returns the classification name (klass_name), number (klass_nr) and the classification family it belongs to (klass_family). The classification type (klass_type) is also shown, which indicates whether it is a classification or code list.
ListKlass()
all <- ListKlass()
row.names(all) <- NULL
knitr::kable(head(all))
Code lists are classifications that are used for national and internal (Statistics Norway) publications. These can be included in the list using the codelists parameter.
ListKlass(codelists = TRUE)
ck <- ListKlass(codelists = TRUE)
row.names(ck) <- NULL
kable(head(ck), align = "l") %>%
  kable_styling(full_width = T)
You can also search for a classification by a keyword using the SearchKlass function. The first parameter here is the query to search for.
SearchKlass(query = "ARENA")
kable(SearchKlass(query = "ARENA"))
Again, to include code lists in the search, this must be specified:
SearchKlass(query = "ARENA", codelists = TRUE)
kable(SearchKlass(query = "ARENA", codelists = TRUE))
Sometimes a classification or code list will appear several times. This is because it is stored several times in the database, once for each language.
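When duplicates from different language versions get in the way, one option is to keep a single row per classification number. The sketch below uses base R on a small hand-made data frame rather than real API output. It assumes the search result carries the klass_name and klass_nr columns described above, and the names and numbers shown are placeholders.

```r
# Hand-made stand-in for a search result; the names and numbers are
# placeholders, not real KLASS entries.
hits <- data.frame(
  klass_name = c("Example list (nb)", "Example list (en)", "Another list (nb)"),
  klass_nr = c("10", "10", "20"),
  stringsAsFactors = FALSE
)

# Keep only the first row for each classification number.
unique_hits <- hits[!duplicated(hits$klass_nr), ]
unique_hits
```

Which language version survives depends on the row order, so sort the table first if a particular language is preferred.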
To fetch a complete classification, use the GetKlass function together with the unique identifier. For example, to fetch the Standard Industrial Classifications (KLASS number 6) we run:
industry <- GetKlass(6)
head(industry)
kable(head(GetKlass(6)))
Classifications are often organised in a hierarchical way. In the example above, the Standard Industrial Classifications have different values for level. To fetch a specific level, use the output_level parameter. For example, to fetch only the top level Standard Industrial Classification codes we use:
industry <- GetKlass(6, output_level = 1)
head(industry)
kable(head(GetKlass(6, output_level = 1)))
In the above examples we have seen that the names are returned in Norwegian (Bokmål). However, many of the classifications in KLASS are available in multiple languages. The output language can be specified as Bokmål ("nb"), Nynorsk ("nn") or English ("en") using the language parameter. Note: not every classification is available in all 3 languages.
industry <- GetKlass(6, output_level = 1, language = "en")
head(industry)
kable(head(GetKlass(6, output_level = 1, language = "en")))
The standard output style is 'long', where all levels of the classification are listed as separate rows. An alternative format can be chosen using the parameter output_style='wide'. This will give only one row per detailed classification, with the codes and names of the higher/broader levels given as variables.
industry <- GetKlass(6, output_style = "wide", language = "en")
head(industry, 2)
kable(head(GetKlass(6, output_style = "wide", language = "en"),2))
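To make the difference between the two styles concrete, the following base R sketch turns a tiny two-level long table into a wide one. It is only an illustration of the idea, not how klassR builds its output. The data are made up, and the column names (code, parentCode, level, name in the long table; code1, name1, code2, name2 in the wide table) are assumptions chosen for the example.

```r
# Made-up two-level classification in long form: one row per code.
long <- data.frame(
  code = c("A", "01", "02"),
  parentCode = c(NA, "A", "A"),
  level = c("1", "2", "2"),
  name = c("Top group", "First detail", "Second detail"),
  stringsAsFactors = FALSE
)

# Split by level and rename so each level gets its own pair of columns.
top <- long[long$level == "1", c("code", "name")]
names(top) <- c("code1", "name1")

detail <- long[long$level == "2", c("code", "name", "parentCode")]
names(detail) <- c("code2", "name2", "code1")

# Join each detailed code to its parent: one row per detailed code.
wide <- merge(top, detail, by = "code1")
wide
```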
Some classifications have additional notes that can be fetched with the classification. These can be included in the data using the option notes = TRUE.
industry <- GetKlass(6, notes = TRUE)
head(industry, 2)
kable(head(GetKlass(6, notes=T), 2))
If you have a data set and want to apply a classification to a variable, you can do this with ApplyKlass. For example, it can be used to look up the names for a variable that is stored as codes.
There is a built-in test data set in klassR called klassdata. It contains fictitious persons with sex, education level, municipality numbers, industry classification for workplace and occupation.
data(klassdata)
head(klassdata)
data(klassdata)
kable(head(klassdata))
We can use ApplyKlass to create a variable for the municipality names (classification number 131) for the persons based on the codes. We specify the vector of codes as the first parameter followed by the unique classification number.
klassdata$kommune_names <- ApplyKlass(klassdata$kommune, klass = 131)
head(klassdata)
klassdata$kommune_names <- ApplyKlass(klassdata$kommune, klass = 131, date = "2016-01-01")
kable(head(klassdata))
Again, the language and output_level parameters can be specified.
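Conceptually, applying a classification is a lookup of each code in a code/name table. The base R sketch below shows that idea with match() on a small hand-typed table. It is not klassR's implementation, and the table is typed in by hand rather than fetched from KLASS. It also shows a common pitfall: municipality codes that have been read as numbers lose their leading zero and need padding before they can be matched.

```r
# Hand-typed lookup table for illustration only (not fetched from KLASS).
lookup <- data.frame(
  code = c("0301", "1103", "5001"),
  name = c("Oslo", "Stavanger", "Trondheim"),
  stringsAsFactors = FALSE
)

# Codes read as numbers lose their leading zero: "0301" becomes 301.
# 9999 is a deliberately unknown code.
kommune_numeric <- c(301, 1103, 9999)

# Restore the four-character form before matching.
kommune_chr <- formatC(kommune_numeric, width = 4, flag = "0", format = "d")

# match() gives the row of each code in the lookup; unknown codes become NA.
kommune_names <- lookup$name[match(kommune_chr, lookup$code)]
kommune_names
```

Counting the NA values in the result, for example with sum(is.na(kommune_names)), is a quick way to spot codes that did not match.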
Classifications will often change over time. The KLASS database accounts for this, and older versions of a classification can be fetched using the date parameter.
To fetch or apply a classification as it was at a specific point in time, pass the date on which that version applies. The date should be in the form "yyyy-mm-dd", for example "2022-05-27" for 27 May 2022.
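If dates come from user input or another data source, it can help to check the format before calling the API. The helper below is a hypothetical base R sketch (is_klass_date is not part of klassR) that accepts only strings in the "yyyy-mm-dd" form that are also real calendar dates.

```r
# Hypothetical helper, not part of klassR: TRUE for strings in
# "yyyy-mm-dd" form that are also valid calendar dates.
is_klass_date <- function(x) {
  # Check the shape first, because as.Date() is lenient about digit counts.
  ok_shape <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)
  # as.Date() with an explicit format returns NA for impossible dates.
  parsed <- as.Date(x, format = "%Y-%m-%d")
  ok_shape & !is.na(parsed)
}

is_klass_date(c("2022-05-27", "27-05-2022", "2022-02-30"))
```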
There have been many changes to the regions in Norway (classification number 106) over the past few years. We can see this by fetching the classification at different dates:
GetKlass(106, date = "2019-01-01")
kable(GetKlass(106, date = "2019-01-01"))
GetKlass(106, date = "2020-01-01")
kable(GetKlass(106, date = "2020-01-01"))
Sometimes it may be useful to fetch all codes valid over a period of time. We can do this by specifying two dates as a vector in the date parameter.
The following code fetches Norwegian regional codes valid between 1 January 2019 and 1 January 2020. There are 26 different codes, which show both old and newer names.
GetKlass(106, date = c("2019-01-01", "2020-01-01"))
kable(GetKlass(106, date = c("2018-01-01", "2020-01-01")))
To fetch only the changes in a time period rather than all codes, we can specify correspond=TRUE along with the time interval we are interested in.
GetKlass(106, date = c("2020-01-01", "2019-01-01"), correspond = TRUE)
kable(GetKlass(106, date = c("2020-01-01", "2019-01-01"), correspond = TRUE))
The table returned is a correspondence table for the codes and/or names that changed in the time interval specified. The sourceCode and sourceName refer to the original code and name. The targetCode and targetName refer to the newer code and name. Notice that there is not a simple 1:1 correspondence between all of the regions. Here the municipality number would be needed to map the changes more accurately.
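One way to see where a correspondence is not 1:1 is to count the distinct targets per source code and the distinct sources per target code. The base R sketch below does this on a made-up table. The sourceCode and targetCode column names follow the description above, while the codes themselves are placeholders rather than real KLASS values.

```r
# Made-up correspondence table; the codes are placeholders.
corr <- data.frame(
  sourceCode = c("A1", "A2", "A3", "A3"),
  targetCode = c("B1", "B1", "B2", "B3"),
  stringsAsFactors = FALSE
)

# Number of distinct target codes for each source code.
targets_per_source <- tapply(corr$targetCode, corr$sourceCode,
                             function(x) length(unique(x)))
# Source codes that were split across several targets.
split_codes <- names(targets_per_source)[targets_per_source > 1]

# Number of distinct source codes for each target code.
sources_per_target <- tapply(corr$sourceCode, corr$targetCode,
                             function(x) length(unique(x)))
# Target codes that several sources were merged into.
merged_codes <- names(sources_per_target)[sources_per_target > 1]

split_codes
merged_codes
```

Merged codes can still be mapped forward without extra information; split codes are the ones that need a finer key, such as the municipality number, to resolve.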
Classifications that become valid in the future are also included in KLASS. They can be fetched by specifying the future date. A message will be shown to indicate that this is a future classification. No additional parameters need to be specified.
In addition to small changes over time, some classifications will change completely, and a correspondence table is then defined within the KLASS database. These can be fetched or applied using the GetKlass and ApplyKlass functions together with the correspond parameter, which should be set to the unique number of the classification to convert to.
To fetch a correspondence table between municipality codes (131) and greater regional codes (106) we can run:
GetKlass(131, correspond = 106, date = "2020-01-01")
tt <- GetKlass(106, correspond = 131, date = "2020-01-01")
navn <- names(tt)
tt <- tt[, c(3,4,1,2)]
names(tt) <- navn
kable(head(tt))
We can apply this correspondence between municipality and region in our example data set using ApplyKlass.
klassdata$region <- ApplyKlass(klassdata$kommune, klass = 131, correspond = 106, date = "2016-01-01")
klassdata
tt <- GetKlass(106, correspond = 131, date = "2016-01-01")
navn <- names(tt)
tt <- tt[, c(3,4,1,2)]
names(tt) <- navn
m <- match(klassdata$kommune, tt$sourceCode)
klassdata$region <- tt$targetName[m]
kable(head(klassdata))
It is also possible to fetch a variant of a classification. You need to provide both the classification number and the variant number.
GetKlass(klass = 6, variant = 1616, date = "2021-01-02")
kable(head(GetKlass(klass = 6, variant = 1616, date = "2021-01-02")))