knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
The onetmappinguk package
This package is designed to automate the process of mapping information about occupations contained in the US Occupational Information Network (O-NET) onto the UK Standard Occupational Classification (SOC) 2010 edition. This vignette outlines the functions available in the package and how they can be used to map O-NET information onto UK SOC codes to create panel datasets of skills, abilities, and other occupational information.
The functions contained in this package were originally produced for the paper by Dickerson and Morris (2019) - The Changing Demand for Skills in the UK. Further details about the O-NET mapping process, along with an application to estimating the wage returns to skills in the UK, can be found in the paper. The paper is available at http://cver.lse.ac.uk/textonly/cver/pubs/cverdp020.pdf.
First, load the package and the other packages required to run the functions.
library(onetmappinguk)
library(dplyr)
library(readxl)
library(data.table)
There are six main phases to the processing of the data:
These phases, and their implementation using onetmappinguk, are discussed in detail below.
The O-NET data is available for download from https://www.onetcenter.org/db_releases.html. As of 30th April 2020, versions 6.0, 8.0, 10.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, and 24.0 are included in the package. Each version is used to represent one year, spanning the period 2004 to 2019. Currently (as of version 0.2.0) the package contains functions to read in the data from four of the O-NET domains: skills, abilities, work activities, and work contexts.
skills.data <- onet_skills_read()
abilities.data <- onet_abilities_read()
activities.data <- onet_activities_read()
contexts.data <- onet_contexts_read()
Mapping to the UK involves a many-to-one match between US occupations and UK occupations (over 900 data-level O-NET occupations relative to 369 UK 4-digit SOC occupations). The mapping takes an average of O-NET measures across all US occupations which map to a particular UK occupation. These averages need to be weighted by employment; however, O-NET does not provide employment figures directly according to the O-NET SOC classification.
Phase 2 therefore takes employment from the US Occupational Employment Statistics (OES) National tables, available on a yearly basis for download at https://www.bls.gov/oes/tables.htm. The regular US SOC-2010 closely resembles the O-NET SOC-2010, and so employment figures for years which use the older SOC-2000 classification are first mapped onto SOC-2010, using a crosswalk also available from the Bureau of Labor Statistics (BLS) at https://www.bls.gov/soc/soccrosswalks.htm. Many of the occupations remain unchanged, so there is a direct one-to-one mapping. Where one SOC-2000 occupation maps to more than one SOC-2010 occupation, the employment is assumed to be equally distributed among the recipient SOC-2010 occupations.
With employment for all years converted to the SOC-2010 classification, employment can be converted to the O-NET SOC system using published crosswalks. The crosswalk for SOC-2010 to O-NET SOC-2010 is available at https://www.onetcenter.org/crosswalks.html#soc. Again, the assumption is made that employment in SOC-2010 occupations is equally distributed among corresponding groups of O-NET SOC-2010 occupations.
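The equal-distribution assumption can be illustrated with a minimal base R sketch. The occupation codes, employment figures, and object names below are hypothetical and chosen only for illustration; they are not the package's internal data or functions.

```r
# Hypothetical employment totals by SOC-2000 code (illustrative values only)
emp_2000 <- data.frame(
  soc2000    = c("A", "B", "C"),
  employment = c(100, 60, 90)
)

# Hypothetical crosswalk: A maps one-to-one, B splits into two codes,
# C splits into three codes (one of which, B2, also receives part of B)
xwalk <- data.frame(
  soc2000 = c("A", "B", "B", "C", "C", "C"),
  soc2010 = c("A1", "B1", "B2", "C1", "C2", "B2")
)

# Number of SOC-2010 recipients for each SOC-2000 code
xwalk$n_targets <- ave(seq_along(xwalk$soc2000), xwalk$soc2000, FUN = length)

# Attach employment and divide it equally among the recipients
split_emp <- merge(xwalk, emp_2000, by = "soc2000")
split_emp$emp_share <- split_emp$employment / split_emp$n_targets

# Sum the shares arriving at each SOC-2010 code
emp_2010 <- aggregate(emp_share ~ soc2010, data = split_emp, FUN = sum)
names(emp_2010)[2] <- "employment"
emp_2010
```

Total employment is preserved by construction, since each source occupation's employment is fully divided among its recipients. The same logic would apply to the SOC-2010 to O-NET SOC-2010 step.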
Combining the output of the first two phases, Phase 3 deals with the issue that the data period covers four distinct standard occupational classifications: O-NET SOC 2000, 2006, 2009, and 2010. Functions in Phase 3 use crosswalk tables between the various SOC vintages, with employment totals constructed in Phase 2 as weights, to transform the Phase 1 data so that it is consistent with O-NET SOC-2010 in every year.
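As a rough illustration of employment-weighted aggregation along a crosswalk, the base R sketch below collapses several source occupations into their target occupation. The codes, scores, and column names are hypothetical, and the package's own Phase 3 functions may differ in detail.

```r
# Hypothetical O-NET scores on an older classification, with the target
# O-NET SOC-2010 code and Phase 2 style employment weights attached
onet_old <- data.frame(
  old_code   = c("old1", "old2", "old3"),
  new_code   = c("new1", "new1", "new2"),
  importance = c(4.0, 2.0, 3.0),
  employment = c(300, 100, 50)
)

# Employment-weighted mean of the score within each target occupation
by_new <- split(onet_old, onet_old$new_code)
onet_new <- data.frame(
  new_code   = names(by_new),
  importance = vapply(
    by_new,
    function(d) weighted.mean(d$importance, d$employment),
    numeric(1)
  ),
  row.names = NULL
)
onet_new
```

Here new1 receives (4.0 * 300 + 2.0 * 100) / 400 = 3.5, so the larger source occupation dominates the average. The same weighted-mean logic underlies the many-to-one mapping from US to UK occupations.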
phase4_onet(p3 = NULL)
Phase 4 takes the O-NET data panels produced in Phase 3 and deals with the issue that some occupations are rated by job incumbents and some by occupational experts/job analysts. As job incumbents may systematically produce different ratings, the data is "rescaled" in order to remove any variation in the data due to the "incumbency effect". This rescaling is performed by estimating panel regressions of the form:
$$y_{jt} = \beta INCUMBENT_{jt} + \tau_t + \mu_t + \epsilon_{jt}$$
In the equation, $y_{jt}$ is the O-NET variable (e.g. the importance measure for skill 35, writing) for occupation $j$ at time $t$. $INCUMBENT_{jt}$ is a dummy variable equal to one if the variable was rated by a job incumbent, and zero otherwise. Note that due to the reweighting exercise in Phase 3, where multiple occupations from a previous SOC map to one O-NET SOC-2010 occupation, some of these occupations may be rated by incumbents and some not. Consequently, the reweighted variable can take on values between 0 and 1. In Phase 4, any value greater than 0 is recoded to 1, so the variable indicates that the O-NET variable is at least partially rated by job incumbents.
The rescaled version, $y^*_{jt}$, is then simply:
$$y^*_{jt} = y_{jt} - \beta INCUMBENT_{jt}$$
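The regression and rescaling steps above can be sketched in base R on simulated data. This is an illustration only: the data are simulated, the variable names are hypothetical, and the fixed-effect terms are assumed here to be year and occupation dummies, which may not match the package's exact specification.

```r
set.seed(1)

# Simulated panel: 20 hypothetical occupations observed over 6 years
panel <- expand.grid(occ = paste0("occ", 1:20), year = 2004:2009)

# Share of each rating coming from incumbents after reweighting (0, 0.5 or 1)
panel$inc_share <- sample(c(0, 0.5, 1), nrow(panel), replace = TRUE)

# Recode any positive share to 1, giving the incumbent dummy
panel$INCUMBENT <- as.numeric(panel$inc_share > 0)

# Simulated O-NET variable with a true incumbency effect of 0.4
panel$y <- 3 + 0.4 * panel$INCUMBENT + rnorm(nrow(panel), sd = 0.1)

# Panel regression with year and occupation dummies (an assumption of this sketch)
fit <- lm(y ~ INCUMBENT + factor(year) + factor(occ), data = panel)
beta_hat <- coef(fit)[["INCUMBENT"]]

# Rescaled variable: remove the estimated incumbency effect
panel$y_star <- panel$y - beta_hat * panel$INCUMBENT
```

Observations rated only by analysts are left unchanged, while those with any incumbent input are shifted by the estimated effect.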
The individual functions which construct the incumbent dummy, run the regressions, and rescale the variables included in the phase 4 wrapper function are:
rescale_activities(p3 = NULL)
rescale_skills(p3 = NULL)
rescale_abilities(p3 = NULL)
Each function takes the same single argument: the list object created by phase3_onet().
The function constructs three summary indices of skills from the raw data items:
Two methods of aggregation are used. First, a simple mean of the respective elements from the importance dimension is taken. Second, the constituent elements of each skill are summed, where each element is a Cobb-Douglas weighted mean of importance and levels information. The default weight is 2/3 for importance and 1/3 for levels, though the importance weight can be varied in the function, with the levels weight automatically adjusted so that the weights sum to one.
Firpo, Fortin, and Lemieux (2011) index of offshorability
This index uses work context information as well as the work activities information used by Jensen and Kletzer (2010). Activity importance and level measures are combined using Cobb-Douglas weights of 2/3 and 1/3 respectively. Contexts are simply added.
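The two aggregation methods can be sketched as follows. The functional form importance^w * level^(1 - w) is an assumption about what "Cobb-Douglas weighted" means here, and the function name, argument names, and data values are hypothetical rather than taken from the package.

```r
# Cobb-Douglas combination of importance and level for one or more elements.
# Only the importance weight is supplied; the level weight is its complement.
cobb_douglas <- function(importance, level, w_importance = 2 / 3) {
  stopifnot(w_importance >= 0, w_importance <= 1)
  w_level <- 1 - w_importance
  importance^w_importance * level^w_level
}

# Hypothetical constituent elements of one skill index
items <- data.frame(
  importance = c(4, 3, 5),
  level      = c(5, 2, 4)
)

# Method 1: simple mean of the importance scores
simple_mean <- mean(items$importance)

# Method 2: sum of the Cobb-Douglas weighted elements
cd_index <- sum(cobb_douglas(items$importance, items$level))
```

Setting `w_importance = 1` in this sketch reduces each element to its importance score alone, which shows how the levels weight follows automatically from the importance weight.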
Information Content
Automation/Routinisation
Face-to-face Contact
Decision Making
On-site Nature of Job
Once constructed, each measure is normalised to a maximum value of 1.
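One plausible reading of this normalisation is division by the largest observed value, sketched below. The helper name and the values are hypothetical, and the package may implement the step differently (for example with a min-max rescaling).

```r
# Scale a measure so that its largest value equals 1 (assumes positive values)
normalise_max <- function(x) {
  x / max(x, na.rm = TRUE)
}

# Hypothetical raw scores for one offshorability component
face_to_face <- c(12.5, 8.0, NA, 10.0)
face_to_face_norm <- normalise_max(face_to_face)
face_to_face_norm
```

Missing values are ignored when finding the maximum and remain missing in the result.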
library(onetmappinguk)
library(tidyverse)
library(ggthemes)

# Run the full pipeline and extract the final (Phase 6) dataset
onet.data <- onet_function()
final.data <- onet.data$p6

# Standardise each skill index, then take yearly means
final.data <- final.data %>%
  mutate(smI_Analytical = (mI_Analytical - mean(mI_Analytical, na.rm = TRUE)) / sd(mI_Analytical, na.rm = TRUE)) %>%
  mutate(smI_Interpersonal = (mI_Interpersonal - mean(mI_Interpersonal, na.rm = TRUE)) / sd(mI_Interpersonal, na.rm = TRUE)) %>%
  mutate(smI_Technical = (mI_Technical - mean(mI_Technical, na.rm = TRUE)) / sd(mI_Technical, na.rm = TRUE)) %>%
  group_by(year) %>%
  mutate(mean_A = mean(smI_Analytical, na.rm = TRUE)) %>%
  mutate(mean_I = mean(smI_Interpersonal, na.rm = TRUE)) %>%
  mutate(mean_T = mean(smI_Technical, na.rm = TRUE)) %>%
  select(c(year, mean_A, mean_T, mean_I)) %>%
  distinct()

final.data <- as.data.frame(final.data)

# Reshape to long format: one row per year and skill
skills_plot <- reshape(final.data,
                       idvar = "year",
                       direction = "long",
                       varying = c("mean_A", "mean_T", "mean_I"),
                       v.names = "mean",
                       timevar = "skill")
skills_plot$skill <- factor(skills_plot$skill,
                            levels = c(1:3),
                            labels = c("Analytical", "Technical", "Interpersonal"))

# Plot the mean standardised skills over time
ggplot(data = skills_plot) +
  aes(x = year, y = mean, color = skill, shape = skill) +
  geom_line() +
  geom_point(size = 2.5) +
  geom_hline(yintercept = 0, color = "black") +
  scale_color_colorblind() +
  labs(title = "Change in Mean Standardised Skills over Time",
       x = "Year",
       y = "Standardised Skill",
       color = "Skill:",
       shape = "Skill:") +
  theme_minimal() +
  scale_x_continuous(breaks = seq(2004, 2019, 1), minor_breaks = NULL) +
  scale_y_continuous(breaks = seq(-0.8, 0.4, 0.1), minor_breaks = NULL) +
  theme(legend.position = "bottom",
        legend.title = element_text(size = 14),
        legend.text = element_text(size = 14))
Dickerson, A and Morris, D (2019) "The Changing Demand for Skills in the UK" CVER Discussion Paper Series, DP020
Firpo, S, Fortin, N and Lemieux, T (2011) "Occupational Tasks and Changes in the Wage Structure" IZA Discussion Paper 5542. IZA, Bonn
Jensen, J and Kletzer, L (2010) "Measuring Tradable Services and the Task Content of Offshorable Services Jobs" in: Labor in the New Economy, pp 309-335