options(width = 999)
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE,
                      cache = FALSE, tidy = FALSE, size = "small")

In this vignette we provide examples of how to use entrydatar to download precise files from the QCEW.
In the second part we give rudimentary documentations of the feature that are of principal interest to work with the data. For an overview of the QCEW, head over to the BLS website

The vignette is organized as:

  1. Downloading data from the QCEW
  2. Documentation
  3. Table of data cuts
  4. What is available
  5. General documentation

Downloading data from the QCEW

To get read to download data, we load some libraries that the package might have forgotten to call (the package works with all tables in a data.table format)

library(dplyr); library(data.table); 
library(entrydatar)

General download

For example if we are interested in downloading aggregate level data we use the cut 10. Please see documentation below for a definition of what is availabe and the mapping of the different cuts of QCEW.

dt_agg <- get_qcew_cut(       
    data_cut = 10,     
    year_start = 1990, year_end =1993,    
    path_data = "~/Downloads/", write = F)
as_tibble(dt_agg)
# A tibble: 16 x 42
   area_fips own_code indus… agglv… size_…  year   qtr discl… qtrly… month… month… month… total_… taxa… qtrl… avg_… lq_d… lq_q… lq_m… lq_m… lq_m… lq_t… lq_t… lq_q… lq_a… oty_… oty_q… oty_… oty_mo… oty_m… oty_mo… oty_m… oty_mo… oty_m… oty_to… oty_t… oty_tax… oty_… oty_qt… oty_… oty_a… oty_…
   <chr>        <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <chr>   <dbl>  <dbl>  <dbl>  <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>  <dbl>    <dbl> <dbl>   <dbl> <dbl>  <dbl> <dbl>
 1 US000            0   10.0   10.0      0  1990  1.00 ""     6.02e⁶ 1.06e⁸ 1.07e⁸ 1.08e⁸ 6.23e¹¹     0     0   448    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 127332  2.20  2.09e⁶  2.00   2.12e⁶  2.00   2.03e⁶  1.90  3.91e¹⁰  6.70  -3.57e¹¹  -100 -7.56e⁹  -100  20.0   4.70
 2 US000            0   10.0   10.0      0  1990  2.00 ""     6.04e⁶ 1.08e⁸ 1.09e⁸ 1.10e⁸ 6.29e¹¹     0     0   443    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 154120  2.60  1.87e⁶  1.80   2.12e⁶  2.00   1.82e⁶  1.70  3.67e¹⁰  6.20  -1.82e¹¹  -100 -4.01e⁹  -100  19.0   4.50
 3 US000            0   10.0   10.0      0  1990  3.00 ""     6.09e⁶ 1.08e⁸ 1.09e⁸ 1.10e⁸ 6.24e¹¹     0     0   441    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 164809  2.80  1.68e⁶  1.60   1.79e⁶  1.70   1.34e⁶  1.20  3.37e¹⁰  5.70  -1.10e¹¹  -100 -2.47e⁹  -100  18.0   4.30
 4 US000            0   10.0   10.0      0  1990  4.00 ""     6.12e⁶ 1.09e⁸ 1.09e⁸ 1.09e⁸ 6.88e¹¹     0     0   483    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 149102  2.50  1.01e⁶  0.900  6.69e⁵  0.600  2.75e⁵  0.300 3.90e¹⁰  6.00  -8.46e¹⁰  -100 -1.85e⁹  -100  25.0   5.50
 5 US000            0   10.0   10.0      0  1991  1.00 ""     6.30e⁶ 1.05e⁸ 1.05e⁸ 1.06e⁸ 6.37e¹¹     0     0   465    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 284012  4.70 -1.03e⁶ -1.00  -1.51e⁶ -1.40  -1.77e⁶ -1.60  1.40e¹⁰  2.20   0           0  0          0  17.0   3.80
 6 US000            0   10.0   10.0      0  1991  2.00 ""     6.36e⁶ 1.06e⁸ 1.07e⁸ 1.08e⁸ 6.45e¹¹     0     0   463    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 317123  5.30 -1.84e⁶ -1.70  -2.05e⁶ -1.90  -2.24e⁶ -2.00  1.64e¹⁰  2.60   0           0  0          0  20.0   4.50
 7 US000            0   10.0   10.0      0  1991  3.00 ""     6.39e⁶ 1.06e⁸ 1.07e⁸ 1.08e⁸ 6.41e¹¹     0     0   462    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 303568  5.00 -2.13e⁶ -2.00  -1.93e⁶ -1.80  -2.03e⁶ -1.90  1.76e¹⁰  2.80   0           0  0          0  21.0   4.80
 8 US000            0   10.0   10.0      0  1991  4.00 ""     6.43e⁶ 1.08e⁸ 1.08e⁸ 1.08e⁸ 7.05e¹¹     0     0   503    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 317664  5.20 -1.70e⁶ -1.60  -1.61e⁶ -1.50  -1.52e⁶ -1.40  1.72e¹⁰  2.50   0           0  0          0  20.0   4.10
 9 US000            0   10.0   10.0      0  1992  1.00 ""     6.48e⁶ 1.05e⁸ 1.05e⁸ 1.06e⁸ 6.64e¹¹     0     0   486    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 176781  2.80 -5.04e⁵ -0.500 -1.48e⁵ -0.100 -1.06e⁵ -0.100 2.72e¹⁰  4.30   0           0  0          0  21.0   4.50
10 US000            0   10.0   10.0      0  1992  2.00 ""     6.50e⁶ 1.07e⁸ 1.08e⁸ 1.09e⁸ 6.72e¹¹     0     0   479    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 140624  2.20  5.39e⁵  0.500  6.40e⁵  0.600  6.08e⁵  0.600 2.69e¹⁰  4.20   0           0  0          0  16.0   3.50
11 US000            0   10.0   10.0      0  1992  3.00 ""     6.52e⁶ 1.07e⁸ 1.07e⁸ 1.08e⁸ 6.72e¹¹     0     0   482    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 131248  2.10  7.06e⁵  0.700  2.48e⁵  0.200  6.40e⁵  0.600 3.06e¹⁰  4.80   0           0  0          0  20.0   4.30
12 US000            0   10.0   10.0      0  1992  4.00 ""     6.57e⁶ 1.09e⁸ 1.09e⁸ 1.09e⁸ 7.79e¹¹     0     0   550    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 134740  2.10  1.09e⁶  1.00   1.05e⁶  1.00   1.22e⁶  1.10  7.36e¹⁰ 10.4    0           0  0          0  47.0   9.30
13 US000            0   10.0   10.0      0  1993  1.00 ""     6.62e⁶ 1.06e⁸ 1.07e⁸ 1.07e⁸ 6.67e¹¹     0     0   480    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 138525  2.10  1.43e⁶  1.40   1.70e⁶  1.60   1.69e⁶  1.60  2.54e ⁹  0.400  0           0  0          0 - 6.00 -1.20
14 US000            0   10.0   10.0      0  1993  2.00 ""     6.65e⁶ 1.09e⁸ 1.10e⁸ 1.11e⁸ 7.04e¹¹     0     0   494    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 155234  2.40  1.85e⁶  1.70   1.75e⁶  1.60   1.98e⁶  1.80  3.19e¹⁰  4.70   0           0  0          0  15.0   3.10
15 US000            0   10.0   10.0      0  1993  3.00 ""     6.70e⁶ 1.09e⁸ 1.09e⁸ 1.11e⁸ 7.11e¹¹     0     0   499    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 176159  2.70  2.15e⁶  2.00   2.31e⁶  2.20   2.35e⁶  2.20  3.86e¹⁰  5.70   0           0  0          0  17.0   3.50
16 US000            0   10.0   10.0      0  1993  4.00 ""     6.75e⁶ 1.11e⁸ 1.11e⁸ 1.12e⁸ 8.06e¹¹     0     0   556    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA 176528  2.70  2.42e⁶  2.20   2.54e⁶  2.30   2.80e⁶  2.60  2.76e¹⁰  3.50   0           0  0          0   6.00  1.10

Note that the dataset can be large. Downloading most of the industry 3 and 4 digits cuts across counties and MSAs ends up with 57mn rows. Saving it in .rds formats takes around 2gb of memory.

On the other hand for something more precise, say nationally by size cuts at the 6 digits industry level we would call the cut 28:

dt_naics <- get_qcew_cut(        
  data_cut = 28,       
  industry = "naics", 
  year_start = 1990, year_end = 2015,       
  path_data = "~/Downloads/", write = F)
as_tibble(dt_naics)
# A tibble: 208,667 x 47
   area… own_… indus… aggl… size…  year   qtr disc… area… own_… indu… aggl… size… qtrly… month… month… month… total… taxa… qtrl… avg_… lq_d… lq_q… lq_m… lq_m… lq_m… lq_t… lq_t… lq_q… lq_a… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_… oty_…
   <chr> <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1 US000  5.00 111110  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 2.16354    353    371   1.57e⁶     0     0   337    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 2 US000  5.00 111110  28.0  2.00  1990  1.00    NA    NA    NA    NA    NA    NA 9.80547    577    640   1.95e⁶     0     0   255    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 3 US000  5.00 111110  28.0  3.00  1990  1.00    NA    NA    NA    NA    NA    NA 4.00439    450    519   1.32e⁶     0     0   216    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 4 US000  5.00 111110  28.0  4.00  1990  1.00    NA    NA    NA    NA    NA    NA 1.40336    336    382   1.01e⁶     0     0   220    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 5 US000  5.00 111120  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 2.6039.0   33.0   39.0 1.30e⁵     0     0   270    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 6 US000  5.00 111120  28.0  2.00  1990  1.00    NA    NA    NA    NA    NA    NA 5.00e⁰   21.0   23.0   28.0 6.87e⁴     0     0   220    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 7 US000  5.00 111130  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 8.2076.0   73.0   85.0 3.28e⁵     0     0   324    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 8 US000  5.00 111140  28.0  1.00  1990  1.00    NA    NA    NA    NA    NA    NA 1.921369   1314   1575   5.09e⁶     0     0   276    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
 9 US000  5.00 111140  28.0  2.00  1990  1.00    NA    NA    NA    NA    NA    NA 8.50374    404    526   1.34e⁶     0     0   237    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
10 US000  5.00 111140  28.0  4.00  1990  1.00    NA    NA    NA    NA    NA    NA 8.00e⁰  193    211    224   7.45e⁵     0     0   274    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0     0
# ... with 208,657 more rows

Faster download with annual frequencies

The BLS provides quarterly as well as annual averages, which are lighter. There is now an option to download directly the annual average using a frequency option. The default is quarterly, so everything should be backward compatible. Let's see first with the quarterly data (warning it is slow):

dt_naics <- get_qcew_cut(     
  data_cut = 16,     
  industry = "naics",  frequency = "quarter",  
  year_start = 2000, year_end = 2005,    
  path_data = "~/Downloads/", write = F)
as_tibble(dt_naics)
# A tibble: 18,028 x 42
   area_fips own_code indus… agglv… size_…  year   qtr discl… qtrly_… month… month… month… total… taxa… qtrl… avg_… lq_d… lq_q… lq_m… lq_m… lq_m… lq_t… lq_t… lq_q… lq_a… oty_… oty_… oty_q… oty_mo… oty_mo… oty_mo… oty_m… oty_mo… oty_m… oty_to… oty_to… oty_… oty_… oty_… oty_… oty_av… oty_av…
   <chr>        <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <chr>    <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>   <dbl>   <dbl>  <dbl>   <dbl>  <dbl>   <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>
 1 US000         1.00   1125   16.0      0  2000  1.00 ""        2.00 7.00e⁰ 7.00e⁰ 7.00e⁰ 5.24e⁴     0     0   576    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    -  1.00 -12.5   -  1.00 -12.5  -  1.00 -12.5  -1.90- 3.50      0     0     0     0   54.0   10.3  
 2 US000         1.00   1125   16.0      0  2000  2.00 ""        2.00 7.00e⁰ 7.00e⁰ 8.00e⁰ 5.59e⁴     0     0   587    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    -  1.00 -12.5      0      0       1.00  14.3   1.26e⁴  29.0       0     0     0     0  132     29.0  
 3 US000         1.00   1125   16.0      0  2000  3.00 ""        2.00 8.00e⁰ 8.00e⁰ 8.00e⁰ 5.99e⁴     0     0   576    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0       1.00  14.3      1.00  14.3     1.00  14.3   9.7419.4       0     0     0     0   25.0    4.50 
 4 US000         1.00   1125   16.0      0  2000  4.00 ""        3.00 8.00e⁰ 8.00e⁰ 8.00e⁰ 6.42e⁴     0     0   617    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  1.00  50.0     1.00  14.3      1.00  14.3     1.00  14.3   1.34e⁴  26.4       0     0     0     0   59.0   10.6  
 5 US000         1.00   1131   16.0      0  2000  1.00 ""       78.0  4.093.923.933.66e⁷     0     0   706    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA -1.00 - 1.30 -237    - 5.50  -217    - 5.20 -282    - 6.70 -1.22e⁵ - 0.300     0     0     0     0   38.0    5.70 
 6 US000         1.00   1131   16.0      0  2000  2.00 ""       78.0  4.194.575.264.77e⁷     0     0   786    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  1.00   1.30 -315    - 7.00  -255    - 5.30  250      5.00  7.70e⁶  19.2       0     0     0     0  142     22.0  
 7 US000         1.00   1131   16.0      0  2000  3.00 ""       78.0  5.855.675.475.86e⁷     0     0   796    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    - 97.0  - 1.60  -167    - 2.90 -110    - 2.00 -1.43e⁶ - 2.40      0     0     0     0 -  1.00 - 0.100
 8 US000         1.00   1131   16.0      0  2000  4.00 ""       78.0  4.844.494.435.09e⁷     0     0   853    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA  0      0    -502    - 9.40  -146    - 3.10  205      4.80 -1.07e⁵ - 0.200     0     0     0     0   25.0    3.00 
 9 US000         1.00   2211   16.0      0  2000  1.00 ""      140    1.23e⁴ 1.23e⁴ 1.23e⁴ 1.91e⁸     0     0  1193    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA -5.00 - 3.40 -201    - 1.60  -157    - 1.30 -125    - 1.00  1.54e⁷   8.80      0     0     0     0  110     10.2  
10 US000         1.00   2211   16.0      0  2000  2.00 ""      141    1.24e⁴ 1.24e⁴ 1.24e⁴ 1.83e⁸     0     0  1138    NA  1.00  1.00  1.00  1.00  1.00     0     0  1.00    NA -2.00 - 1.40   49.0    0.400    5.00   0       6.00   0     2.54e⁷  16.1       0     0     0     0  157     16.0  
# ... with 18,018 more rows

Now let's download the data only at annual frequency (similar to the size cuts from earlier):

dt_naics_year <- get_qcew_cut(     
  data_cut = 16,     
  industry = "naics",  frequency = "year",  
  year_start = 2000, year_end = 2005,    
  path_data = "~/Downloads/", write = F)
as_tibble(dt_naics_year)
# A tibble: 4,507 x 38
   area_fips own_code industry_code agglvl_code size_code  year   qtr disclosure_code annual… annual… total_… taxab… annua… annua… avg_a… lq_di… lq_a… lq_a… lq_t… lq_t… lq_a… lq_a… lq_a… oty_… oty_… oty_a… oty_an… oty_ann… oty_to… oty_tot… oty_… oty_… oty_… oty_… oty_a… oty_a… oty_… oty_a…
   <chr>        <dbl>         <dbl>       <dbl>     <dbl> <dbl> <dbl> <chr>             <dbl>   <dbl>   <dbl>  <dbl>  <dbl>  <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>   <dbl>    <dbl>   <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl> <dbl>  <dbl>
 1 US000         1.00          1125        16.0         0  2000    NA ""                 2.00  8.00e⁰  2.32e⁵      0      0    589  30648     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0     1.00e⁰   14.3    3.38e⁴   17.0       0     0     0     0   68.0  13.1   3567  13.2 
 2 US000         1.00          1131        16.0         0  2000    NA ""                78.0   4.731.94e⁸      0      0    788  40994     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -1.56-  3.20   6.04e⁶    3.20      0     0     0     0   49.0   6.60  2547   6.60
 3 US000         1.00          2211        16.0         0  2000    NA ""               142     1.24e⁴  7.90e⁸      0      0   1222  63557     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA -2.00 - 1.40 -5.10-  0.400  8.53e⁷   12.1       0     0     0     0  136    12.5   7096  12.6 
 4 US000         1.00          2213        16.0         0  2000    NA ""                10.0   9.372.67e⁷      0      0    547  28463     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  1.00  11.1   1.1413.9   -4.72e⁶ - 15.0       0     0     0     0 -186   -25.4  -9643 -25.3 
 5 US000         1.00          2362        16.0         0  2000    NA ""                 1.00  1.00e⁰  6.73e⁴      0      0   1294  67270     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0     1.00e⁰  100      4.32e⁴  180         0     0     0     0  184    16.6   9555  16.6 
 6 US000         1.00          3231        16.0         0  2000    NA ""                 8.00  4.862.69e⁸      0      0   1063  55287     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -6.10-  1.20   8.19e⁵    0.300     0     0     0     0   16.0   1.50   852   1.60
 7 US000         1.00          3259        16.0         0  2000    NA ""                 3.00  3.101.67e⁷      0      0   1037  53945     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -2.00e⁰ -  0.600  1.30e⁶    8.40      0     0     0     0   84.0   8.80  4396   8.90
 8 US000         1.00          3321        16.0         0  2000    NA ""                 3.00  4.753.06e⁷      0      0   1241  64540     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  1.00  50.0   1.2636.1    9.43e⁶   44.5       0     0     0     0   72.0   6.20  3748   6.20
 9 US000         1.00          3329        16.0         0  2000    NA ""                10.0   1.23e⁴  6.24e⁸      0      0    976  50746     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -1.32-  9.70   5.99e⁶    1.00      0     0     0     0  104    11.9   5379  11.9 
10 US000         1.00          3364        16.0         0  2000    NA ""                 1.00  2.606.86e⁷      0      0    508  26390     NA  1.00  1.00  1.00     0     0  1.00  1.00    NA  0      0    -4.50-  1.70  -1.19e⁷ - 14.8       0     0     0     0 - 77.0 -13.2  -4048 -13.3 
# ... with 4,497 more rows

Tidy the data

The data is still raw from the BLS download. We can clean the data in two ways: a monthly dataset with employment by categories and a quarterly dataset with only number of firms. Finally we allow an option for aggregating both types. Note this function works only for the quarterly version of the data.

dt_naics_tidy <- tidy_qcew(dt_naics, frequency = "month")
as_tibble(dt_naics_tidy)
# A tibble: 54,084 x 9
    year quarter industry_code area_fips own_code size_code agglvl_code month emplvl
   <dbl>   <dbl>         <dbl> <chr>        <dbl>     <dbl>       <dbl> <dbl>  <dbl>
 1  2000    1.00          1111 US000         5.00         0        16.0  1.00  21243
 2  2000    1.00          1111 US000         5.00         0        16.0  2.00  21759
 3  2000    1.00          1111 US000         5.00         0        16.0  3.00  23600
 4  2000    2.00          1111 US000         5.00         0        16.0  4.00  26142
 5  2000    2.00          1111 US000         5.00         0        16.0  5.00  27287
 6  2000    2.00          1111 US000         5.00         0        16.0  6.00  29355
 7  2000    3.00          1111 US000         5.00         0        16.0  7.00  39098
 8  2000    3.00          1111 US000         5.00         0        16.0  8.00  33418
 9  2000    3.00          1111 US000         5.00         0        16.0  9.00  30534
10  2000    4.00          1111 US000         5.00         0        16.0 10.0   30475
# ... with 54,074 more rows

Backup: what if the BLS has changed my data!

I included an option to specify a snapshot of the data take through the url wayback machine: thanks To Gabriel Chodorow-Reich for suggesting the idea.1 Specify a suffix found from the internet wayback machine on the BLS website and add it as an argument in the download function:

# Enter the address directly from the html address
wayback_suffix <- "https://web.archive.org/web/20141101135821"  # suffix for November 1st, 2014
# Or enter the exact date at which date it has been crawled:
wayback_suffix <- 20141101135821  

# execute adding the wayback option
dt_naics <- get_qcew_cut(
  data_cut = 28,
  industry = "naics",
  year_start = 1990,  year_end = 2015,       # try not to ask for data from the future        
  url_wayback = wayback_suffix,
  path_data = "~/Downloads/", write = F)
dt_naics

Documentation

Cuts in QCEW

NAICS Aggregation levels

The table of cuts looks as the following for NAICS based industries is as follows. See the reference table and official BLS documentation for more details.

#------------------------------------------------------------------------------------------------------------------------------
#|                           | Geographic/Size   | National | National | CSA | MSA | Statewide | Statewide | County | MicroSA |
#|                           |  LevelCode Format |          |  by size |     |     |           |  by size  |        |         |
#|---------------------------|-------------------|----------|--- ------|-----|-----|-----------|-----------|--------|---------|
#| Industry Ownership Level  |                   | 1x       | 2x       | 3x  | 4x  | 5x        | 6x        | 7x     | 8x      |
#|                           |                   |          |          |     |     |           |           |        |         |
#| Total, All Ownerships     | x0                | 10       | -        | 30  | 40  | 50        | -         | 70     | 80      |
#| Total, by Ownership       | x1                | 11       | 21       | -   | 41  | 51        | 61        | 71     | -       |
#| Domain, by Ownership      | x2                | 12       | 22       | -   | 42  | 52        | 62        | 72     | -       |
#| SuperSector, by Ownership | x3                | 13       | 23       | -   | 43  | 53        | 63        | 73     | -       |
#| Sector, by Ownership      | x4                | 14       | 24       | -   | 44  | 54        | 64        | 74     | -       |
#| 3-digit, by Ownership     | x5                | 15       | 25       | -   | 45  | 55        | -         | 75     | -       |
#| 4-digit, by Ownership     | x6                | 16       | 26       | -   | 46  | 56        | -         | 76     | -       |
#| 5-digit, by Ownership     | x7                | 17       | 27       | -   | 47  | 57        | -         | 77     | -       |
#| 6-digit, by Ownership     | x8                | 18       | 28       | -   | 48  | 58        | -         | 78     | -       |
#------------------------------------------------------------------------------------------------------------------------------

SIC Aggregation levels

Be careful as the SIC aggregation levels are different than for NAICS. See the reference table and official BLS documentation for more details.

The division follows


What is available

The data file layout for a general view of what is available:

NAICS files

Some cuts are limited to an annual frequency (first quarter only) while some have a quarterly frequency (with monthly reports for employment)

SIC files

Same as for NAICS. Collections stops in 2000. Size starts only in 1997.

Availability Warning

The BLS Table can be misleading at times. And all of the files are not created equals. I am working on harmonizing all of the data pulls but there seems to be a lot of exceptions. Here are warnings I found important:

  1. There is no disaggregated data for size X industry before 1990 in the case of naics. The layout of files on the download page is somewhat misleading.
  2. For the sic by industry files the data from standard sources only goes back to 1984. I have yet to implement the function that also downloads and get the data from 1975 to 1983. They are arranged in individual industry files.
    • For the year 1983 the available cuts are: 01, 02, 03, 04, 05, 06, 13, 14, 15, 18, 19, 20, 21, 22, 23, 26, 27, 28, 29
    • For the year 1984 the available cuts are: 01, 02, 03, 04, 05, 06, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 26, 27, 28, 29, 30, 31
    • The missing cuts pre-1984 are: 16, 17 (MSA 3 and 4 digits SIC) and 30, 31 (County 3 and 4 digits SIC)

General documentation

The table of contents to download directly datasets

The data file layout for a general view of what is available:

Data codes

Aggregation levels

Size Classes

Industries

Industry titles are standard in that case

Ownership

Ownership codes go from 0 to 6:

Online docs for naics and sic

Area codes and titles (FIPS)

Structures somewhat like industry codes. 5 characters. US000 is aggregated over the total US. Then for example XXYYY can be split in two parts:

BLS documentation is available at the following links:


1: The stata version of this code is on Gabe's website here


(c) Erik Loualiche



eloualiche/entrydatar documentation built on Aug. 20, 2023, 1:56 p.m.