Labour For Survey data set: I think there is a row for each recipient and the following columns: INDC07M and INDC07S will contain a sic code to flag the recipient is employed in that industry. INECAC05 and SECJMBR flag whether the job is employed (1) or self-employed (2), for main and second jobs respectively. There are various categories like sex, age nationality, etc for the different breakdowns. for all the industry/occupation sic codes, they only appear when row/participant is employed in that sic
INDC07M: industry sic codes (4 digit) to flag the recipient has main employment in that industry INDC07S: industry sic codes (4 digit) to flag the recipient has second employment in that industry SOC10M: occupation code main job SOC10S: occupation code second job INECAC05: main job - 1 for employee, 2 for self employed SECJMBR: second job - 1 for employee, 2 for self employed PWTA16: population weighting INDSC07M:industry subclass sic codes (5 digit) INDSC07S:industry subclass sic codes (5 digit) SEX: code for gender categories ETHUK11: code for ethnicity categories
DCMS_main: flag if INDC07M is one of our sics DCMS_second: flag if INDC07S is one of our sics
INDC07M INDC07S SOC10M SOC10S INECAC05 SECJMBR PWTA16 INDSC07M INDSC07S SEX ETHUK11 1 6622 1115 1 5 260 66220 1 1 2 25 382 2 1 3 4778 9233 1 5 371 47789 2 1 4 34 304 2 1 5 8020 7211 1 5 476 80200 2 1 6 7112 2126 1 5 380 71129 1 1
DCMS_main and DCMS_second are then set to 1 if these sic flags exist AND IS IN OUR LIST, and then weighted up with PWTA16 for the general population.
with weighted counts: INDC07M INDC07S SOC10M SOC10S INECAC05 SECJMBR PWTA16 INDSC07M INDSC07S SEX dcms_code ETHUK11 DCMS_main DCMS_second 1 6622 1115 1 5.00 260 66220 1 1 Whilte 0 0 2 25 NA 382 2 1 Whilte 0 0 3 4778 9233 1 5.00 371 47789 2 1 Whilte 0 0 4 34 NA 304 2 1 Whilte 0 0 5 8020 7211 1 5.00 476 80200 2 1 Whilte 0 0 6 7112 2126 1 5.00 380 71129 1 1 Whilte 0 0
For each breakdown we then sum DCMS_main and DCMS_second (after weighting) by the cat and either INDC07M or INDC07S (short SICs)(and rename them to SIC), to get our 4 columns, which we then join together: # A tibble: 6 x 9 # Groups: SIC [2] SIC ETHUK11 M_E_DCMS M_SE_DCMS S_E_DCMS S_SE_DCMS 1 1820 Asian / Asian British 545 0 0 0 2 1820 Whilte 45425 4543 0 0 3 2611 Asian / Asian British 5435243 453245 0 0 4 2611 Black / African / Caribbean / Black British 5454 0 0 0 5 2611 Mixed / Multiple ethnic groups 4545 0 0 0 6 2611 Whilte 234545 6565 0 134
M_E_DCMS: Main job, employee - all jobs - weighted. (INECAC05 == 1 & DCMS_main == 1) M_SE_DCMS: Main job, self employed - all jobs - weighted. (INECAC05 == 2 & DCMS_main == 1) S_E_DCMS: Second job, employee - all jobs - weighted. (SECJMBR == 1 & DCMS_second == 1) S_SE_DCMS: Second job, self-employed - all jobs - weighted. (SECJMBR == 2 & DCMS_second == 1)
for some stuff: Then the above tables are joined on to the table format where there is a column of (4digit) sics, and then a flag for each sector, but the columns added are only counts for employed, self-employed, and total. does not differentiate between main and second job.
We then simply join all these tables together into a very wide table with first the column of sics, then columns for all age levels for employed, then self-employed, then total, then similarly for the ethnicity levels etc. so for each category n columns = levels * 3
for category breakdowns: make a category and sic string lookup 19-251820 in spss excel output then lookup this to append to a different sector * table for each category which each has a sector * e/se. these summary tables are then used to compile a sector * e/se + cat table, or whatever the final table format is.
the main excel file has the normal time series table for total jobs, index; then split by employed, self-employed and for each sub sector by uk region. then there is threeway category + employed/self-employed * sector, with slight variations like different orientations or not splitting e/se. Then there is an excel file for each subsector, similar to above but the category breakdowns are time series, mostly the same format, with no e/se split. then there is the individual classification which is just time series for each sic finally, there is 4 tables of breakdowns by occupation.
G:\Economic Estimates\2017 publications\July publication\Final documents to publish
simply has all the excel files which are online, I believe the values are copy paste valued from:
G:\Economic Estimates\Employment - Helen\2016\2016 Main Workbook.xlsx
- so this is what we need to focus on
choose an aspect to start with first, and then just keep on working towards it until I have the output
see chapter the for all emplyment stuff in pdf report
only the first two tables are time series
most of these are by Ejobs, Ejobs%, SEjobs, SEjobs%, Total jobs, followed by some percentages
For the sector summaries, this is
3.2 - Employed-selfemployed Compiled data 20XX sector * e/se with %
3.3 - Region (000's) Regions - all DCMS sectors region * e/se with % also "% of DCMS sector jobs in all regions", "% of all jobs in region"
The below sector sheets are all also region * e/se with % also "% of XXSectorXX jobs in all regions" Gambling and telecoms don't have E and SE columns
3.3a - Civil Society Civil Society - regional
3.3b - Creative Industries CI - regional
3.3c - Cultural Sector Cultural sector - regional
3.3d - Digital Sector Digital Sector - regional
3.3e - Gambling Gambling - regional
3.3f - Sport Sport - regional
3.3g - Telecoms Telecoms - regional
3.4 - Nationality (000's) EU breakdown nationality + e/se * sector
3.5 - Gender (000's) Gender breakdown sector * e/se + gender with %
3.6 - Ethnicity (000's) Ethnicity breakdown sector * ethnicity with %
3.7 - Age (000's) possibly from "Age breakdown" but there is less categories, not sure where this calculation was done sector * e/se + age
3.8 - Qualification (000's) Qualification breakdown sector * qualification
3.9 - Fulltime Parttime (000's) Full time Part time breakdown sector * ftpt with %
3.10 - NS-SEC (000's) probably "NSSEC breakdown" but has been recategorised as first four columns and second four columns sector * e/se + nssec
3.11 - SIC (000's) "Individual SIC" even though some annoymisation had been done, there is more done in the final table below sic + description * e/se
anon any value < 6
basically we will round everything under 6k down to zero, rather then just everything under 500.
for each row group with a total (or percentage of total, but I think in this case there is always a total), anon everything if (number of elements) - (number of anon) <= 1
then, if there is a total row group, anon elements in the total group if (number of row groups) - (number of corresponding elements in row groups which have been anonymised) <= 1
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.