knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(magick)
library(tidyverse)
library(tidyABS)

Tidying excel sheets from The Survey of Earned Doctorates

Here are a couple of examples using tidyABS on a non-ABS excel table.

Table 12. Doctorate recipients, by major field of study: Selected years, 1987–2017

Minimal code

Here's a mininal example.

tidyABS_example("PhD_major-field.xlsx") %>%
  process_sheet(path = ., sheets = 1, manual_value_references = "B6:O50") %>%
  change_direction("row_group_01", "WNW") %>%
  change_direction("row_group_02", "WNW") %>%
  assemble_table_components() %>%
  str()
Step-through

First, read in the sheet, specifying table corners.

phd_field_df_components <-
  tidyABS_example("PhD_major-field.xlsx") %>%
  process_sheet(path = ., sheets = 1, manual_value_references = "B6:O50")

Check the orietation of cells and make corrections.

phd_field_df_components %>% plot_table_components()

phd_field_df_components <-
  phd_field_df_components %>%
  change_direction("row_group_01", "WNW") %>%
  change_direction("row_group_02", "WNW")

Check the structure of the table implied by phd_field_df_components against the original excel table. If tidyABS has not identified row and column names correctly, the output below will not line up with the original excel table.

phd_field_df_components %>% reconstruct_table() 

Using View() is useful here:

magick::image_read("phd_field_df_components_view.png")

Assemble.

phd_field_df <-
  phd_field_df_components %>%
  assemble_table_components()

phd_field_df %>% glimpse()

Table 22. Doctorate recipients, by subfield of study, citizenship status, ethnicity, and race: 2017

Minimal code

Here's a mininal example.

tidyABS_example("PhD_ subfield-citizenship-status-ethnicity-race.xlsx") %>%
  process_sheet(path = ., sheets = 1, manual_value_references = "B7:L277") %>%
  change_direction("row_group_01", "WNW") %>%
  change_direction("row_group_02", "WNW") %>%
  change_direction("row_group_03", "WNW") %>%
  change_direction("row_group_04", "WNW") %>%
  assemble_table_components() %>%
  glimpse()
Step-through

First, read in the sheet, specifying table corners.

phd_background_components <-
  tidyABS_example("PhD_ subfield-citizenship-status-ethnicity-race.xlsx") %>%
  process_sheet(path = ., sheets = 1, manual_value_references = "B7:L277")

Check the orietation of cells and make corrections.

phd_background_components %>% plot_table_components() + ylim(-30, 0)

phd_background_components <-
  phd_background_components %>%
  change_direction("row_group_01", "WNW") %>%
  change_direction("row_group_02", "WNW") %>%
  change_direction("row_group_03", "WNW") %>% 
  change_direction("row_group_04", "WNW")

Check the structure of the table implied by phd_field_df_components against the original excel table. If tidyABS has not identified row and column names correctly, the output below will not line up with the original excel table.

phd_background_components %>% reconstruct_table()  

Using View() is useful here:

magick::image_read("phd_background_components_view.png")

Assemble.

phd_background_df <-
  phd_background_components %>%
  assemble_table_components()


phd_background_df %>% glimpse()


ianmoran11/tidyABS documentation built on May 30, 2019, 4:03 a.m.