In numbats/yowie: Longitudinal Wages Data from the National Longitudinal Survey of Youth 1979

```{css, echo = FALSE}
.source {
  font-size: 0.7em;
  background-color: #ffcc66;
  padding: 10px;
}

.source .sourceCode {
  background-color: #ffffe6
}
```

```r
library(knitr)
opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  cache = TRUE,
  cache.path = "cache/",
  warning = FALSE,
  message = FALSE
)
# work around as per https://github.com/yihui/knitr/issues/1647
rc <- read_chunk
rc(here::here("data-raw/data-preprocessing.R"))
```

The data cleaning is done with the tidyverse suite of packages.

Reading the data {#read-data}

Click here for the source file to read data (note this is quite long)

The code below is provided by the NLSY79 database to read and initially process the data. We did not modify this script except for the location of the file.

The above source code creates two data sets, `new_data_qnames` and `categories_qnames`. As shown below, the column names encode the job number (HRP1 = job 1, HRP2 = job 2, ..., HRP5 = job 5) and the year.

str(categories_qnames, list.len = 20)
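
Because the job number and year are encoded in the column names, they can be recovered with a regular expression. The following is a minimal sketch in base R, not the package's own code. The `HRP<job>_<year>` naming pattern and the example names are assumptions made for illustration; the real names in `categories_qnames` may be laid out differently.

```r
# Hypothetical column names: the "HRP<job>_<year>" pattern is an assumption
# made for illustration, not taken from the NLSY79 extract itself.
col_names <- c("HRP1_1979", "HRP2_1979", "HRP1_1980", "HRP5_1994")

# One pattern with two capture groups: the job number and the four-digit year.
pattern <- "^HRP([1-5])_([0-9]{4})$"

# Pull each capture group out with sub() and convert it to an integer.
parsed <- data.frame(
  column = col_names,
  job    = as.integer(sub(pattern, "\\1", col_names)),
  year   = as.integer(sub(pattern, "\\2", col_names))
)

parsed
```

Keeping the job and year as separate integer columns makes it straightforward to reshape the wide survey columns into one row per individual, job, and year.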

Demographic variables

Tidying the date of birth data

The month and year of birth are recorded in 1979 and 1981 for each individual. The 1981 records are missing for some individuals, so we take the month and year of birth from the 1979 records.

Where a record is present for both 1979 and 1981, we check that the two records match.

cat("All birth month and year recorded in 1979 and 1981 match.")

cat("The birth record does not match for the following individuals.\n")
dob_tidy %>%
  filter(dob_conflict)

as_tibble(dob_tidy)
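
The consistency check described above can be sketched on toy data as follows. This is an illustration in base R under stated assumptions, not the preprocessing script itself: the data frame `dob`, its column names other than `dob_conflict`, and all values are hypothetical.

```r
# Toy records; the column names and values are hypothetical.
dob <- data.frame(
  id         = 1:4,
  year_1979  = c(1961, 1958, 1963, 1960),
  month_1979 = c(3, 11, 7, 1),
  year_1981  = c(1961, NA, 1964, 1960),
  month_1981 = c(3, NA, 7, 1)
)

# Take the date of birth from the 1979 records, which are complete here.
dob$dob_year  <- dob$year_1979
dob$dob_month <- dob$month_1979

# Flag a conflict only where a 1981 record exists and disagrees with 1979.
has_1981 <- !is.na(dob$year_1981) & !is.na(dob$month_1981)
dob$dob_conflict <- has_1981 &
  (dob$year_1979 != dob$year_1981 | dob$month_1979 != dob$month_1981)

# Report the outcome of the check.
if (any(dob$dob_conflict)) {
  cat("The birth record does not match for the following individuals.\n")
  print(dob[dob$dob_conflict, c("id", "year_1979", "year_1981")])
} else {
  cat("All birth month and year recorded in 1979 and 1981 match.\n")
}
```

Guarding the comparison with `has_1981` keeps a missing 1981 record from producing `NA` in the conflict flag, so an individual with only a 1979 record is never reported as a mismatch.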

Getting the race and sex data

as_tibble(demog_tidy)

Tidying the education data

as_tibble(demog_education)

Getting the highest year completed

as_tibble(highest_year)

`demog_nlsy79`

as_tibble(demog_nlsy79)

Tidying the employment information

as_tibble(hours_all)

as_tibble(rates_all)

as_tibble(st_work)

as_tibble(exp)

as_tibble(hours_wages)

Subsetting to the high school population

as_tibble(wages_demog)
as_tibble(wages_before)

numbats/yowie documentation built on June 7, 2022, 10:29 a.m.
