In caseybreen/wcensoc:

In this notebook, we take the Numident files with the three .csv files (apps, claims, deaths) with the original NARA variable names and take several steps to clean and harmonize te data:

Rename variable names
Recode values to NA ("0", "", " ", "unk", "un", "unknown" were all recoded to 0).
Add a "socstate" variable for the death records, reporting the state in which the Social Security number was issued
Remove all masked records (e.g. zzzz records)

Library Packages

library(censocdev)
library(tidyverse)
library(data.table)

Clean and write out death files

## clean deaths AND add socstate BPL variable
deaths <- clean_deaths(numdeath.file.path = "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/1_numident_files_with_original_varnames/deaths_original.csv",
                       ssn_state_crosswalk_path = "/global/scratch/p2p3/pl1_demography/censoc/crosswalks/ssn_to_state_crosswalk.csv")

## write out cleaned death records
fwrite(deaths, "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/2_numident_files_cleaned/deaths_cleaned.csv")
fwrite(deaths, "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/3_numident_files_cleaned_and_condensed/deaths_cleaned.csv")

Clean and write out application files

one fread table gives an ominous warning — but this can be ignored. TODO suppress / change code to avoid warning

apps <- clean_apps(numapp.file.path = "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/1_numident_files_with_original_varnames/applications_original.csv")

## write out cleaned application files
fwrite(apps, "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/2_numident_files_cleaned/apps_cleaned.csv")

Clean and write out claims files

one fread table gives an ominous warning — but this can be ignored. TODO suppress / change code to avoid warning

claims <- clean_claims(numclaim.file.path = "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/1_numident_files_with_original_varnames/claims_original.csv")

## write out cleaned claims files
fwrite(claims, "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/2_numident_files_cleaned/claims_cleaned.csv")

caseybreen/wcensoc documentation built on June 11, 2025, 4:14 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

caseybreen/wcensoc

In caseybreen/wcensoc:

R Package Documentation

Browse R Packages

We want your feedback!