Summary: In this notebook, we create the BUNMD file by unifying the cleaned and "condensed" Numident files (deaths, applications, and claims) into a single file.

The function create_bunmd combines the information from the deaths, applications, and claims files into a file with one record per person: the Berkeley Unified Numident Mortality Database (BUNMD).
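To make the "one record per person" idea concrete, here is a minimal base-R sketch on toy data. It is not the create_bunmd implementation: the key column ssn, the other column names, and the fallback rule for birth year are all hypothetical, and it assumes each input file already has at most one row per person.

```r
# Toy inputs; every column name and value here is hypothetical.
deaths       <- data.frame(ssn = c(1, 2), dyear = c(1995, 2001))
applications <- data.frame(ssn = c(1, 3), byear = c(1910, 1922))
claims       <- data.frame(ssn = c(2, 3), byear_claim = c(1915, 1922))

# Full outer merges on the person key keep everyone who appears in any file.
unified <- Reduce(
  function(x, y) merge(x, y, by = "ssn", all = TRUE),
  list(deaths, applications, claims)
)

# Assumed rule for illustration only: prefer the application birth year,
# fall back to the claim birth year when the application value is missing.
unified$byear <- ifelse(is.na(unified$byear), unified$byear_claim, unified$byear)
unified$byear_claim <- NULL

# The result has exactly one row per person.
stopifnot(!anyDuplicated(unified$ssn))
```

A person who is missing from one source (for example, no death record) still appears in the result, with NA in the columns that source would have supplied.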

The functions create_weights_bunmd and create_weights_bunmd_complete create post-stratification weights, calibrated to death counts from the Human Mortality Database (HMD), for BUNMD Sample 1 and Sample 2.
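The general idea of post-stratification can be sketched as follows: within each cell, the weight is the reference count divided by the sample count. This is a toy illustration, not the code in create_weights_bunmd; the cell definition (sex by age at death), the column names, and all counts are made up.

```r
# Hypothetical sample counts by cell (sex x age at death).
sample_deaths <- data.frame(
  sex       = c("M", "M", "F", "F"),
  death_age = c(70, 71, 70, 71),
  n_sample  = c(50, 40, 25, 20)
)

# Hypothetical reference counts for the same cells (standing in for HMD deaths).
hmd_deaths <- data.frame(
  sex       = c("M", "M", "F", "F"),
  death_age = c(70, 71, 70, 71),
  n_hmd     = c(100, 100, 75, 80)
)

# Post-stratification weight: reference count / sample count in each cell.
cells <- merge(sample_deaths, hmd_deaths, by = c("sex", "death_age"))
cells$weight <- cells$n_hmd / cells$n_sample

# Weighted sample counts reproduce the reference counts cell by cell.
stopifnot(isTRUE(all.equal(cells$weight * cells$n_sample, cells$n_hmd)))
```

Each person in the sample would then carry the weight of the cell they fall into, so weighted tabulations match the reference distribution over those cells.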

library(here)
library(ipumsr)
library(data.table) ## fread() and fwrite()
library(dplyr)      ## %>%, left_join(), select()

source(here("R/create_bunmd.R"))
source(here("R/create_weights_bunmd.R"))
source(here("R/create_weights_bunmd_complete.R"))
## read in cleaned and "condensed" numident files
claims <- fread("/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/3_numident_files_cleaned_and_condensed/claims_condensed.csv")
deaths <- fread("/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/3_numident_files_cleaned_and_condensed/deaths_cleaned.csv")
applications <- fread("/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/3_numident_files_cleaned_and_condensed/apps_condensed.csv")

## combine records into one file
bunmd <- create_bunmd(claims = claims, applications = applications, deaths = deaths)

## construct weights for BUNMD 
bunmd <- create_weights_bunmd(bunmd)
bunmd <- create_weights_bunmd_complete(bunmd)
## create string variables 
ddi_extract <- read_ipums_ddi("/global/scratch/p2p3/pl1_demography/censoc/miscellaneous/ipums_1940_extract")

## extract geo codes
geo_codes <- ipums_val_labels(ddi_extract, BPLD) 

## join geo code labels for birthplace (bpl) and socstate
bunmd <- bunmd %>% 
  left_join(geo_codes %>% 
               select(bpl = val, bpl_string = lbl), by = "bpl") %>% 
  left_join(geo_codes %>% 
               select(socstate = val, socstate_string = lbl), by = "socstate")

## write out BUNMD file
fwrite(bunmd, "/global/scratch/p2p3/pl1_demography/censoc_internal/data/numident/4_berkeley_unified_mortality_database/bunmd.csv")
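The label join above can be illustrated on toy data. The sketch below uses base R's match() as a stand-in for left_join so that it runs without extra packages; the lookup table mimics the val/lbl shape returned by ipums_val_labels, and the codes, labels, and the people table are illustrative only.

```r
# Hypothetical value-label table with the columns val (code) and lbl (label).
geo_codes <- data.frame(
  val = c(100, 200, 600),
  lbl = c("Alabama", "Alaska", "California"),
  stringsAsFactors = FALSE
)

# Hypothetical person records; code 999 has no label in the lookup table.
people <- data.frame(ssn = 1:3, bpl = c(600, 100, 999))

# Left-join behaviour: every person row is kept, in order, and codes
# without a matching label get NA.
people$bpl_string <- geo_codes$lbl[match(people$bpl, geo_codes$val)]
```

Counting the NA values in the new string column after such a join is a quick way to spot codes that are missing from the label table.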

Recap of the decision rules we used to create the BUNMD:



caseybreen/censocdev documentation built on June 11, 2025, 9:13 p.m.