readHMD: 'readHMD()' reads a standard HMD .txt table as a 'data.frame'

readHMD R Documentation

readHMD() reads a standard HMD .txt table as a data.frame

Description

This calls read.table() with defaults suited to the HMD file layout, so the table parses without manual tuning. The Age column is stripped of "+" and converted to integer, and a logical indicator column called OpenInterval is added to mark the rows where the "+" appeared. If the file contains population counts, values are split into two columns, one for Jan 1 and one for Dec 31 of each year. Output is returned invisibly to avoid lengthy console printouts, so you must assign it to inspect it.
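The parsing described above can be sketched in base R. This is an illustrative approximation, not the package's source: the toy file contents, the made-up country name, and the particular read.table() arguments (skip = 2, na.strings = ".") are assumptions about a typical HMD text layout.

```r
# Toy file in an HMD-like layout (all values are made up for illustration):
# a title line, a blank line, a column header, then whitespace-separated data.
hmd_lines <- c(
  "Toyland, Death rates (period 1x1)   Last modified: 01 Jan 2000",
  "",
  "  Year          Age             Female            Male           Total",
  "  1900           0             0.100000        0.120000        0.110000",
  "  1900           1             0.030000               .        0.030000",
  "  1900         110+            0.700000        0.800000        0.750000"
)
tmp <- tempfile(fileext = ".txt")
writeLines(hmd_lines, tmp)

# Assumed defaults: skip the two lines above the header, treat "." as missing,
# and keep character columns as character rather than converting to factors.
dat <- read.table(tmp, header = TRUE, skip = 2, na.strings = ".", as.is = TRUE)

# Age is read as character because of the trailing "+" on the open interval.
# Flag those rows first, then strip the "+" and convert to integer.
dat$OpenInterval <- grepl("+", dat$Age, fixed = TRUE)
dat$Age <- as.integer(gsub("+", "", dat$Age, fixed = TRUE))

str(dat)
```

With readHMD() itself, the equivalent step is a single assignment such as dat <- readHMD(filepath), since the result is returned invisibly.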

Usage

readHMD(filepath, fixup = TRUE, ...)

Arguments

filepath

path or connection to the HMD text file, including .txt suffix.

fixup

logical. Should columns be made more user-friendly, e.g., forcing Age to be integer?

...

other arguments passed to read.table(); rarely needed.

Details

Population counts in the HMD typically refer to Jan 1st. The exception is years in which a territorial adjustment has been accounted for in the estimates. For such years, 'YYYY-' refers to Dec 31 of the year before the adjustment, and 'YYYY+' refers to Jan 1 directly after the adjustment (adjustments are always made on Jan 1st). In the raw data this looks like two different estimates for the same year, but it actually reflects a definition change or similar. To remove headaches from potential territorial adjustments, we simply create two columns, one for January 1st (e.g., "Female1") and another for Dec 31st (e.g., "Female2"). One can recover the adjustment coefficient for each year by taking the ratio $$Vx = P1(t+1) / P2(t)$$, where P1 is the January 1st column and P2 is the Dec 31st column. In most years this will be 1, but in adjustment years it differs from 1. This must always be accounted for when calculating rates and exposures. The work controlled by the fixup argument is delegated to HMDparse().
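As a hedged illustration of the ratio above, the following base-R sketch computes Vx from a toy data frame shaped like the population output (columns Year, Age, Female1, Female2). The numbers are invented, with an adjustment placed at the start of 1952, and the helper names next_jan, P1_next, and vx are hypothetical.

```r
# Toy population table: Female1 = Jan 1 count, Female2 = Dec 31 count.
# Values are made up; an adjustment is assumed to occur on Jan 1, 1952.
pop <- data.frame(
  Year    = rep(1950:1952, each = 2),
  Age     = rep(0:1, times = 3),
  Female1 = c(100, 200, 110, 210, 132, 252),
  Female2 = c(110, 210, 120, 210, 140, 260)
)

# Shift the Jan 1 counts back one year so that P1(t+1) lines up with P2(t).
next_jan <- data.frame(
  Year    = pop$Year - 1,
  Age     = pop$Age,
  P1_next = pop$Female1
)

# Inner join on Year and Age keeps only years t for which t+1 is available.
vx <- merge(pop[, c("Year", "Age", "Female2")], next_jan, by = c("Year", "Age"))
vx <- vx[order(vx$Year, vx$Age), ]

# Adjustment coefficient: Vx = P1(t+1) / P2(t)
vx$Vx <- vx$P1_next / vx$Female2

print(vx)
```

In this toy data the coefficient is 1 for 1950 and differs from 1 for 1951, the year whose Dec 31 count precedes the assumed adjustment.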

Value

data.frame of standard HMD output, except the Age column has been cleaned, and a new open age indicator column has been added. If the file is Population.txt or Population5.txt, there will be two columns each for males and females.

Note

Function written by Tim Riffe.


HMDHFDplus documentation built on July 9, 2023, 6:26 p.m.