rhdata: Data formatting utility for the extended (Stratified) LC...


Description

Creates an object of class rhdata suitable for fitting the extended SLC model with the iterative fitting method elca.rh. It transforms two-dimensional survival data (one record per individual) into three-dimensional arrays of population (exposure), deaths and mortality rates indexed by age, calendar time and the additional covariate(s).

Usage

rhdata(dat, covar, xbreaks = 60:96, xlabels = 60:95, 
		ybreaks = mdy.date(1, 1, 1999:2008), ylabels = 1999:2007, 
		name = NULL, label = NULL)

Arguments

dat

A data.frame containing individual survival records along with the values of the additional covariate(s). The data set must contain the following named columns:
- 'event' = binary indicator of the survival event (1 = failure/death, 0 = survived);
- 'dob' = Julian date of birth (or origin) of the survival time;
- 'dev' = Julian date of the event (or end) of the survival time.
In addition, there must be at least one extra column holding the observations of an additional covariate (e.g. a socio-economic factor).
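As a minimal sketch of the expected layout, the following builds a three-row data set with the required columns. The values and the covariate name 'ses' are invented for illustration, and it is an assumption that rhdata accepts a hand-built data set like this one; the dates are created with mdy.date from the date package, the same function used for the ybreaks default.

```r
library(date)  # provides mdy.date(month, day, year)

# Hypothetical records: event indicator plus one factor covariate
toy <- data.frame(
  event = c(1, 0, 1),
  ses   = factor(c("low", "high", "low"))
)

# Julian dates of birth and of event (or end of observation)
toy$dob <- mdy.date(c(3, 7, 11), c(15, 1, 30), c(1935, 1938, 1930))
toy$dev <- mdy.date(c(6, 12, 2), c(20, 31, 14), c(2003, 2005, 2001))

# Approximate age at event in years (365.25 days per year)
age_at_event <- as.numeric(toy$dev - toy$dob) / 365.25
```

All three ages at event fall inside the default age range of 60 to 95, so each record would contribute to at least one age group.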

covar

(partial) covariate name(s) or position number(s) in the dat data set. The covariate(s) must be of class 'factor'.

xbreaks

a sequence of age break points (including the starting and closing values) used to sub-group the input data set dat when calculating age-specific exposures and mortality rates. The default, 60:96, corresponds to the integer ages 60 to 95.

xlabels

a sequence of age labels to be used for the sequence defined in xbreaks.
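The defaults imply one label per interval, so xlabels is one element shorter than xbreaks. The sketch below illustrates that convention with base R's cut(); it is not taken from the rhdata source, and the choice of left-closed intervals (right = FALSE) is an assumption made for the illustration.

```r
xbreaks <- 60:96
xlabels <- 60:95

# n intervals need n + 1 break points
stopifnot(length(xlabels) == length(xbreaks) - 1)

# Hypothetical exact ages, grouped into [60,61), [61,62), ..., [95,96)
ages   <- c(60.0, 64.7, 95.9)
groups <- cut(ages, breaks = xbreaks, labels = xlabels, right = FALSE)
as.character(groups)
```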

ybreaks

a sequence of year break points (as Julian calendar dates) used to sub-group the input data set dat when calculating year-specific exposures and mortality rates. The default, mdy.date(1, 1, 1999:2008), gives whole calendar-year intervals running from 1 January 1999 to 1 January 2008.

ylabels

a sequence of year labels to be used for the sequence defined in ybreaks.
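A quick check of the default year grid, assuming the date package is installed: the ten break points delimit nine intervals, one per label, and each interval is a whole calendar year.

```r
library(date)  # provides mdy.date()

ybreaks <- mdy.date(1, 1, 1999:2008)  # 1 January of each year
ylabels <- 1999:2007

# One label per interval between consecutive break points
stopifnot(length(ylabels) == length(ybreaks) - 1)

# Interval widths in days: 365, or 366 for leap years
widths <- diff(as.numeric(ybreaks))
```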

name

name of subset data series (e.g. male, female or total)

label

label (name) of overall data source (e.g. CMI)

Details

Although the rhdata function can sub-group the input data by more than one additional covariate (which may be useful for preliminary analysis), the fitting method implemented in elca.rh can handle only a single additional covariate. There are currently no generic methods to plot or to extract parts of an rhdata object; the Examples section illustrates how this can be done by hand.
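In the absence of extraction methods, plain array indexing is enough. The sketch below uses a small hand-made list that mimics the documented components; slice_level is a hypothetical helper, not part of ilc, and it assumes the covariate levels are stored as character strings.

```r
# Toy stand-in for an rhdata object: 2 ages x 2 years x 2 covariate levels
toy <- list(
  age        = 60:61,
  year       = 2000:2001,
  covariates = c("0", "1"),
  deaths     = array(1:8, dim = c(2, 2, 2)),
  pop        = array(100 * (1:8), dim = c(2, 2, 2))
)

# Hypothetical helper: return one covariate level as an age-by-year matrix
slice_level <- function(x, what, level) {
  k <- match(level, x$covariates)
  if (is.na(k)) stop("unknown covariate level: ", level)
  m <- x[[what]][, , k]
  dimnames(m) <- list(age = x$age, year = x$year)
  m
}

d1 <- slice_level(toy, "deaths", "1")
```

The resulting matrix can be passed directly to matplot, as in the Examples section.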

Value

A list object of class rhdata with the following components:

year

vector of year labels

age

vector of age labels

covariates

vector of levels of the additional covariate

deaths

3-dimensional array of number of deaths (by age-year-covariate)

pop

3-dimensional array of population (exposure) (by age-year-covariate)

mu

3-dimensional array of central mortality rates (by age-year-covariate)

label

label (name) of overall data source

name

name of subset data series
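The three arrays share the same age-year-covariate layout. A central mortality rate is conventionally deaths divided by exposure, so under the assumption that mu is that plain ratio (not verified against the rhdata source), the relationship between the components can be sketched on invented numbers:

```r
# Toy arrays: 2 ages x 2 years x 1 covariate level (values are made up)
dn <- list(age = c("60", "61"), year = c("2000", "2001"), cov = "0")
deaths <- array(c(2, 5, 3, 8),         dim = c(2, 2, 1), dimnames = dn)
pop    <- array(c(200, 250, 150, 400), dim = c(2, 2, 1), dimnames = dn)

# Central mortality rate, element by element
mu <- deaths / pop

rate <- mu["61", "2001", "0"]  # 8 deaths over 400 person-years of exposure
```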

Author(s)

Z. Butt and S. Haberman and H. L. Shang

References

Renshaw, A. E. and Haberman, S. (2003a), “Lee-Carter mortality forecasting: a parallel generalised linear modelling approach for England and Wales mortality projections”, Journal of the Royal Statistical Society, Series C, 52(1), 119-137.

Renshaw, A. E. and Haberman, S. (2003b), “Lee-Carter mortality forecasting with age specific enhancement”, Insurance: Mathematics and Economics, 33, 255-272.

Renshaw, A. E. and Haberman, S. (2006), “A cohort-based extension to the Lee-Carter model for mortality reduction factors”, Insurance: Mathematics and Economics, 38, 556-570.

Renshaw, A. E. and Haberman, S. (2008), “On simulation-based approaches to risk measurement in mortality with specific reference to Poisson Lee-Carter modelling”, Insurance: Mathematics and Economics, 42(2), 797-816.

Renshaw, A. E. and Haberman, S. (2009), “On age-period-cohort parametric mortality rate projections”, Insurance: Mathematics and Economics, 45(2), 255-270.

See Also

elca.rh, dd.rfp, demogdata, mdy.date

Examples

library(ilc)  # provides rhdata()
# See data set 'tab' provided in the ilc package
# names(tab)
# [1] "refno" "dob"   "dev"   "event" "cov1"  "cov2"
# Get multidimensional survival data: 
mdat <- rhdata(tab, covar='cov2', xbreaks=60:96, xlabels=60:95,
  ybreaks=mdy.date(1,1,2000:2006), ylabels=2000:2005, name='M', label='CMI')
# Warning: although rhdata() can sub-group by more than one covariate, e.g.
#   covar=c('cov1', 'cov2'), the SLC fitting currently works only with
#   a single extra covariate.

# print data summary:
mdat
#Multidimensional Mortality data for: MDat [M] 
#Across covariates:
#         years: 2000 - 2005
#         ages:  60 - 95
#         cov2: 0, 1, 2, 3
# Graphical illustrations of mdat data levels (by the additional factor):
# plot of exposures:
matplot(mdat$age, mdat$pop[,,1], type='l', xlab='Age', ylab='Ec', main='Base Level')
matplot(mdat$age, mdat$pop[,,2], type='l', xlab='Age', ylab='Ec', main='Level 1')
# plot of deaths:
matplot(mdat$age, mdat$deaths[,,1], type='l', xlab='Age', ylab='D', main='Base Level')
matplot(mdat$age, mdat$deaths[,,2], type='l', xlab='Age', ylab='D', main='Level 1')
# plot of log mortality rates:
matplot(mdat$age, log(mdat$mu[,,1]), type='l', xlab='Age', ylab='log(mu)', main='Base Level')
matplot(mdat$age, log(mdat$mu[,,2]), type='l', xlab='Age', ylab='log(mu)', main='Level 1')

ilc documentation built on May 2, 2019, 5:07 a.m.