Fit LeeCartertype models for rates to arbitrarily shaped observations of rates in a Lexis diagram.
Description
The LeeCarter model is originally defined as a model for rates
observed in Asets (age by period) of a Lexis diagram, as
log(rate(x,t)) = a(x) + b(x)k(t), using one parameter per age(x) and
period(t). This function uses natural splines for a(), b() and k(),
placing knots for each effect such that the number of events is the
same between knots. Also fits the continuous time counterparts of all
models supported by the lca.rh
function from the
ilc
package (see details).
Usage
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30  LCa.fit( data, A, P, D, Y,
model = "APa", # or one of "ACa", "APaC", "APCa" or "APaCa"
a.ref, # age reference for the interactions
pi.ref = a.ref, # age reference for the period interaction
ci.ref = a.ref, # age reference for the cohort interaction
p.ref, # period reference for the intercation
c.ref, # cohort reference for the interactions
npar = c(a = 6, # no. knots for main ageeffect
p = 6, # no. knots for peroideffect
c = 6, # no. knots for cohorteffect
pi = 6, # no. knots for age in the period interaction
ci = 6), # no. knots for age in the cohort interaction
VC = TRUE, # numerical calculation of the Hessia?
alpha = 0.05, # 1 minus confidence level
eps = 1e6, # convergence criterion
maxit = 100, # max. no iterations
quiet = TRUE ) # cut the crap
## S3 method for class 'LCa'
print( x, ... )
## S3 method for class 'LCa'
summary( object, show.est=FALSE, ... )
## S3 method for class 'LCa'
plot( x, ... )
## S3 method for class 'LCa'
predict( object, newdata,
alpha = 0.05,
level = 1alpha,
sim = ( "vcov" %in% names(object) ),
... )

Arguments
data 
A data frame. Must have columns 
A 
Vector of ages (midpoint of observation). 
P 
Vector of period (midpoint of observation). 
D 
Vector of no. of events. 
Y 
Vector of persontime. Demographers would say "exposure", bewildering epidemiologists. 
a.ref 
Reference age for the ageinteraction term(s) 
pi.ref 
Same, but specifically for the interaction with period. 
ci.ref 
Same, but specifically for the interaction with cohort. 
p.ref 
Reference period for the timeinteraction term 
c.ref 
Reference period for the timeinteraction term 
model 
Character, either 
npar 
A (possibly named) vector or list, with either the number of knots or
the actual vectors of knots for each term. If unnamed, components are
taken to be in the order (a,b,t), if the model is "APaCa" in the order
(a,p,c,pi,ci). If a vector, the three integers indicate the number of
knots for each term; these will be placed so that there is an equal
number of events ( 
VC 
Logical. Should the variancecovariance matrix of the parameters be computed by numerical differentiation? See details. 
alpha 
1 minus the confidence level used when calculating
confidence intervals for estimates in 
eps 
Convergence criterion for the deviance, we use the the relative difference between deviance from the two models fitted. 
maxit 
Maximal number of iterations. 
quiet 
Shall I shup up or talk extensively to you about iteration progression etc.? 
object 
An 
show.est 
Logical. Should the estimates be printed? 
x 
An 
newdata 
Prediction data frame, must have columns 
level 
Confidence level. 
sim 
Logical or numeric. If 
... 
Additional parameters passed on to the method. 
Details
The LeeCarter model is nonlinear in age and time so does not fit
in the classical glmPoisson framework. But for fixed b(x)
it
is a glm, and also for fixed a(x)
, k(t)
. The function
alternately fits the two versions until the same fit is produced (same
deviance).
The multiplicative age by period term could equally well have been a multiplicative age by cohort or even both. Thus the most extensive model is:
\log(lambda(a,p)) = f(a) + b_p(a)k_p(a) + b_c(a)k_c(a)
log(lambda(a,p)) = f(a) + b_p(a)k_p(a) + b_c(a)k_c(a)
The naming convention for the models is a capital P
and/or
C
if the effect is in the model followed by a lower case
a
if there is an interaction with age. Thus there are 5 different
models that can be fitted: APa
, ACa
, APaC
APCa
and APaCa
.
The standard errors of the parameters from the two model fits are however
wrong; they are conditional on some of terms having a fixed value. And
the symbolic calculation of the Hessian is a nightmare, so this is done
numerically using the hessian
function from the numDeriv
package if VC=TRUE
.
The coefficients and the variancecovariance matrix of these are used
in predict.LCa
for a parametric bootstrap (that is, a simulation from
a multivariate normal with mean equal to the parameterestiamtes and
variance as the estimated variancecovariance) to get confidence
intervals for the predictions if sim
is TRUE
— which
it is by default if they are part of the object.
The plot
for LCa
objects merely produces between 3 and 5
panels showing each of the terms in the model. These are mainly for
preliminary inspection; real reporting of the effects should use
proper relative scaling of the effects.
Value
LCa.fit
returns an object of class LCa
(smooth
effects L
eeCa
rter model); it is a list with the
following components:
model 
Character, either 
ax 
3column matrix of ageeffects, c.i. from the agetime model. Rownames are the unique occurring ages in the dataset. Estimates are rates. 
pi 
3column matrix of ageperiod interaction effects, c.i. from the age
model. Rownames are the actually occurring ages in the
dataset. Estimates are multipliers of the logRRs in 
kp 
3column matrix of periodeffects, with c.i.s from the
agetime model. Rownames are the actually occurring times in the
dataset. Estimates are rateratios centered at 1 at 
ci 
3column matrix of agecohort interaction effects, c.i. from the age
model. Rownames are the actually occurring ages in the
dataset. Estimates are multipliers of the logRRs in 
kc 
3column matrix of periodeffects, with c.i.s from the agetime
model. Rownames are the actually occurring times in the
dataset. Estimates are rateratios centered at 1 at 
mod.at 

mod.b 

coef 
All coefficients from both models, in the order 
vcov 
Variancecovariance matrix of coefficients from both
models, in the same order as in the 
knots 
List of vectors of knots used in for the age, perido and cohort effects. 
refs 
List of reference points used for the age, period and cohort terms in the interactions. 
deviance 
Deviance of the model 
df.residual 
Residual degrees of freedom 
iter 
Number of iterations used to reach convergence. 
plot.LCa
plots the estimated effects in separate panels,
using a logscale for the baseline rates (ax
) and the timeRR
(kt
). For the APaCa
model 5 panels are plotted.
summary.LCa
returns (invisibly) a matrix with the parameters
from the models and a column of the conditional se.s and of the se.s
derived from the numerically computed Hessian (if LCa.fit
were
called with VC=TRUE
.)
predict.LCa
returns a matrix with one row per row in
newdata
. If LCa.fit
were called with VC=TRUE
there will be 3 columns, namely prediction (1st column) and c.i.s
based on a simulation of parameters from a multivariate normal with
mean coef
abd cariance vcov
using the median and
alpha
/2 quantiles from the sim
simulations. If
LCa.fit
were called with VC=FALSE
there
will be 6 columns, namely estimates and c.i.s from agetime model (mod.at
), and
from the ageinteraction model (mod.b
), both using conditional variances.
Author(s)
Bendix Carstensen, http://BendixCarstensen.com
See Also
apc.fit
,
apc.LCa
,
lca.rh
,
lca
Examples
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35  library( Epi )
# Load the male lung cancer data by Lexis triangles and
# rename age and period as required by LCa.fit
data( testisDK )
tc < subset( testisDK, A>14 & A<60 )
head( tc )
# We want to see rates per 100,000 PY
tc$Y < tc$Y / 10^5
# Fit the LeeCarter model with ageperiod interaction (default)
LCa.tc < LCa.fit( tc, model="ACa", a.ref=30, p.ref=1980, quiet=FALSE, eps=10e5, maxit=100 )
LCa.tc
summary( LCa.tc )
# Inspect what we got
names( LCa.tc )
# show the estimated effects
par( mfrow=c(1,3) )
plot( LCa.tc )
# Prediction data frame for ages 15 to 60 for three time points:
nd < data.frame( A=15:60 )
p50 < predict.LCa( LCa.tc, newdata=cbind(nd,P=1950), sim=10000 )
p70 < predict.LCa( LCa.tc, newdata=cbind(nd,P=1970), sim=10000 )
p90 < predict.LCa( LCa.tc, newdata=cbind(nd,P=1990), sim=10000 )
# Inspect the curves from the parametric bootstrap (simulation):
par( mfrow=c(1,1) )
matplot( nd$A, cbind(p50,p70,p90), type="l", lwd=c(6,3,3), lty=1,
col=rep( c("red","green","blue"), each=3 ), log="y",
ylab="Testis cancer incidence per 100,000 PY, 1970, 80, 90", xlab="Age" )
