Mort2Dsmooth: Fit Two-dimensional Poisson P-splines
In MortalitySmooth: Smoothing and Forecasting Poisson Counts with P-Splines

Description Usage Arguments Details Value Author(s) References See Also Examples

Returns an object of class Mort2Dsmooth which is a two-dimensional P-splines smooth of the input data of degree and order fixed by the user. Specifically tailored to mortality data.

Mort2Dsmooth(x, y, Z, offset, W, overdispersion=FALSE, 
             ndx = c(floor(length(x)/5), floor(length(y)/5)), 
             deg = c(3, 3), pord = c(2, 2), 
             lambdas = NULL, df = NULL, method = 1,
             coefstart = NULL,
             control = list())

`x`	vector for the abscissa of data. These must be at least `2 ndx[1] + 1` of them.
`y`	vector for the ordinate of data. These must be at least `2 ndx[2] + 1` of them.
`Z`	matrix of counts response. Dimensions of `Z` should correspond to the length of `x` and `y`, respectively.
`offset`	matrix with an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric matrix of dimensions of `Z` or a single numeric value.
`W`	an optional matrix of weights to be used in the fitting process. This should be `NULL` or a numeric matrix of dimensions of `Z` or a single numeric value.
`overdispersion`	logical on the accounting for overdisperion in the smoothing parameters selection criterion. See `Details`. Default: FALSE.
`ndx`	vector with the number of internal knots -1 for each axis. Default: [`floor(length(x)/5)`, `floor(length(y)/5)`].
`deg`	vector with the degree of the B-splines for each axis. Default: [3, 3].
`pord`	vector with the order of differences for each axis. Default: [2, 2].
`lambdas`	vector with smoothing parameters, possibly one for axis (optional).
`df`	a number which specifies the degrees of freedom (optional).
`method`	the method for controlling the amount of smoothing. `method = 1` (default) adjusts the two smoothing parameters so that the BIC is minimized. `method = 2` adjusts `lambdas` so that the AIC is minimized. `method = 3` uses the values supplied for `lambdas`. Isotropic smoothing is allowed in this method. `method = 4` adjusts `lambdas` so that the degrees of freedom is equal to the supplied `df`.
`coefstart`	an optional matrix of starting coefficients.
`control`	a list of control parameters. See `Details`.

The method fits a two-dimensional P-spline model with equally-spaced B-splines along x and y. The response variables must be a matrix of Poisson distributed counts. Offset can be provided, otherwise the default is that all weights are one.

The function is specifically tailored to smooth mortality data in one-dimensional setting. In such case the argument x would be the ages and the argument y the years under study. The matrix of death counts will be the argument Z. In a Poisson regression setting applied to actual death counts the offset will be the logarithm of the matrix of exposure population. See example below.

The function can obviously account for zero counts and definite offset. In a mortality context, the user can apply the function to data with zero deaths, but it has to take care that no exposures are equal to zero, i.e. offset equal to minus infinitive. In this last case, the argument W can help. The user would need to set weights equal to zero when exposures are equal to zero leading to interpolation of the data. See example below.

Regardless the presence of exposures equal to zero, the argument W can also be used for extrapolation and interpolation of the data. Nevertheless see the function predict.Mort2Dsmooth for a more comprehensive way to forecast mortality rates over ages and years.

The method produces results from a smoothing function which is the Kronecker product of B-spline basis over the two axes and include a discrete penalization directly on the differences of the B-splines coefficients. The user can set the order of difference, the degree of the B-splines and number of them for each of the axis. Nevertheless, the smoothing parameters lambdas are mainly used to tune the smoothness/model fidelity of the fitted values.

The ranges in which lambda is searched is given in control - RANGEx and RANGEy. Though they can be modified, the default values are suitable for most of the application.

There are print.Mort2Dsmooth, summary.Mort2Dsmooth, plot.Mort2Dsmooth predict.Mort2Dsmooth and residuals.Mort2Dsmooth methods available for this function.

Four methods for optimizing the smoothing parameters are available. The BIC is set as default. Minimization of the AIC is also possible. BIC will give always smoother outcomes with respect to AIC, especially for large sample size. Alternatively the user can directly provide the smoothing parameters (method=3) or the degrees of freedom to be used in the model (method=4). In this last case isotropic smoothing (same smoothing parameter over x and y) is employed. If the user provides only a single value for the argument lambdas, isotropic smoothing is applied (with warning). Note that Mort2Dsmooth uses approximated degrees of freedom, therefore method=4 will produce fitted values with degree of freedom only similar to the one provided in df. The tolerance level can be set via control - TOL2.

Note that the two-dimensional 'ultimate' smoothing with very large lambda will approach to a surface which is a product of two polynomial of degree pord[1] and pord[2], respectively. In particular, when pord=c(2,2) the 'ultimate' smoothing is a bi-linear surface over x and y.

The argument overdispersion can be set to TRUE when smoothing parameters selection has to account for possible presence of over(under)dispersion. Mortality data often present overdispersion also known, in demography, as heterogeneity. Duplicates in insurance data can lead to overdispersed data, too. Smoothing parameters selection may be affected by this phenomenon. When overdispersion=TRUE, the function uses a penalized quasi-likelihood method for including an overdisperion parameter (psi2) in the fitting procedure. With this approach expected values are assumed equal to the variance multiplied by the parameter psi2. See references. Note that with overdispersed data both BIC and AIC might select higher lambdas, leading to smoother outcomes. When overdispersion=FALSE (default value) or method=3 or method=4, psi2 is estimated after the smoothing parameters have been employed. Overdispersion parameter larger (smaller) than 1 may be a sign of overdispersion (underdispersion).

The control argument is a list that can supply any of the following components:

MON: Logical. If TRUE tracing information on the progress of the fitting is produced. Default: FALSE.

TOL1: The absolute convergence tolerance for each completed scoring algorithm. Default: 1e-06.

TOL2: Difference between two adjacent smoothing parameters in the (pseudo) grid search, log-scale. Useful only when method is equal to 1, 2 or 4. Default: 0.5.

RANGEx: Range of smoothing parameters over x in which the grid-search is applied, commonly taken in log-scale. Default: [10^-4 ; 10^6].

RANGEy: Range of smoothing parameters over y in which the grid-search is applied, commonly taken in log-scale. Default: [10^-4 ; 10^6].

MAX.IT: The maximum number of iterations for each completed scoring algorithm. Default: 50.

The arguments MON, TOL1 and MAX.IT are kept during all the (pseudo) grid search when method is equal to 1, 2 or 4. Function cleversearch from package svcm is employed to speed the grid search. See Mort2Dsmooth_optimize for details.

The inner functions work using an arithmetic of arrays defined as Generalized Linear Array Model (GLAM) (see references). In order to avoid construction of large Kronecker product basis from the large number of B-splines along the axes, the function profits of the special structure of both the data as rectangular array and the model matrix as tensor product. It uses sequence of nested matrix operations and this leads to low storage and high speed computation within the IWLS algorithm. Moreover, the function do not vectorize the whole system keeping the actual two-dimensional array structure within the scoring algorithm.

An object of the class Mort2Dsmooth with components:

`coefficients`	matrix of fitted (penalized) B-splines coefficients.
`residuals`	matrix of deviance residuals.
`fitted.values`	matrix of fitted counts.
`linear.predictor`	matrix of fitted linear predictor.
`logmortality`	fitted mortality rates in log-scale.
`df`	effective dimension.
`dev`	Poisson Deviance.
`aic`	Akaike's Information Criterion.
`bic`	Bayesian Information Criterion.
`psi2`	Overdispersion parameter.
`lambdas`	the selected (given) smoothing parameters.
`call`	the matched call.
`m`	length of argument `x`.
`n`	length of argument `y`.
`tolerance`	the used tolerance level.
`lev`	diagonal of the hat-matrix.
`ndx`	the number of internal knots -1 for each axis.
`deg`	degree of the B-splines for each axis.
`pord`	order of difference for each axis.
`x`	vector for the abscissa of data.
`y`	vector for the ordinate of data.
`Z`	matrix of counts response.
`offset`	matrix with an a priori known component.
`W`	matrix of weights.
`Bx`	the B-splines basis over `x`.
`By`	the B-splines basis over `y`.

Carlo G Camarda

Camarda, C. G. (2012). MortalitySmooth: An R Package for Smoothing Poisson Counts with P-Splines. Journal of Statistical Software. 50, 1-24. http://www.jstatsoft.org/v50/i01/.

Currie, I. D., M. Durban, and P. H. C. Eilers (2006). Generalized linear array models with applications to multidimentional smoothing. Journal of the Royal Statistical Society. Series B. 68, 259-280.

Eilers, P. H. C., I. D. Currie, and M. Durban (2006). Fast and compact smoothing on large multidimensional grids. Computational Statistics & Data Analysis. 50, 61-76.

predict.Mort2Dsmooth, plot.Mort2Dsmooth.

## selected data
ages <- 50:100
years <- 1950:2006
death <- selectHMDdata("Sweden", "Deaths", "Females",
                       ages = ages, years = years) 
exposure <- selectHMDdata("Sweden", "Exposures", "Females",
                          ages = ages, years = years)

## fit with BIC
fitBIC <- Mort2Dsmooth(x=ages, y=years, Z=death,
                       offset=log(exposure)) 
fitBIC
summary(fitBIC)

## plot age 50 log death rates (1st row)
plot(years, log(death[1,]/exposure[1,]),
     main="Mortality rates, log-scale.
           Swedish females, age 50, 1950:2006")
lines(years, fitBIC$logmortality[1,], col=2, lwd=2)

## plot over age and years
## fitted log death rates from fitBIC
grid. <- expand.grid(list(ages=ages, years=years))
grid.$lmx <- c(fitBIC$logmortality)
levelplot(lmx ~ years * ages , grid., 
          at=quantile(grid.$lmx, seq(0,1,length=10)),
          col.regions=rainbow(9))

## see vignettes for examples on
## - Extra-Poisson variation
## - interpolation
## - extrapolation

Loading required package: svcm
Loading required package: Matrix
Loading required package: splines
Loading required package: lattice
Call:
Mort2Dsmooth(x = ages, y = years, Z = death, offset = log(exposure))

Number of Observations              : 2907 
                    of which x-axis : 51 
                             y-axis : 57 
Effective dimension                 : 47.1391 
(Selected) smoothing parameters       
                         over x-axis: 1000 
                         over y-axis: 316.23 
Bayesian Information Criterion (BIC): 4118.3
Call:
Mort2Dsmooth(x = ages, y = years, Z = death, offset = log(exposure))

Number of Observations                      : 2907 
                            of which x-axis : 51 
                                     y-axis : 57 
Effective dimension                         : 47.14 
(Selected) smoothing parameters               
                                 over x-axis: 1000 
                                 over y-axis: 316.23 
Bayesian Information Criterion (BIC)        : 4118.3 
Akaike's Information Criterion (AIC)        : 3836.6 
(Estimated) overdispersion parameter (psi^2): 1.31 

Residuals:
     Min       1Q   Median       3Q      Max 
-4.49794 -0.80886 -0.05639  0.73204  4.02792 

Settings and control:
  x-axis: 13 B-splines of degree 3 - differences of order 2 
  y-axis: 14 B-splines of degree 3 - differences of order 2 
  convergence tolerance : 1.7056e-12