Natural splines - (cubic splines linear beyond outermost knots) with convenient specification of knots and possibility of centering, detrending and clamping.

Share:

Description

This function is partly for convenient specification of natural splines in practical modeling. The convention used is to take the smallest and the largest of the supplied knots as boundary knots. It also has the option of centering the effects provided at a chosen reference point as well as projecting the columns on the orthogonal space to that spanned by the intercept and the linear effect of the variable, and finally fixing slopes beyond boundary knots (clamping).

Usage

1
2
3
4
5
6
Ns( x, ref = NULL, df = NULL,
                knots = NULL,
            intercept = FALSE,
       Boundary.knots = NULL,
                fixsl = c(FALSE,FALSE),
              detrend = FALSE )

Arguments

x

A variable.

ref

Scalar. Reference point on the x-scale, where the resulting effect will be 0.

df

degrees of freedom.

knots

knots to be used both as boundary and internal knots. If Boundary.knots are given, this will be taken as the set of internal knots.

intercept

Should the intercept be included in the resulting basis? Ignored if any of ref or detrend is given.

Boundary.knots

The boundary knots beyond which the spline is linear. Defaults to the minimum and maximum of knots.

fixsl

Specification of whether slopes beyond outer knots should be fixed to 0. FALSE correponds to no restriction; a curve with 0 slope beyond the upper knot is obtained using c(FALSE,TRUE). Ignored if !(detrend==FALSE).

detrend

If TRUE, the columns of the spline basis will be projected to the orthogonal of cbind(1,x). Optionally detrend can be given as a vector of non-negative numbers used to define an inner product as diag(detrend) for projection on the orthogonal to cbind(1,x). The default is projection w.r.t. the inner product defined by the identity matrix.

Value

A matrix of dimension c(length(x),df) where either df was supplied or if knots were supplied, df = length(knots) - intercept. Ns returns a spline basis which is centered at ref. Ns with the argument detrend=TRUE returns a spline basis which is orthogonal to cbind(1,x) with respect to the inner product defined by the positive definite matrix diag(weight) (an assumption which is checked).

Note

The need for this function is primarily from analysis of rates in epidemiology and demography, where the dataset are time-split records of follow-up, and the range of data therefore rarely is of any interest (let alone meaningful).

In Poisson modeling of rates based on time-split records one should aim at having the same number of events between knots, rather than the same number of observations.

Author(s)

Bendix Carstensen bxc@steno.dk, Lars Jorge D\'iaz, Steno Diabetes Center.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
require(splines)
require(stats)
require(graphics)

ns( women$height, df = 3)
Ns( women$height, knots=c(63,59,71,67) )

# Gives the same results as ns:
summary( lm(weight ~ ns(height, df = 3), data = women) )
summary( lm(weight ~ Ns(height, df = 3), data = women) )

# Get the diabetes data and set up as Lexis object
data(DMlate)
DMlate <- DMlate[sample(1:nrow(DMlate),500),]
dml <- Lexis( entry = list(Per=dodm, Age=dodm-dobth, DMdur=0 ),
               exit = list(Per=dox),
        exit.status = factor(!is.na(dodth),labels=c("DM","Dead")),
               data = DMlate )

# Split follow-up in 1-year age intervals
dms <- splitLexis( dml, time.scale="Age", breaks=0:100 )
summary( dms )

# Model  age-specific rates using Ns with 6 knots
# and period-specific RRs around 2000 with 4 knots
# with the same number of deaths between each pair of knots
n.kn <- 6
( a.kn <- with( subset(dms,lex.Xst=="Dead"),
                quantile( Age+lex.dur, probs=(1:n.kn-0.5)/n.kn ) ) )
n.kn <- 4
( p.kn <- with( subset( dms, lex.Xst=="Dead" ),
                quantile( Per+lex.dur, probs=(1:n.kn-0.5)/n.kn ) ) )
m1 <- glm( lex.Xst=="Dead" ~ Ns( Age, kn=a.kn ) +
                             Ns( Per, kn=p.kn, ref=2000 ),
           offset = log( lex.dur ),
           family = poisson,
             data = dms )

# Plot estimated age-mortality curve for the year 2005 and knots chosen:
nd <- data.frame( Age=seq(40,100,0.1), Per=2005, lex.dur=1000 )
par( mfrow=c(1,2) )
matplot( nd$Age, ci.pred( m1, newdata=nd ),
         type="l", lwd=c(3,1,1), lty=1, col="black", log="y",
         ylab="Mortality rates per 1000 PY", xlab="Age (years)", las=1, ylim=c(1,1000) )
rug( a.kn, lwd=2 )

# Clamped Age effect to the right of rightmost knot.
m1.c <- glm( lex.Xst=="Dead" ~ Ns( Age, kn=a.kn, fixsl=c(FALSE,TRUE) ) +
                               Ns( Per, kn=p.kn, ref=2000 ),
             offset = log( lex.dur ),
             family = poisson,
               data = dms )

# Plot estimated age-mortality curve for the year 2005 and knots chosen.
matplot( nd$Age, ci.pred( m1.c, newdata=nd ),
         type="l", lwd=c(3,1,1), lty=1, col="black", log="y",
         ylab="Mortality rates per 1000 PY", xlab="Age (years)", las=1, ylim=c(1,1000) )
rug( a.kn, lwd=2 )

par( mfrow=c(1,1) )

# Including a linear Age effect of 0.05 to the right of rightmost knot.
m1.l <- glm( lex.Xst=="Dead" ~ Ns( Age, kn=a.kn, fixsl=c(FALSE,TRUE) ) +
                               Ns( Per, kn=p.kn, ref=2000 ),
             offset = log( lex.dur ) + pmax( Age, max( a.kn ) ) * 0.05,
             family = poisson,
               data = dms )

# Plot estimated age-mortality curve for the year 2005 and knots chosen.
nd <- data.frame(Age=40:100,Per=2005,lex.dur=1000)
matplot( nd$Age, ci.pred( m1.l, newdata=nd ),
         type="l", lwd=c(3,1,1), lty=1, col="black", log="y",
         ylab="Mortality rates per 1000 PY", xlab="Age (years)", las=1, ylim=c(1,1000) )
rug( a.kn, lwd=2 )

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.