ci.cum: Compute cumulative sum of estimates.
In Epi: Statistical Analysis in Epidemiology

ci.cum

R Documentation

Compute cumulative sum of estimates.

Description

Computes the cumulative sum of parameter functions and the standard error of it. Used for computing the cumulative rate, or the survival function based on a glm with parametric baseline.

Usage

ci.cum( obj,
    ctr.mat = NULL,
     subset = NULL,
       intl = NULL,
      alpha = 0.05,
        Exp = TRUE,
     ci.Exp = FALSE,
     sample = FALSE )
ci.surv( obj,
    ctr.mat = NULL,
     subset = NULL,
       intl = NULL,
      alpha = 0.05,
        Exp = TRUE,
     sample = FALSE )

Arguments

`obj`	A model object (of class `lm`, `glm`.
`ctr.mat`	Matrix or data frame. If `ctr.mat` is a matrix, it should be a contrast matrix to be multiplied to the parameter vector, i.e. the desired linear function of the parameters. If it is a data frame it should have columns corresponding to a prediction data frame for `obj`, see details for `ci.lin`.
`subset`	Subset of the parameters of the model to which a matrix `ctr.mat` should be applied.
`intl`	Interval length for the cumulation. Either a constant or a numerical vector of length `nrow(ctr.mat)`. If omitted taken as the difference between the two first elments if the first column in `ctr.mat`, assuming that that holds the time scale. A note is issued in this case.
`alpha`	Significance level used when computing confidence limits.
`Exp`	Should the parameter function be exponentiated before it is cumulated?
`ci.Exp`	Should the confidence limits for the cumulative rate be computed on the log-scale, thus ensuring that exp(-cum.rate) is always in [0,1]?
`sample`	Should a sample of the original parameters be used to compute a cumulative rate?

Details

The purpose of ci.cum is to the compute cumulative rate (integrated intensity) at a set of points based on a model for the rates. ctr.mat is a matrix which, when premultiplied to the parameters of the model returns the (log)rates at a set of equidistant time points. If log-rates are returned from the prediction method for the model, the they should be exponentiated before cumulated, and the variances computed accordingly. Since the primary use is for log-linear Poisson models the Exp parameter defaults to TRUE.

Each row in the object supplied via ctr.mat is assumed to represent a midpoint of an interval. ci.cum will then return the cumulative rates at the end of these intervals. ci.surv will return the survival probability at the start of each of these intervals, assuming the the first interval starts at 0 - the first row of the result is c(1,1,1).

The ci.Exp argument ensures that the confidence intervals for the cumulative rates are always positive, so that exp(-cum.rate) is always in [0,1].

Value

A matrix with 3 columns: Estimate, lower and upper c.i. If sample is TRUE, a single sampled vector is returned, if sample is numeric a matrix with sample columns is returned, each column a cumulative rate based on a random sample from the distribution of the parameter estimates.

ci.surv returns a 3 column matrix with estimate, lower and upper confidence bounds for the survival function.

Author(s)

Bendix Carstensen, http://bendixcarstensen.com

Examples

# Packages required for this example
library( splines )
library( survival )
data( lung )
par( mfrow=c(1,2) )

# Plot the Kaplan-meier-estimator
plot( survfit( Surv( time, status==2 ) ~ 1, data=lung ) )

# Declare data as Lexis
lungL <- Lexis(exit = list(tfd=time),
               exit.status = (status == 2) * 1,
               data = lung)
summary(lungL)

# Split the follow-up every 10 days
sL <- splitLexis(lungL, "tfd", breaks=seq(0,1100,10))
summary(sL)

# Fit a Poisson model with a natural spline for the effect of time (left
# end points of intervals are used as covariate)
mp <- glm(cbind(lex.Xst == 1, lex.dur)
          ~ Ns(tfd,knots = c(0, 50, 100, 200, 400, 700)),
          family = poisreg,
            data = sL)

# mp is now a model for the rates along the time scale tfd
# prediction data frame for select time points on this time scale
nd <- data.frame(tfd = seq(5,995,10)) # *midpoints* of intervals
Lambda <- ci.cum ( mp, nd, intl=10 )
surv   <- ci.surv( mp, nd, intl=10 )

# Put the estimated survival function on top of the KM-estimator
# recall the ci.surv return the survival at *start* of intervals
matshade(nd$tfd - 5, surv, col = "Red", alpha = 0.15)

# Extract and plot the fitted intensity function
lambda <- ci.pred(mp, nd) * 365.25 # mortality 
matshade(nd$tfd, lambda, log = "y", ylim = c(0.2, 5), plot = TRUE,
          xlab = "Time since diagnosis",
          ylab = "Mortality per year")

# same thing works with gam from mgcv
library(mgcv)
mg <- gam(cbind(lex.Xst == 1, lex.dur) ~ s(tfd), family = poisreg, data=sL )
matshade(nd$tfd - 5, ci.surv(mg, nd, intl=10), plot=TRUE,
         xlab = "Days since diagnosis", ylab="P(survival)")
matshade(nd$tfd  , ci.pred(mg, nd) * 365.25, plot=TRUE, log="y",
         xlab = "Days since diagnosis", ylab="Mortality per 1 py")

Epi documentation built on April 3, 2025, 9:42 p.m.