eventdepenobs: SCCS with event-dependent observation periods


eventdepenobs R Documentation



One of the assumptions of the self-controlled case series (SCCS) model is that the observation period for each individual is independent of event times. If an event increases the risk of death, such as myocardial infarction or stroke, this assumption is violated. This function fits the modified SCCS model for use when the assumption is not satisfied, i.e. when ages at end of observation periods might depend on age at event, as outlined in Farrington et al. (2011).


eventdepenobs(formula, indiv, astart, aend, aevent, adrug, aedrug, censor,
              expogrp = list(), washout = list(), sameexpopar = list(), 
              agegrp = NULL, dataformat="stack", covariates=NULL, 
              regress=F, initval=rep(0.1, 7), data)



formula: a model formula. The dependent variable should always be "event", e.g. event ~ itp. If age effects are included in the model, the word 'age' must be used in the formula, e.g. event ~ itp + age.


indiv: a vector of individual identifiers of cases


astart: a vector of ages at which the observation periods start


aend: a vector of ages at end of observation periods


aevent: a vector of ages at event (one event per case)


adrug: a list of vectors of ages at start of exposures or a list of matrices if the exposures have multiple episodes (dataformat multi). Multiple exposures of the same type can be recorded as multiple rows (dataformat stack). One list item per exposure type.


aedrug: a list of vectors of ages at which exposure-related risk ends or a list of matrices if there are multiple episodes (repeat exposures in different columns) of the same exposure type. The dimension of each item of aedrug has to be equal to that of adrug, that is, aedrug should be given for each exposure in adrug.


censor: a vector of indicators for whether each observation period was censored (1 = observation period ended early, 0 = fully observed).


expogrp: a list of vectors of days to the start of exposure-related risk, counted from adrug. E.g. if the risk period is [adrug+c, aedrug], use expogrp = list(c) or expogrp = c. For multiple exposure types, expogrp is a list of vectors with the same length as the list adrug. The default is a list of zeros, where the exposure-related risk periods are [adrug, aedrug].


washout: a list of vectors of days to the start of washout periods, counted from aedrug. The number of vectors in the list is equal to the number of exposures, i.e. the length of the list adrug, and the order of the list corresponds to the order of exposures in adrug. The default is NULL (no washout periods).


sameexpopar: a vector of logical values. If TRUE (the default), no dose effect is assumed: the same exposure parameters are used for multiple doses/episodes of the same exposure type presented in dataformat 'multi'. If FALSE, different relative incidences are estimated for different doses/episodes of the same exposure type. The length of the vector is equal to the length of the list adrug.


agegrp: a vector of cut points for the age groups, where each value represents the start of an age category. The first element in the vector is the start of the second age group. The first age group starts at the minimum of astart, the start of the observation period. The default is NULL (i.e. no age effects included).


dataformat: the way the input data are assembled. It accepts "multi" or "stack" (the default), where "multi" refers to a data frame assembled with one row representing one event and "stack" refers to a data frame where repeated exposures of the same exposure type are stacked in one column. In the "multi" dataformat, different episodes of the same exposure type are recorded as separate columns in the data frame.


covariates: a list of covariates believed to affect the age at censoring (age at end of observation period), e.g. covariates = gender.


regress: logical; regress=T indicates that the parameters of the weight functions are regressed against age at event or age at start of observation. The default is regress=F.


initval: a vector of initial values used in fitting the weight functions. These are given in the order: 1. Log mean of the exponential component; 2. Intercept of the EG/EW log mean function; 3. Intercept of the EG/EW log shape function; 4. Intercept of the logit mixing probability function; 5. Regression parameter of the G/W log mean functions, if regress=T; 6. Regression parameter of the G/W log shape function, if regress=T; 7. Regression parameter of the G/W logit mixing probability function, if regress=T. When regress=F only the first 4 are used. The default initval values are 0.1.


data: a data frame containing the input data. The data should be in 'stack' or 'multi' format (see dataformat).
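To make the two data layouts and the risk-window arithmetic concrete, here is a minimal base-R sketch. It does not call eventdepenobs; the data frame, its column names (case, sta, end, event, expo) and all values are made up for illustration, and the inclusive day boundaries of the windows are an assumption about how cut points such as expogrp = c(0, 8) are read.

```r
# Synthetic data in 'stack' layout: one row per exposure episode,
# so case 1 (two episodes) occupies two rows. All values are made up.
stacked <- data.frame(
  case  = c(1, 1, 2),
  sta   = c(0, 0, 0),        # age (days) at start of observation
  end   = c(730, 730, 600),  # age (days) at end of observation
  event = c(400, 400, 250),  # age (days) at event, one event per case
  expo  = c(100, 300, 200)   # age (days) at start of each exposure episode
)

# Equivalent 'multi' layout: one row per case, episodes in separate columns.
stacked$episode <- ave(stacked$expo, stacked$case, FUN = seq_along)
multi <- reshape(stacked, idvar = "case", timevar = "episode",
                 v.names = "expo", direction = "wide", sep = "")

# Risk windows implied by cut points c(0, 8) and a risk period that ends
# 14 days after exposure (assumed inclusive): days 0-7 and days 8-14.
expogrp  <- c(0, 8)
risk_end <- 14
window_start <- outer(stacked$expo, expogrp, "+")
window_end   <- cbind(window_start[, -1, drop = FALSE] - 1,
                      stacked$expo + risk_end)
```

In the multi layout the second case has a missing value in the second episode column, since it has only one exposure episode.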


This model is suitable when the event increases the risk of death, such as myocardial infarction (MI) or stroke. It is not suitable when the event itself is death. Four models are fitted to the interval between the age at event and the age at end of observation; these are detailed in Section 5.4 of Farrington et al. (2011). The model with the lowest AIC is selected and used to estimate weights that replace interval lengths in the model formula. This modification yields unbiased estimates of the exposure effect, while age effects take on a different interpretation as they include the thinning effect of censoring.
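The AIC-based selection step can be sketched in base R as follows. This is only an illustration of the selection idea: the three candidate distributions (exponential, Weibull, gamma) are simple stand-ins chosen here, not the four weight models of Farrington et al. (2011), and the intervals are simulated.

```r
set.seed(1)
# Simulated intervals (days) from event to end of observation; made-up data.
gap <- rweibull(200, shape = 0.7, scale = 300)

# Negative log-likelihoods, parameterised on the log scale so that the
# optimiser works on unconstrained parameters.
nll <- list(
  exponential = function(p) -sum(dexp(gap, rate = exp(p[1]), log = TRUE)),
  weibull     = function(p) -sum(dweibull(gap, shape = exp(p[1]),
                                          scale = exp(p[2]), log = TRUE)),
  gamma       = function(p) -sum(dgamma(gap, shape = exp(p[1]),
                                        rate = exp(p[2]), log = TRUE))
)

# Starting values (hypothetical choices based on the sample mean).
start <- list(
  exponential = log(1 / mean(gap)),
  weibull     = c(0, log(mean(gap))),
  gamma       = c(0, log(1 / mean(gap)))
)

# AIC = 2 * (number of parameters) - 2 * (maximised log-likelihood).
aic <- sapply(names(nll), function(m) {
  fit <- optim(start[[m]], nll[[m]], method = "BFGS")
  2 * length(start[[m]]) + 2 * fit$value
})

# The candidate with the lowest AIC is the one that would be carried forward.
best <- names(which.min(aic))
```

In eventdepenobs the selected model is then used to compute the weights; that step is not reproduced here.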



exposure related relative incidence estimates along with their 95% confidence intervals, age related relative incidence estimates and estimates of interactions with covariates if there are any.


model fit of the 4 different weight functions and their AIC values.


Yonas Ghebremichael-Weldeselassie, Heather Whitaker, Paddy Farrington.


Farrington, C. P., Anaya-Izquierdo, A., Whitaker, H. J., Hocine, M. N., Douglas, I., and Smeeth, L. (2011). Self-controlled case series analysis with event-dependent observation periods. Journal of the American Statistical Association 106 (494), 417–426.

Farrington, P., Whitaker, H., and Ghebremichael-Weldeselassie, Y. (2018). Self-Controlled Case Series Studies: A Modelling Guide with R. Boca Raton: Chapman & Hall/CRC Press.


 # Nicotine replacement therapy and myocardial infarction (MI)
 # With no age effect included

 nrt.mod <- eventdepenobs(event~nrt, indiv=case, astart=nrt,
             aend=act, aevent=mi, adrug=nrt, aedrug=nrt+28,
             censor=cen, expogrp=c(0,8,15,22), agegrp=NULL,
 # Respiratory tract infections and MI
 # Age effect included
 # initial values provided and there are two risk periods
 uni <- (1-duplicated(midat$case))
 ageq <- floor(quantile(midat$mi[uni==1], seq(0.1,0.9,0.1), names=FALSE))  # age groups
 mi.mod <- eventdepenobs(event~rti+age, indiv=case, astart=sta,
                         aend=end, aevent=mi, adrug=rti, aedrug=rti+14,
                         expogrp=c(0,8), agegrp=ageq, censor=cen, data=midat,

SCCS documentation built on July 5, 2022, 5:05 p.m.