predSNS | R Documentation |
Predicts population and age-standardized (net) survival, along with the associated confidence intervals
predSNS(
model,
time.points,
newdata,
weight.table,
var.name,
var.model,
conf.int = 0.95,
method = "exact",
n.legendre = 50
)
model |
a fitted survPen model |
time.points |
vector of follow-up values |
newdata |
dataset containing the original age values used for fitting |
weight.table |
dataset containing the age classes used for standardization; must be in the same format as the elements of list.wicss |
var.name |
list containing one element: the name of the column in newdata that holds the age values. This element should be named after the age variable present in the model formula. Typically, if newdata contains an 'age' column while the model uses a centered age 'agec', the list should be: list(agec="age") |
var.model |
list containing one element: the function that computes the age variable used in the model formula from the original age. Typically, for age centered on 50, list(agec=function(age) age - 50) |
conf.int |
numeric value giving the confidence level of the confidence intervals; default is 0.95 |
method |
should be either 'exact' or 'approx'. The 'exact' method uses all age values in newdata for predictions. The 'approx' method uses either newdata$age (if age values are whole numbers) or floor(newdata$age) + 0.5 (if age values are not whole numbers) and then removes duplicates to reduce computational cost. |
n.legendre |
number of nodes to approximate the cumulative hazard by Gauss-Legendre quadrature; default is 50 |
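As a rough illustration of what n.legendre controls, the sketch below approximates a cumulative hazard by Gauss-Legendre quadrature in base R. It is not survPen's internal code: the helpers gauss_legendre and cum_hazard and the Weibull-type toy hazard are hypothetical names and values chosen for this example, and the nodes are obtained with the Golub-Welsch eigenvalue method.

```r
# Gauss-Legendre nodes and weights on [-1, 1] (Golub-Welsch algorithm).
# Hypothetical helper, not part of survPen.
gauss_legendre <- function(n) {
  k <- seq_len(n - 1)
  b <- k / sqrt(4 * k^2 - 1)          # off-diagonal of the Jacobi matrix
  J <- matrix(0, n, n)
  J[cbind(k, k + 1)] <- b
  J[cbind(k + 1, k)] <- b
  e <- eigen(J, symmetric = TRUE)
  ord <- order(e$values)
  list(nodes = e$values[ord], weights = 2 * e$vectors[1, ord]^2)
}

# Cumulative hazard H(t) = integral of h(u) over [0, t],
# after mapping the nodes from [-1, 1] to [0, t].
cum_hazard <- function(h, t, n.legendre = 50) {
  gl <- gauss_legendre(n.legendre)
  u <- t / 2 * gl$nodes + t / 2
  t / 2 * sum(gl$weights * h(u))
}

# Toy hazard (assumption): h(u) = 0.15 * sqrt(u), so that H(t) = 0.1 * t^1.5
h_toy <- function(u) 0.15 * sqrt(u)
H_quad  <- cum_hazard(h_toy, t = 5, n.legendre = 50)
H_exact <- 0.1 * 5^1.5
S_quad  <- exp(-H_quad)               # survival implied by the cumulative hazard
```

Comparing H_quad with H_exact for a few values of n.legendre gives a feel for the trade-off between accuracy and the number of hazard evaluations.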
The weight table used should always be in the same format as the elements of list.wicss.
Only age-standardization is possible for now. All other variables necessary for model predictions should be fixed to a single value.
For simplicity, in what follows we will consider that survival only depends on time and age.
List of ten elements
class.table |
Number of individuals in each age class |
SNS |
Vector of predicted age-standardized (net) survival |
SNS.inf |
Lower bound of confidence intervals associated with predicted age-standardized (net) survival |
SNS.sup |
Upper bound of confidence intervals associated with predicted age-standardized (net) survival |
PNS |
Vector of predicted population (net) survival |
PNS.inf |
Lower bound of confidence intervals associated with predicted population (net) survival |
PNS.sup |
Upper bound of confidence intervals associated with predicted population (net) survival |
PNS_per_class |
matrix of predicted population (net) survival in each age class |
PNS_per_class.inf |
Lower bound of confidence intervals associated with predicted population (net) survival in each age class |
PNS_per_class.sup |
Upper bound of confidence intervals associated with predicted population (net) survival in each age class |
For a given group of n individuals, PNS at time t is defined as
PNS(t)=\sum_i 1/n*S_i(t,a_i)
where a_i is the age of individual i.
SNS at time t is defined as
SNS(t)=\sum_i w_i*S_i(t,a_i)
where a_i is the age of individual i and w_i=w_{ref j(i)}/n_{j(i)}.
Here j(i) is the age class of individual i, w_{ref j(i)} is the weight of age class j(i) in the reference population (it corresponds to weight.table$AgeWeights), and n_{j(i)} is the total number of individuals present in age class j(i).
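The two definitions above can be sketched in a few lines of base R. This is a minimal illustration, not the code used by predSNS: the survival function surv_toy, the age classes in breaks and the reference weights w_ref are made-up values (they are not the weights stored in list.wicss).

```r
# Toy survival function S(t, a) (assumption): hazard increasing with age
surv_toy <- function(t, age) exp(-0.02 * exp(0.03 * (age - 50)) * t)

ages   <- c(42, 47, 58, 63, 66, 71, 80)
# Hypothetical age classes [0,55), [55,65), [65,75), [75,Inf) and reference weights
breaks <- c(0, 55, 65, 75, Inf)
w_ref  <- c(0.30, 0.30, 0.25, 0.15)

age_class <- cut(ages, breaks = breaks, right = FALSE, labels = FALSE)  # j(i)
n_class   <- tabulate(age_class, nbins = length(w_ref))                 # n_j
w_i       <- w_ref[age_class] / n_class[age_class]                      # w_i

t_pred <- 5
S_i <- surv_toy(t_pred, ages)
PNS <- sum(S_i / length(ages))   # every individual has weight 1/n
SNS <- sum(w_i * S_i)            # individuals are reweighted to the reference population
```

When every age class contains at least one individual, the weights w_i sum to one, so SNS is a weighted average of the individual survival probabilities.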
For large datasets, SNS calculation is computationally heavy. To reduce the cost, individuals with similar age values are grouped together. Using floor(age) + 0.5 instead of age gives a substantial gain for a minimal prediction error (method="approx" will give slightly different predictions compared to method="exact"). If the provided age values are whole numbers, they are used directly for grouping and there is no prediction error (method="approx" and method="exact" will give exactly the same predictions). SNS is then calculated as
SNS(t)=\sum_a \tilde{w}_a*S(t,a)
The sum is here calculated over all possible values of age instead of all individuals.
We have \tilde{w}_a=n_a*w_{ref j(a)}/n_{j(a)}, where j(a) is the age class of age a and n_a is the number of individuals with age a.
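The grouping idea can be illustrated with the following base-R sketch, which compares the individual-level sum with the grouped sum. It only mimics the description above and is not taken from the predSNS source: surv_toy, breaks and w_ref are hypothetical, and the age classes are assumed to have whole-number boundaries so that floor(age) + 0.5 stays in the same class as age.

```r
# Toy survival function and hypothetical age classes / reference weights
surv_toy <- function(t, age) exp(-0.02 * exp(0.03 * (age - 50)) * t)
ages   <- c(42.3, 42.9, 47.1, 58.6, 58.2, 63.0, 66.4, 71.5, 80.8)
breaks <- c(0, 55, 65, 75, Inf)
w_ref  <- c(0.30, 0.30, 0.25, 0.15)
class_of <- function(a) cut(a, breaks = breaks, right = FALSE, labels = FALSE)
n_class  <- tabulate(class_of(ages), nbins = length(w_ref))   # n_j
t_pred   <- 5

# "exact": one survival evaluation per individual
j_i <- class_of(ages)
SNS_exact <- sum(w_ref[j_i] / n_class[j_i] * surv_toy(t_pred, ages))

# "approx": group ages, then one survival evaluation per distinct grouped age
whole   <- all(ages == floor(ages))
age_grp <- if (whole) ages else floor(ages) + 0.5
n_a     <- table(age_grp)                          # n_a
a_vals  <- as.numeric(names(n_a))                  # distinct grouped ages
j_a     <- class_of(a_vals)                        # j(a)
w_tilde <- as.numeric(n_a) * w_ref[j_a] / n_class[j_a]
SNS_approx <- sum(w_tilde * surv_toy(t_pred, a_vals))
```

Here nine individuals collapse to seven distinct grouped ages; on real data with many individuals per year of age the reduction in survival evaluations is much larger.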
Confidence intervals for SNS are derived assuming normality of log(-log(SNS)). Lower and upper bounds are given by
CI_{95\%}(SNS)=[SNS^{\exp(1.96*\sqrt{Var(\log(Delta_{SNS}))})};SNS^{\exp(-1.96*\sqrt{Var(\log(Delta_{SNS}))})}]
with Delta_{SNS}=-\log(SNS).
Var(\log(Delta_{SNS})) is derived by the Delta method.
Confidence intervals for PNS are derived in the exact same way.
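As a small worked example of the back-transformation, a normal interval log(-log(S)) ± z*sd on the log-log scale maps to S^exp(z*sd) and S^exp(-z*sd) on the survival scale. The sketch below assumes the standard error on the log(-log) scale is already available; loglog_ci is a hypothetical helper, and the values 0.6 and 0.1 are arbitrary.

```r
# Confidence interval for a survival probability S, assuming normality of log(-log(S)).
# sd_loglog is the standard error of log(-log(S)) (e.g. obtained by the Delta method).
loglog_ci <- function(S, sd_loglog, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)        # about 1.96 for level = 0.95
  list(lower = S^exp(z * sd_loglog),     # larger exponent gives the lower bound, since S < 1
       upper = S^exp(-z * sd_loglog))
}

ci <- loglog_ci(S = 0.6, sd_loglog = 0.1)

# Same bounds obtained step by step on the log(-log) scale
z <- qnorm(0.975)
lower_check <- exp(-exp(log(-log(0.6)) + z * 0.1))
upper_check <- exp(-exp(log(-log(0.6)) - z * 0.1))
```

A practical benefit of working on this scale is that both bounds always stay between 0 and 1.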
Corazziari, I., Quinn, M., & Capocaccia, R. (2004). Standard cancer patient population for age standardising survival ratios. European journal of cancer (Oxford, England : 1990), 40(15), 2307–2316. https://doi.org/10.1016/j.ejca.2004.07.002.
library(survPen)
data(datCancer)
data(list.wicss)
don <- datCancer
don$agec <- don$age - 50 # using centered age for modelling
#-------------------- model with time and age
knots.t <- quantile(don$fu[don$dead==1], probs=seq(0,1,length=6)) # knots for time
knots.agec <- quantile(don$agec[don$dead==1], probs=seq(0,1,length=5)) # knots for age
formula <- as.formula(~tensor(fu, agec, df=c(length(knots.t), length(knots.agec)),
                              knots=list(fu=knots.t, age=knots.agec)))
mod <- survPen(formula, data=don, t1=fu, event=dead, n.legendre=20, expected=rate)
#-------------------- age classes and associated weights for age-standardized
# net survival prediction
# weights of type 1
wicss <- list.wicss[["1"]]
# to estimate population net survival, a prediction data frame
# is needed. It should contain the original (uncentered) age values
pred.pop <- data.frame(age=don$age)
#-------------------- prediction: age-standardized net survival and population net survival
pred <- predSNS(mod, time.points=seq(0,5,by=0.1), newdata=pred.pop,
                weight.table=wicss, var.name=list(agec="age"),
                var.model=list(agec=function(age) age - 50), method="approx")