linearERRfirth: Derive the Firth-corrected estimate for the linear ERR model



Description

Finds roots of the Firth-corrected score equations for the linear ERR model, using data from a matched case-control study.

Usage

linearERRfirth(
  data,
  doses,
  set,
  status,
  loc,
  corrvars = NULL,
  repar = FALSE,
  ccmethod = "CCAL",
  initpars = NULL,
  lowerlim = NULL,
  upperlim = NULL,
  fitopt = list(maxit = 5000)
)

Arguments

data

data frame containing matched case-control data, with a number of columns for doses to different locations, a column containing matched set numbers, a column containing the case's tumor location (a value between 1 and the number of locations, with location x corresponding to the x-th column index in doses) and a column serving as a case-control indicator. Other covariates can also be included; in that case, a parameter is estimated for each covariate column. Factor variables therefore need to be converted to dummy variables using model.matrix. If using ccmethod='meandose', a column for tumor location is still required, but it can be a vector of ones.

doses

vector containing the indices of columns containing dose information.

set

column index containing matched set numbers.

status

column index containing case status.

loc

column index containing the location of the second tumor of the matched set's case.

corrvars

vector containing the indices of columns containing variables to be corrected for. Not used with ccmethod='CL'.

repar

reparametrize to β=exp(ξ)? It is recommended to reparametrize when using CL or CCAL, or when using additional covariates. Defaults to FALSE.

ccmethod

choice of method of analysis: one of 'meandose', 'CCML', 'CCAL' or 'CL'. Defaults to 'CCAL'.

initpars

initial values for parameters; the default is 0 for all parameters. If supplying a different vector, it must contain an initial value for every free parameter, in this order: β or ξ, one value for each location effect (except the reference) when using CL or CCAL, and one value for each other covariate if applicable. Note that if repar=TRUE, the first initial value is used for ξ.

lowerlim

lower bound for model parameters, in the same order as initpars. At least one upper or lower limit needs to be finite. Note that when repar=TRUE, the first entry is the lower limit for ξ. When repar=FALSE, the lower limit for β cannot be smaller than -1/max(d), where the maximum is taken over all relevant doses for the chosen ccmethod. If a smaller limit is supplied, it is automatically changed to that value.

upperlim

upper bound for model parameters, in the same order as initpars. At least one upper or lower limit needs to be finite. Note that when repar=TRUE, the first entry is the upper limit for ξ. When repar=TRUE, if no other lower or upper limit is given as input, an upper limit of log(5) will be used for ξ.

fitopt

list of options passed to the control argument of the optimizer (see Details).
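The data argument asks for factor variables to be converted to dummy variables with model.matrix. A minimal sketch of that step is given here, using a made-up toy data frame; the column names (set, status, smoke) and the helper objects dummies, dat2 and corr_idx are hypothetical and only serve as illustration.

```r
# Hypothetical toy data; column names are illustrative only
dat <- data.frame(
  set    = c(1, 1, 2, 2),
  status = c(1, 0, 1, 0),
  smoke  = factor(c("never", "former", "current", "never"),
                  levels = c("never", "former", "current"))
)

# model.matrix() returns an intercept column plus one dummy column per
# non-reference level; the intercept column is dropped here
dummies <- model.matrix(~ smoke, data = dat)[, -1, drop = FALSE]

# Replace the factor column by its dummy columns
dat2 <- cbind(dat[, c("set", "status")], dummies)

# Column indices of the dummy variables, in the form expected by corrvars
corr_idx <- which(colnames(dat2) %in% colnames(dummies))
corr_idx
```

The resulting indices could then be supplied as corrvars, together with the dose, set, status and location columns of the real data.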

Details

This function looks for roots of the Firth-corrected score functions.

The underlying model is HR=∑(1+β d_l)exp(α_l+X^Tγ), where the sum is over organ locations. Here β is the dose effect, α are the location effects and γ are other covariate effects. The model can be reparametrized to HR=∑(1+exp(ξ) d_l)exp(α_l+X^Tγ) using repar=TRUE. In the original parametrization, β is constrained such that HR cannot be negative. There are different choices for the design used to estimate the parameters: mean organ dose, CCML, CL, and CCAL. Mean organ dose (ccmethod='meandose') uses the mean of the supplied location doses and compares that mean dose between case and matched controls. The other choices (CCML, CL and CCAL) use the tumor location for the case and compare either only between patients (CCML), only within patients (CL) or both between and within patients (CCAL). CCML only compares the same location between patients, and hence cannot be used to estimate location effects. Similarly, CL compares within patients and cannot be used to estimate covariate effects other than dose, meaning corrvars should not be supplied for CL. For this model, the Firth correction (Firth 1993) is used as a method for bias correction, or for obtaining an estimate when there is separation in the data.
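The positivity constraint on β and the effect of the reparametrization can be illustrated with a few lines of base R. This is a sketch for a single hypothetical patient with three location doses; the dose values and the helper functions rr and rr_xi are assumptions for illustration and are not part of the package.

```r
# Hypothetical doses for one patient at three locations
d <- c(0.5, 2, 4)

# Original parametrization: the dose term 1 + beta * d must stay positive
# for every dose, which gives the lower bound -1 / max(d) for beta
beta_min <- -1 / max(d)
rr <- function(beta) 1 + beta * d

rr(-0.2)   # all dose terms positive: beta is above the bound
rr(-0.3)   # dose term negative at d = 4: beta is below the bound

# Reparametrization beta = exp(xi): the dose term is positive for any real xi,
# at the price of restricting beta to positive values
rr_xi <- function(xi) 1 + exp(xi) * d
rr_xi(-5)
rr_xi(log(5))
```

With repar=TRUE no lower bound on ξ is needed for positivity, which is why the bound -1/max(d) only applies when repar=FALSE.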

To avoid using unstable multidimensional root finders, this function minimizes the squared L2 norm of the modified score instead. This is done using the optim function. If desired, it is possible to use linERRscore and optimize or search for roots directly. For one-dimensional models (i.e., mean dose or CCML without additional covariates), the Brent algorithm is used with the user-supplied search interval (lowerlim, upperlim). Note that the choice of search interval is crucial, as it determines convergence. For this reason, there is no default setting in this case. For other optimizations, the L-BFGS-B algorithm (with constraints lowerlim and upperlim) is used. For details, including the fitopt settings, refer to the function optim. When repar=FALSE, if the lower bound for β is set too small, it is automatically changed according to the positivity constraint for HR.
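The idea of finding a root by minimizing the squared L2 norm of a score function can be sketched with optim on toy functions. The functions score1 and score2 below are invented stand-ins with known roots; they are not the modified score of the linear ERR model, and the bounds are arbitrary.

```r
# One-dimensional toy score with a root at 2 (NOT the linear ERR score)
score1 <- function(b) b^3 - 8

# Brent's method needs a finite search interval (lower, upper)
fit1 <- optim(par = 1, fn = function(b) score1(b)^2,
              method = "Brent", lower = 0, upper = 5)

# Two-dimensional toy score with a root at (1, 2)
score2 <- function(p) c(p[1] - 1, p[1] + p[2] - 3)

# Objective: squared L2 norm of the score, which is zero exactly at a root
obj2 <- function(p) sum(score2(p)^2)

# L-BFGS-B with box constraints; control plays the role of fitopt
fit2 <- optim(par = c(0, 0), fn = obj2, method = "L-BFGS-B",
              lower = c(-5, -5), upper = c(5, 5),
              control = list(maxit = 5000))

fit1$par          # estimated root of score1
fit2$par          # estimated root of score2
fit2$value        # objective at the solution; near 0 if a root was found
fit2$convergence  # 0 indicates that optim reports successful convergence
```

Because the objective is a squared norm, a reported minimum is only a root if the objective value is close to zero, so it is worth checking the value component in addition to convergence.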

It is advisable to interpret the results with caution. It was found that the modified score function sometimes has multiple roots, which makes setting initial values and search intervals crucial. It is recommended to try different settings for these inputs. Further, it seemed that reparametrizing improved the performance for multidimensional models.
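Trying different initial values can be organized as a simple multi-start loop. The sketch below uses an invented one-dimensional score with two roots (at 1 and 3) to show how different starting points may lead to different solutions; the function, starting values and bounds are assumptions for illustration only.

```r
# Toy score with two roots, at 1 and 3 (NOT the linear ERR score)
score <- function(b) (b - 1) * (b - 3)
obj <- function(b) score(b)^2

# Several starting values inside the search region
starts <- c(0, 1.5, 2.5, 4)

fits <- lapply(starts, function(s) {
  optim(par = s, fn = obj, method = "L-BFGS-B", lower = -1, upper = 5)
})

# Collect the solution and objective value reached from each start
res <- data.frame(
  start = starts,
  root  = sapply(fits, function(f) f$par),
  value = sapply(fits, function(f) f$value)
)
res

# Distinct solutions found across the starting values
unique(round(res$root, 2))
```

In an actual analysis the same pattern could be applied by varying initpars, lowerlim and upperlim and comparing the resulting fits.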

Value

optim object with fit results.

References

David Firth, Bias reduction of maximum likelihood estimates, Biometrika, Volume 80, Issue 1, March 1993, Pages 27–38, https://doi.org/10.1093/biomet/80.1.27

Examples

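library(linearERRfit)
data(linearERRdata1)

# Ordinary maximum likelihood fit, used here to supply starting values
fitMLE <- linearERR(data = linearERRdata1, doses = 2:6, set = 1, status = 8,
                    loc = 7, corrvars = 9, repar = TRUE, ccmethod = "CCAL",
                    profCI = FALSE)

# Firth-corrected fit, started from the MLE
fitfirth <- linearERRfirth(data = linearERRdata1, doses = 2:6, set = 1,
                           status = 8, loc = 7, corrvars = 9, repar = TRUE,
                           ccmethod = "CCAL", initpars = fitMLE$MLE$coef)

# Compare the two sets of estimates side by side
data.frame(MLE = fitMLE$MLE$coef, Firth = fitfirth$par)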

sanderroberti/linearERRfit documentation built on Nov. 8, 2021, 12:23 a.m.