linearERR: Fit linear ERR model and perform jackknife correction


View source: R/linearERR.R

Description

Fits the linear ERR model on matched case-control data and performs first and second order jackknife correction

Usage

linearERR(
  data,
  doses,
  set,
  status,
  loc,
  corrvars = NULL,
  ccmethod = "CCAL",
  repar = FALSE,
  initpars = rep(0, length(doses) + length(corrvars)),
  fitopt = NULL,
  uplimBeta = 5,
  profCI = TRUE,
  doJK1 = FALSE,
  doJK2 = FALSE,
  jkscorethresh = 0.01,
  jkvalrange = c(-Inf, Inf)
)

Arguments

data

data frame containing matched case-control data, with a number of columns for doses to different locations, a column containing matched set numbers, a column containing the case's tumor location (value between 1 and the number of locations, with location x corresponding to the x-th column index in doses) and a column serving as a case-control indicator. Other covariates can also be included; in that case, a parameter is estimated for each covariate column, so factor variables need to be converted to dummy variables using model.matrix. If using ccmethod='meandose', a column for tumor location is still required, but it can be a vector of ones.

doses

vector containing the indices of columns containing dose information.

set

column index containing matched set numbers.

status

column index containing case status.

loc

column index containing the location of the second tumor of the case in each matched set.

corrvars

vector containing the indices of columns containing variables to be corrected for.

ccmethod

choice of method of analysis: one of meandose, CCML, CCAL or CL. Defaults to CCAL

repar

reparametrize to β=exp(ξ)? Defaults to FALSE

initpars

initial values for parameters, default is 0 for all parameters. If supplying a different vector, use a vector with an initial value for β or ξ, one for all of the other location effects and one for each other covariate (in that order). Note that if repar=TRUE, the initial value is used for ξ.

fitopt

list with options to pass to control argument of optimizer (see details)

uplimBeta

upper limit for β=exp(ξ), default value 5. This is used for constraining the MLE estimation in some settings and for the jackknife inclusion criteria, and can be infinite except when Brent optimization is used (see details)

profCI

boolean: compute 95% profile likelihood confidence interval for β/ξ? Default value TRUE.

doJK1

perform first order jackknife correction? Automatically set to TRUE when doJK2=TRUE. Caution: this can take a long time to run. Default value FALSE

doJK2

perform second order jackknife correction? Caution: this can take a very long time to run. Default value FALSE

jkscorethresh

threshold on the squared L2 norm of the score: leave-one-out and leave-two-out estimates are included in the computation of the first and second order jackknife corrected estimate, respectively, only if they meet this threshold. Default value 0.01

jkvalrange

range of leave-one-out and leave-two-out beta/xi estimates to be allowed in the computation of the first and second order jackknife corrected estimate, respectively
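The description of the data argument asks for factor covariates to be converted to dummy variables with model.matrix. A minimal sketch of that step follows; the toy data frame and its column names (set, status, stage) are hypothetical and are not part of the package.

```r
# Hypothetical toy data: one row per subject, with a three-level factor covariate.
toy <- data.frame(
  set    = c(1, 1, 2, 2),
  status = c(1, 0, 1, 0),
  stage  = factor(c("I", "II", "III", "I"))
)

# model.matrix() expands the factor into 0/1 dummy columns;
# dropping the intercept column leaves one column per non-reference level.
dummies <- model.matrix(~ stage, data = toy)[, -1, drop = FALSE]

# Replace the factor by its dummy columns and look up their column indices,
# which is the form the corrvars argument expects.
toy2 <- cbind(toy[, c("set", "status")], dummies)
corrvars <- which(colnames(toy2) %in% colnames(dummies))
```

Each dummy column then receives its own parameter when its index is passed in corrvars.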

Details

This is the main function of the package, used for fitting the linear ERR model in matched case-control data. Use this function to estimate the MLE (including a profile likelihood confidence interval for the dose effect) and to perform first and second order jackknife corrections.

The model being fit is HR=∑(1+β d_l)exp(α_l+X^Tγ), where the sum is over organ locations. Here β is the dose effect, α are the location effects and γ are other covariate effects. The model can be reparametrized to HR=∑(1+exp(ξ) d_l)exp(α_l+X^Tγ) using repar=TRUE. In the original parametrization, β is constrained such that HR cannot be negative. There are different choices for the design used to estimate the parameters: mean organ dose, CCML, CL, and CCAL. Mean organ dose (ccmethod='meandose') uses the mean of the supplied location doses and compares that mean dose between case and matched controls. The other choices (CCML, CL and CCAL) use the tumor location for the case and compare either only between patients (CCML), only within patients (CL) or both between and within patients (CCAL). CCML only compares the same location between patients, and hence cannot be used to estimate location effects. Similarly, CL compares within patients and cannot be used to estimate covariate effects other than dose, meaning corrvars should not be supplied for CL.
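To make the model formula concrete, the sketch below evaluates HR for a single subject in both parametrizations. All numeric values are hypothetical, the helper err_hr is not a package function, and treating the first location as the reference (α_1 = 0) is an assumption made for illustration.

```r
# Hypothetical values, chosen only for illustration.
d     <- c(0.5, 2.0, 1.0)   # doses d_l to three locations
alpha <- c(0, 0.3, -0.2)    # location effects; first location as reference (assumption)
x     <- c(1, 0)            # covariate values X
gamma <- c(0.4, -0.1)       # covariate effects
beta  <- 0.8                # dose effect

# HR = sum over locations l of (1 + beta * d_l) * exp(alpha_l + X^T gamma)
err_hr <- function(beta, d, alpha, x, gamma) {
  sum((1 + beta * d) * exp(alpha + sum(x * gamma)))
}

hr <- err_hr(beta, d, alpha, x, gamma)

# With repar = TRUE the same model is written in terms of xi = log(beta),
# so beta = exp(xi) stays positive without an explicit constraint.
xi <- log(beta)
hr_repar <- err_hr(exp(xi), d, alpha, x, gamma)
```

Both calls return the same value, since the reparametrization changes only how β is expressed, not the model itself.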

For one-dimensional models (i.e., mean dose or CCML without additional covariates), the Brent algorithm is used with a search interval (-10,log(uplimBeta)) when repar=TRUE and (L,uplimBeta) otherwise, where L is determined by the positivity constraint for HR. For other optimizations, the L-BFGS-B algorithm (with constraint uplimBeta) is used when repar=FALSE, and the unconstrained Nelder-Mead is used when repar=TRUE. For details, including the settings accepted by fitopt, refer to the function optim. Note that when supplying ndeps to fitopt, a value needs to be specified for every free parameter in the model. For more flexibility in optimization, use linERRloglik and optimize directly.
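The optimizer choices described above map onto the method argument of optim. The sketch below uses toy quadratic objectives in place of the package's negative log-likelihood; nll1, nll2 and the lower bound -0.5 are hypothetical stand-ins, and only the optim calls themselves are standard R.

```r
uplimBeta <- 5

# Toy one-parameter objective standing in for a negative log-likelihood.
nll1 <- function(b) (b - 1)^2

# Brent needs a finite search interval, hence a finite upper limit.
fit1 <- optim(par = 0, fn = nll1, method = "Brent",
              lower = -0.5, upper = uplimBeta)

# Toy two-parameter objective; the first parameter plays the role of beta.
nll2 <- function(p) (p[1] - 7)^2 + (p[2] + 1)^2

# L-BFGS-B with an upper bound on the first parameter only.
# ndeps holds one finite-difference step size per free parameter.
fit2 <- optim(par = c(0, 0), fn = nll2, method = "L-BFGS-B",
              upper = c(uplimBeta, Inf),
              control = list(ndeps = c(1e-4, 1e-4), maxit = 500))
```

In the second fit the unconstrained minimum lies above the bound, so the first parameter ends up at uplimBeta. A control list of the same shape as the one shown is what fitopt is meant to carry.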

The jackknife procedure allows for filtering of the leave-one-out and leave-two-out estimates, which is important as the model can be unstable and produce extreme estimates. All estimates reaching the maximum number of iterations are excluded, as well as estimates larger than uplimBeta (if applicable). Further, the user can set a threshold for the squared L2 norm of the score for an estimate (default 0.01), as well as an allowed value range for the β/ξ estimate itself. When the jackknife is run, the output object contains an element details, allowing the user to inspect the produced leave-one-out and leave-two-out estimates.
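As an illustration of filtered leave-one-out estimation, the sketch below applies the textbook first order jackknife to a toy estimator with known bias (the plug-in variance). It is not the package's implementation: the estimator, the valrange filter and the use of the number of included estimates in the correction are assumptions made for this example.

```r
set.seed(1)
x <- rnorm(20)
n <- length(x)

# Toy estimator with known bias: the plug-in variance (divides by n).
est <- function(v) mean((v - mean(v))^2)
theta_hat <- est(x)

# Leave-one-out estimates, stored in a data frame similar to 'details'.
details <- data.frame(set  = seq_len(n),
                      coef = sapply(seq_len(n), function(i) est(x[-i])))

# Inclusion filter on the estimate itself (hypothetical allowed range).
valrange <- c(-Inf, Inf)
details$included <- as.integer(details$coef >= valrange[1] &
                               details$coef <= valrange[2])

# Textbook first order jackknife, using only the included estimates.
keep <- details$included == 1
m <- sum(keep)
theta_jk1 <- m * theta_hat - (m - 1) * mean(details$coef[keep])
```

With every leave-one-out estimate included, the corrected value coincides with the unbiased sample variance var(x), which makes the bias removal easy to check. Narrowing valrange shows how excluded rows change the corrected estimate.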

Value

Object with components MLE and jackknife. MLE has components:

coef

estimated model coefficients

sd

estimated standard deviation for all coefficient estimates

vcov

variance-covariance matrix for all estimates

score

score evaluated at the MLE

convergence

convergence code produced by the optimizer (for details refer to optim)

message

convergence message produced by the optimizer

dosepval

p-value for the LRT comparing the produced model with a model without dose effect. Note that the null model this is based on uses the same optimization algorithm used for the MLE, meaning one-dimensional Nelder-Mead is used when repar=TRUE and the full model has 2 free parameters (see details)

profCI

the 95% profile likelihood confidence interval. In some cases one or both of the bounds of the CI cannot be obtained automatically. In that case, it is possible to use the proflik function that is an output of linearERRfit directly. Note: the same optimization algorithm that was used for the MLE will be used, even if this model only has one parameter (see details)

fitobj

Fit object produced by linearERRfit
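The dosepval component is described as the p-value of a likelihood ratio test against a model without dose effect. A minimal sketch of such a computation follows; the log-likelihood values are hypothetical, and the chi-squared reference distribution with 1 degree of freedom is the usual choice for dropping a single parameter rather than something confirmed from the package source.

```r
# Hypothetical maximized log-likelihoods, for illustration only.
loglik_full <- -101.3   # model with dose effect
loglik_null <- -104.1   # model without dose effect

# Likelihood ratio statistic, referred to a chi-squared distribution
# with 1 degree of freedom (one parameter, the dose effect, is removed).
lrt_stat <- 2 * (loglik_full - loglik_null)
dosepval <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)
```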

jackknife has components firstorder and secondorder. Both of these have components:

coef

the jackknife-corrected coefficient estimates

details

data frame with information on leave-one-out or leave-two-out estimates, with columns:

  • set or set1 and set2, the left-out set(s)

  • included, a 0/1 variable indicating whether this row was used to produce the corrected estimate

  • conv, convergence code for each model produced by the optimizer

  • coef, the leave-one-out or leave-two-out coefficient estimates

  • score, the score evaluated at the leave-one-out or leave-two-out estimate

Note that the details for the second order jackknife only include leave-two-out estimates. To access leave-one-out estimates, use details for the first order jackknife.

Examples

library(linearERRfit)
data(linearERRdata1)

# Fit the CCML design and request the first order jackknife correction
fitCCML <- linearERR(data=linearERRdata1, set=1, doses=2:6, status=8,
                     loc=7, corrvars=9, repar=FALSE, ccmethod="CCML", doJK1=TRUE)

fitCCML$MLE$coef
fitCCML$jackknife$firstorder$coef

sanderroberti/linearERRfit documentation built on Nov. 8, 2021, 12:23 a.m.