qgcomp.cch.noboot: Quantile g-computation for survival outcomes in a case-cohort...
In qgcomp: Quantile G-Computation

qgcomp.cch.noboot

R Documentation

Quantile g-computation for survival outcomes in a case-cohort design under linearity/additivity

Description

This function performs quantile g-computation in a survival setting with case-cohort sampling. The approach estimates the covariate-conditional hazard ratio for a joint change of 1 quantile in each exposure variable specified in the expnms parameter.

Usage

qgcomp.cch.noboot(
  f,
  data,
  subcoh = NULL,
  id = NULL,
  cohort.size = NULL,
  expnms = NULL,
  q = 4,
  breaks = NULL,
  weights,
  cluster = NULL,
  alpha = 0.05,
  ...
)

Arguments

`f`	R style survival formula, which includes `Surv` in the outcome definition. E.g. `Surv(time,event) ~ exposure`. Offset terms can be included via `Surv(time,event) ~ exposure + offset(z)`
`data`	data frame
`subcoh`	(From `cch` help) Vector of indicators for subjects sampled as part of the sub-cohort. Code 1 or TRUE for members of the sub-cohort, 0 or FALSE for others. If data is a data frame then subcoh may be a one-sided formula.
`id`	(From `cch` help) Vector of unique identifiers, or formula specifying such a vector.
`cohort.size`	(From `cch` help) Vector with the size of each stratum of the original cohort from which the subcohort was sampled
`expnms`	character vector of exposures of interest
`q`	NULL or number of quantiles used to create quantile indicator variables representing the exposure variables. If NULL, then gcomp proceeds with un-transformed version of exposures in the input datasets (useful if data are already transformed, or for performing standard g-computation)
`breaks`	(optional) NULL, or a list of (equal length) numeric vectors that characterize the minimum value of each category for which to break up the variables named in expnms. This is an alternative to using 'q' to define cutpoints. See examples for how you might use this in case-cohort studies.
`weights`	Not used here (argument will be ignored)
`cluster`	Not used here (argument will be ignored)
`alpha`	alpha level for confidence limit calculation
`...`	arguments to `cch` (e.g. robust, method, stratum - see examples)
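
To make the roles of `q` and `breaks` concrete, the sketch below scores one exposure into quantile categories using base R only. It illustrates the general idea and is not the package's internal code; the simulated data and the names `x`, `brks`, and `xq` are assumptions made for this example.

```r
set.seed(1)
x <- runif(200)   # hypothetical continuous exposure
q <- 4            # number of quantile categories

# cutpoints at the 0%, 25%, 50%, 75% and 100% sample quantiles
brks <- quantile(x, probs = seq(0, 1, length.out = q + 1))

# integer scores 0, 1, ..., q - 1, so a one-unit increase means "one quantile"
xq <- cut(x, breaks = brks, include.lowest = TRUE, labels = FALSE) - 1

table(xq)
```

In a case-cohort sample the full data set over-represents cases, which is why the examples on this page derive cutpoints from the subcohort and pass them through `breaks` rather than relying on `q`.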

Details

For survival outcomes (as specified using methods from the survival package), this yields a conditional log hazard ratio representing a change in the expected conditional hazard (conditional on covariates) from increasing every exposure by 1 quantile. In general, this quantity is not equivalent to marginal g-computation estimates. Hypothesis test statistics and 95% confidence intervals are based on the delta method estimate of the variance of a linear combination of random variables.
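
The variance calculation described above can be sketched in base R. The sketch assumes the fitted model supplies a coefficient vector and a covariance matrix for the quantized exposures, and that Wald-type normal intervals are used. The numbers are made up for illustration, and `beta`, `V`, `a`, and the other names are chosen for this sketch rather than taken from the package.

```r
# hypothetical log hazard ratio coefficients for two quantized exposures
beta <- c(x1 = 0.20, x2 = -0.05)
# hypothetical covariance matrix of those coefficients
V <- matrix(c(0.010, 0.002,
              0.002, 0.020), nrow = 2,
            dimnames = list(names(beta), names(beta)))

a <- rep(1, length(beta))           # weights of the linear combination
psi <- sum(a * beta)                # joint effect: sum of the coefficients
var_psi <- drop(t(a) %*% V %*% a)   # variance of a linear combination
se_psi <- sqrt(var_psi)

alpha <- 0.05
ci <- psi + c(-1, 1) * qnorm(1 - alpha / 2) * se_psi   # Wald-type limits
z <- psi / se_psi                                      # test statistic
p <- 2 * pnorm(-abs(z))                                # two-sided p-value

# exponentiate to report on the hazard ratio scale
exp(c(HR = psi, lower = ci[1], upper = ci[2]))
```

Because the combination is linear in the coefficients, the quadratic form includes the covariance between exposure coefficients, which matters when exposures are correlated.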

Note that this closely follows the cch function in the survival package by Terry Therneau, and is restricted to the methods used in that function, which may not address all extant methods for case-cohort studies.

Value

a qgcompfit object, which contains information about the effect measure of interest (psi) and associated variance (var.psi), as well as information on the model fit (fit) and information on the weights/standardized coefficients in the positive (pos.weights) and negative (neg.weights) directions.
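
As a rough illustration of the weights mentioned above, the sketch below computes them from a made-up coefficient vector, assuming each weight is a coefficient's share of the summed coefficients in its own direction. The names `pos_weights` and `neg_weights` are hypothetical stand-ins for this example and are not the elements of the returned object.

```r
# hypothetical exposure coefficients (log hazard ratio per quantile)
beta <- c(x1 = 0.20, x2 = -0.05, x3 = 0.10)

pos <- beta[beta > 0]   # exposures with positive coefficients
neg <- beta[beta < 0]   # exposures with negative coefficients

# each weight is the share of the partial effect in that direction,
# so the weights within a direction sum to 1
pos_weights <- pos / sum(pos)
neg_weights <- neg / sum(neg)

psi <- sum(beta)        # overall joint effect across all exposures

pos_weights
neg_weights
```

Under this assumption the weights describe relative contributions within a direction only; they are not comparable across the positive and negative sets.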

Examples

set.seed(50)
N=500
# cohort analysis
dat <- data.frame(id = 1:N, time=(tmg <- pmin(.1,rweibull(N, 10, 0.1))), 
                d=1.0*(tmg<0.1), x1=runif(N), x2=runif(N), z=rbinom(N, 1, 0.5))
expnms=paste0("x", 1:2)
f1 = survival::Surv(time, d)~x1 + x2 + z
(fit1 <- survival::coxph(f1, data = dat))
(obj <- qgcomp.cox.noboot(f1, expnms = expnms, data = dat))
f1s = survival::Surv(time, d)~x1 + x2 + strata(z)
(fit1s <- survival::coxph(f1s, data = dat))
(objs <- qgcomp.cox.noboot(f1s, expnms = expnms, data = dat))
#### now doing a case-cohort analysis
# 1) sampling simple case-cohort data
dat$subcohort = 1:nrow(dat) %in% sort(sample(1:nrow(dat), 100))
caco_dat = dat[dat$subcohort | dat$d,]
dim(caco_dat)
dim(dat)

# getting quantile categories from the subcohort
qdata = quantize(caco_dat[caco_dat$subcohort,], expnms=expnms)
qdata$breaks
# 2) doing simple (unstratified) analysis
f2 = survival::Surv(time, d)~x1 + x2 + z
(obj2 <- qgcomp.cch.noboot(f2, expnms = expnms, breaks = qdata$breaks, 
         data = caco_dat, subcoh = ~ subcohort, id = ~id, cohort.size=N))
obj2$fit

### doing stratified analysis (if subcohort and/or cases are a stratified sample)
# 1) sampling stratified case-cohort data
sampfracs = c(.25, .75) # z=0 vs. z=1
nco = 100 # total subcohort members
nca = nrow(dat[dat$d==1,])
selected_ids = sort(c(sample(dat[dat$z==0, "id"], round(nco*sampfracs[1])), 
                 sample(dat[dat$z==1, "id"],  round(nco*sampfracs[2]))))
selected_cases = sort(c(sample(dat[dat$d==1 & dat$z==0, "id"], round(nca*sampfracs[1])), 
                 sample(dat[dat$d==1 & dat$z==1, "id"], round(nca*sampfracs[1]))))
dat$subcohort = dat$id %in% selected_ids
dat$selectedcases = dat$id %in% selected_cases
caco_dat_strat = dat[dat$subcohort | dat$selectedcases,]
dim(caco_dat_strat)
dim(dat)

# getting quantile categories from the subcohort by differential sampling across strata
subco_strat = caco_dat_strat[caco_dat_strat$subcohort,]
z_stratum_sizes = table(dat$z)
z_stratum_sizes_caco = table(subco_strat$z)
sampweights = z_stratum_sizes/z_stratum_sizes_caco
sampweightsn = sampweights/(min(sampweights))

# now oversample the undersampled into a dataset used to create cutpoints
wtdcutids = data.frame(id=sort(c(sample(subco_strat[subco_strat$z==0,"id"], 
                sampweightsn[1]*z_stratum_sizes_caco[1], replace=TRUE), 
                sample(subco_strat[subco_strat$z==1,"id"], 
                sampweightsn[2]*z_stratum_sizes_caco[2]))))
cutdata = merge(wtdcutids,subco_strat, all.x=TRUE)                 
qdata_strat = quantize(cutdata, expnms=expnms)
qdata_strat$breaks
f2s = survival::Surv(time, d)~x1 + x2
(obj2s <- qgcomp.cch.noboot(f2s, expnms = expnms, breaks = qdata_strat$breaks, 
         data = caco_dat_strat, subcoh = ~ subcohort, id = ~id, 
         stratum=~z,
         cohort.size=z_stratum_sizes, method="I.Borgan"))
obj2s$fit

qgcomp documentation built on April 12, 2025, 2:28 a.m.