analyse_mecc_cond: Perform weighted analysis of (M)ECC data using the...

View source: R/extreme_case_control.R

analyse_mecc_cond R Documentation

Perform weighted analysis of (M)ECC data using the conditional approach

Description

Perform weighted analysis of (M)ECC data using the conditional approach

Usage

analyse_mecc_cond(
  y_name,
  x_formula,
  set_id_name,
  surv,
  surv_tau,
  mecc,
  lower = -10,
  upper = 10
)

Arguments

y_name

A string giving the name of the variable in mecc that indicates cases and controls. Note that this is not the original indicator for event/censoring.

x_formula

A one-sided formula object specifying the model assumed for the covariates. It starts with ~ and does not include the outcome variable.

set_id_name

A string giving the name of the variable in the (M)ECC data that holds the ID of matched sets. See Details.

surv

Estimated baseline survival probability for each subject in mecc, based on the underlying cohort. The length of this variable must be the same as the number of rows in mecc. See Details.

surv_tau

Estimated baseline survival probability evaluated at time tau, based on the underlying cohort. The length of this variable must be the same as the number of rows in mecc. See Details.

mecc

(M)ECC data. A data.frame. Make sure the covariates in x_formula are all centred at the cohort average.

lower, upper

Parameters passed to the function optim, which specify the lower and upper bounds of the estimated coefficient for method "Brent". They are used only if there is a single coefficient to estimate.
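
The requirements stated in the argument descriptions above can be checked before calling the function. The following is a minimal sketch only: check_mecc_inputs is a hypothetical helper that is not part of SamplingDesignTools, and the toy data frame is made up for illustration. It does not verify that covariates are centred at the cohort average, since that requires the full cohort.

```r
# Hypothetical pre-flight checks (assumption: not provided by the package).
check_mecc_inputs <- function(y_name, x_formula, set_id_name,
                              surv, surv_tau, mecc) {
  stopifnot(
    is.data.frame(mecc),
    inherits(x_formula, "formula"),
    length(x_formula) == 2,  # one-sided formula such as ~ x1 + x2
    all(c(y_name, set_id_name) %in% names(mecc)),
    all(all.vars(x_formula) %in% names(mecc)),
    length(surv) == nrow(mecc),
    length(surv_tau) == nrow(mecc),
    all(surv >= 0 & surv <= 1),          # survival probabilities
    all(surv_tau >= 0 & surv_tau <= 1)
  )
  invisible(TRUE)
}

# Made-up data in the shape described by the arguments above.
toy <- data.frame(
  set_id_mecc = c(1, 1, 2, 2),
  y_mecc = c(1, 0, 1, 0),
  age_c = c(-2.1, 0.4, 1.3, -0.6),
  surv = c(0.97, 0.90, 0.95, 0.90),
  surv_tau = rep(0.90, 4)
)
ok <- check_mecc_inputs("y_mecc", ~ age_c, "set_id_mecc",
                        toy$surv, toy$surv_tau, toy)
```

A failed check stops with an error naming the violated condition, for example when surv is shorter than nrow(mecc).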

Details

This function uses the function optim to estimate the regression coefficients and the Hessian matrix. The default method of optim (i.e., "Nelder-Mead") is used if there is more than one coefficient to estimate. Otherwise the "Brent" method is used, with the search range given by lower and upper.
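
The switch between the two optim methods can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: neg_llh is a toy quadratic standing in for the (M)ECC log-likelihood, fit_toy is a hypothetical helper, and the sketch assumes the objective is supplied as a negative log-likelihood so that optim's default minimisation applies.

```r
# Toy objective with its minimum at `target`; it stands in for the negative
# (M)ECC log-likelihood, which is not reproduced here.
neg_llh <- function(beta, target) sum((beta - target)^2)

# Hypothetical helper illustrating the method switch described above.
fit_toy <- function(target, lower = -10, upper = 10) {
  n_coef <- length(target)
  if (n_coef == 1) {
    # One coefficient: "Brent" needs finite lower and upper bounds.
    optim(par = 0, fn = neg_llh, target = target, method = "Brent",
          lower = lower, upper = upper, hessian = TRUE)
  } else {
    # Several coefficients: default "Nelder-Mead"; the bounds are not used.
    optim(par = rep(0, n_coef), fn = neg_llh, target = target,
          hessian = TRUE)
  }
}

fit_one <- fit_toy(target = 0.5)           # uses "Brent"
fit_two <- fit_toy(target = c(0.5, -0.2))  # uses "Nelder-Mead"
fit_one$par
fit_two$par
```

With "Brent", an estimate that lands on lower or upper suggests the search range is too narrow and should be widened.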

Value

Returns a list consisting of two components:

coef_mecc

A data.frame containing the names of covariates in the regression model (var), estimated regression coefficients (est), estimated hazard ratios (exp_est = exp(est)), standard errors of the estimates (se), and the p-values (pval).

optim_obj

The optim output from maximising the log-likelihood in the weighted analysis.
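
The columns of the coefficient table can be related to an optim fit along the following lines. This is a sketch under assumptions: wald_table is a hypothetical helper, the fit is assumed to come from minimising a negative log-likelihood with hessian = TRUE, and standard errors are taken from the plain inverse Hessian with two-sided Wald p-values. This page does not state whether the package uses exactly this variance estimate. The variable bmi_c and all numbers are made up.

```r
# Hypothetical helper: build a coefficient table from an optim()-style fit.
wald_table <- function(fit, var_names) {
  est <- fit$par
  # The inverse Hessian of the negative log-likelihood approximates the
  # variance-covariance matrix of the estimates.
  vcov_mat <- solve(fit$hessian)
  se <- sqrt(diag(vcov_mat))
  # Two-sided Wald test of each coefficient against zero.
  pval <- 2 * pnorm(abs(est / se), lower.tail = FALSE)
  data.frame(var = var_names, est = est, exp_est = exp(est),
             se = se, pval = pval)
}

# Hand-built list shaped like optim() output, for illustration only.
toy_fit <- list(par = c(0.10, -0.30), hessian = diag(c(400, 100)))
toy_tab <- wald_table(toy_fit, var_names = c("age_c", "bmi_c"))
toy_tab
```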

Author(s)

Yilin Ning, Nathalie C Støer

References

  • Salim A, Ma X, Fall K, et al. Analysis of incidence and prognosis from 'extreme' case–control designs. Stat Med 2014; 33: 5388–5398.

  • Støer NC, Salim A, Bokenberger K, et al. Is the matched extreme case–control design more powerful than the nested case–control design? Stat Methods Med Res 2019; 28(6): 1911–1923.

See Also

draw_mecc, llh_mecc_cond

Examples

library(SamplingDesignTools)
# Load cohort data
data("cohort_1")
head(cohort_1)
# Draw a simple 1:2 more extreme case-control sample, matched on gender.
# Let cases be subjects who had the event within 5 years, and controls be
# selected from those who did not have the event until the 15-th year.
set.seed(1)
dat_mecc <- draw_mecc(cohort = cohort_1, tau0 = 5, tau = 15,
                      id_name = "id", t_name = "t", delta_name = "y",
                      match_var_names = "gender", n_per_case = 2)
head(dat_mecc)
# To estimate the HR of age from MECC sample using the weighted approach,
# it is necessary to center age at the cohort average:
dat_mecc$age_c <- dat_mecc$age - mean(cohort_1$age)
result_mecc <- analyse_mecc_cond(
  y_name = "y_mecc", x_formula = ~ age_c, set_id_name = "set_id_mecc",
  surv = dat_mecc$surv, surv_tau = dat_mecc$surv_tau, mecc = dat_mecc,
  lower = -1, upper = 1
)
round(result_mecc$coef_mat[, -1], 3)
# Compare with the estimate from the full cohort:
library(survival)
result_cohort <- summary(coxph(Surv(t, y) ~ age + gender, data = cohort_1))$coef
round(result_cohort["age", ], 3)
# The MECC sample may also be analysed using a logistic regression to
# estimate the OR of age, which tends to overestimate the HR:
result_logit <- summary(glm(y_mecc ~ age + gender, data = dat_mecc,
                            family = "binomial"))$coef
round(result_logit, 3)

nyilin/SamplingDesignTools documentation built on Nov. 20, 2022, 8:07 a.m.