phregr: Proportional Hazards Regression Models

View source: R/wrappers.R

phregr    R Documentation

Proportional Hazards Regression Models

Description

Obtains hazard ratio estimates from a proportional hazards regression model fitted to right-censored or counting process data.

Usage

phregr(
  data,
  rep = "",
  stratum = "",
  time = "time",
  time2 = "",
  event = "event",
  covariates = "",
  weight = "",
  offset = "",
  id = "",
  ties = "efron",
  robust = FALSE,
  est_basehaz = TRUE,
  est_resid = TRUE,
  firth = FALSE,
  plci = FALSE,
  alpha = 0.05
)

Arguments

data

The input data frame that contains the following variables:

  • rep: The replication for by-group processing.

  • stratum: The stratum.

  • time: The follow-up time for right-censored data, or the left end of each interval for counting process data.

  • time2: The right end of each interval for counting process data. Intervals are assumed to be open on the left and closed on the right, and event indicates whether an event occurred at the right end of each interval.

  • event: The event indicator, 1=event, 0=no event.

  • covariates: The values of baseline covariates (and time-dependent covariates in each interval for counting process data).

  • weight: The weight for each observation.

  • offset: The offset for each observation.

  • id: The optional subject ID for counting process data with time-dependent covariates.
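
To illustrate the counting process layout described above, here is a minimal sketch in base R. The data values and the column names start, stop, and rx are hypothetical and chosen only for illustration. Each row covers one interval (start, stop], and event marks whether an event occurred at the right end.

```r
# Hypothetical counting process data: one row per (start, stop] interval.
# Subject 1 switches rx from 0 to 1 at time 5 and has an event at time 12.
# Subject 2 keeps rx = 0 and is censored at time 8.
cp <- data.frame(
  id    = c(1, 1, 2),
  start = c(0, 5, 0),   # left end of each interval (open)
  stop  = c(5, 12, 8),  # right end of each interval (closed)
  event = c(0, 1, 0),   # 1 = event at the right end, 0 = no event
  rx    = c(0, 1, 0)    # time-dependent covariate value within the interval
)

# Every interval must have positive length
stopifnot(all(cp$start < cp$stop))

# Within a subject, consecutive intervals should not overlap
ok <- tapply(seq_len(nrow(cp)), cp$id, function(i) {
  all(cp$start[i][-1] >= cp$stop[i][-length(i)])
})
all(ok)
```

A data frame shaped like this would be passed with time = "start", time2 = "stop", event = "event", and id = "id".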

rep

The name(s) of the replication variable(s) in the input data.

stratum

The name(s) of the stratum variable(s) in the input data.

time

The name of the time variable in the input data: the follow-up time for right-censored data, or the left end of each interval for counting process data.

time2

The name of the variable in the input data that gives the right end of each interval for counting process data.

event

The name of the event variable in the input data.

covariates

The vector of names of baseline and time-dependent covariates in the input data.

weight

The name of the weight variable in the input data.

offset

The name of the offset variable in the input data.

id

The name of the id variable in the input data.

ties

The method for handling ties, either "breslow" or "efron" (default).
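
For orientation, the two tie-handling methods differ only in the denominator of the partial likelihood. The block below gives the standard textbook forms for the unweighted, unstratified case without offset; it is a sketch of the general technique, not a statement of the package's exact implementation. The notation is an assumption introduced here: t_j are the distinct event times, D_j is the set of the d_j subjects with an event at t_j, R_j is the risk set at t_j, and s_j is the sum of the covariate vectors over D_j.

```latex
% Breslow approximation: the full risk-set sum is reused for every tied event
L_{\mathrm{B}}(\beta) = \prod_{j}
  \frac{\exp(\beta^{\top} s_j)}
       {\left[ \sum_{i \in R_j} \exp(\beta^{\top} x_i) \right]^{d_j}}

% Efron approximation: the tied subjects are removed from the risk set
% in equal fractions across the d_j terms
L_{\mathrm{E}}(\beta) = \prod_{j}
  \frac{\exp(\beta^{\top} s_j)}
       {\prod_{k=0}^{d_j - 1} \left[ \sum_{i \in R_j} \exp(\beta^{\top} x_i)
          - \frac{k}{d_j} \sum_{i \in D_j} \exp(\beta^{\top} x_i) \right]}
```

When no event times are tied (every d_j = 1), the two expressions coincide.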

robust

Whether a robust sandwich variance estimate should be computed. Defaults to FALSE. If the id variable is specified, the score residuals are aggregated within each id when computing the robust sandwich variance estimate.
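
As a sketch of what aggregating score residuals by id means, the usual form of the grouped sandwich estimator is shown below. The symbols are assumptions introduced for illustration: U_i is the score residual vector of observation i evaluated at the estimate, g indexes distinct id values, and I is the observed information matrix, whose inverse is the naive covariance matrix.

```latex
% Score residuals summed within each subject (id)
U_g = \sum_{i \,:\, \mathrm{id}_i = g} U_i(\hat\beta)

% Robust sandwich variance: naive variance on the outside,
% outer products of aggregated score residuals in the middle
\widehat{V}_{\mathrm{robust}}
  = I(\hat\beta)^{-1}
    \left( \sum_{g} U_g U_g^{\top} \right)
    I(\hat\beta)^{-1}
```

Without an id variable, each observation forms its own group, so U_g reduces to U_i.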

est_basehaz

Whether to estimate the baseline hazards. Defaults to TRUE.

est_resid

Whether to estimate the martingale residuals. Defaults to TRUE.

firth

Whether to use Firth’s penalized likelihood method. Defaults to FALSE.
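
For reference, Firth's method in its usual form adds half the log-determinant of the information matrix to the log-likelihood. The block below is a sketch of that general penalty applied to the partial log-likelihood, with notation assumed here (l is the partial log-likelihood and I is its information matrix); it is not taken from the package source.

```latex
% Penalized partial log-likelihood maximized under Firth's method
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2} \log \det I(\beta)
```

This is consistent with the summary statistics documented under Value, which list penalized and unpenalized log-likelihoods separately.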

plci

Whether to obtain the profile likelihood confidence interval. Defaults to FALSE.

alpha

The two-sided significance level. Defaults to 0.05.
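
The following base R sketch shows how a two-sided significance level typically translates into a Wald confidence interval and p-value for a hazard ratio. The numeric inputs are hypothetical, and it is an assumption that this matches the Wald computation used when plci = FALSE; the scale on which the package reports the interval limits is not stated in this page.

```r
# Hypothetical inputs: a log hazard ratio estimate and its standard error
beta   <- -0.35
sebeta <- 0.12
alpha  <- 0.05

# Two-sided Wald interval on the log scale, then exponentiated
zcrit <- qnorm(1 - alpha / 2)
lower <- exp(beta - zcrit * sebeta)
upper <- exp(beta + zcrit * sebeta)

# Wald z statistic; its square is referred to a chi-square with 1 df
z <- beta / sebeta
p <- pchisq(z^2, df = 1, lower.tail = FALSE)

round(c(hr = exp(beta), lower = lower, upper = upper, p = p), 4)
```

The chi-square p-value on one degree of freedom is identical to the two-sided normal p-value 2 * pnorm(-abs(z)).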

Value

A list with the following components:

  • sumstat: The data frame of summary statistics of model fit with the following variables:

    • n: The number of observations.

    • nevents: The number of events.

    • loglik0: The (penalized) log-likelihood under the null.

    • loglik1: The maximum (penalized) log-likelihood.

    • scoretest: The score test statistic.

    • niter: The number of Newton-Raphson iterations.

    • ties: The method for handling ties, either "breslow" or "efron".

    • p: The number of columns of the Cox model design matrix.

    • robust: Whether to use the robust variance estimate.

    • firth: Whether to use Firth's penalized likelihood method.

    • loglik0_unpenalized: The unpenalized log-likelihood under the null.

    • loglik1_unpenalized: The maximum unpenalized log-likelihood.

    • rep: The replication.

  • parest: The data frame of parameter estimates with the following variables:

    • param: The name of the covariate for the parameter estimate.

    • beta: The log hazard ratio estimate.

    • sebeta: The standard error of the log hazard ratio estimate.

    • z: The Wald test statistic for the log hazard ratio.

    • expbeta: The hazard ratio estimate.

    • vbeta: The covariance matrix for parameter estimates.

    • lower: The lower limit of the confidence interval.

    • upper: The upper limit of the confidence interval.

    • p: The p-value from the chi-square test.

    • method: The method to compute the confidence interval and p-value.

    • sebeta_naive: The naive standard error of the log hazard ratio estimate if robust variance is requested.

    • vbeta_naive: The naive covariance matrix for parameter estimates if robust variance is requested.

    • rep: The replication.

  • basehaz: The data frame of baseline hazards with the following variables (if est_basehaz is TRUE):

    • time: The observed event time.

    • nrisk: The number of patients at risk at the time point.

    • nevent: The number of events at the time point.

    • haz: The baseline hazard at the time point.

    • varhaz: The variance of the baseline hazard at the time point assuming the parameter beta is known.

    • gradhaz: The gradient of the baseline hazard with respect to beta at the time point (in the presence of covariates).

    • stratum: The stratum.

    • rep: The replication.

  • residuals: The martingale residuals.

  • p: The number of parameters.

  • param: The parameter names.

  • beta: The parameter estimate.

  • vbeta: The covariance matrix for parameter estimates.

  • vbeta_naive: The naive covariance matrix for parameter estimates.

  • terms: The terms object.

  • xlevels: A record of the levels of the factors used in fitting.

  • data: The input data.

  • rep: The name(s) of the replication variable(s).

  • stratum: The name(s) of the stratum variable(s).

  • time: The name of the time variable.

  • time2: The name of the time2 variable.

  • event: The name of the event variable.

  • covariates: The names of baseline covariates.

  • weight: The name of the weight variable.

  • offset: The name of the offset variable.

  • id: The name of the id variable.

  • ties: The method for handling ties.

  • robust: Whether a robust sandwich variance estimate should be computed.

  • est_basehaz: Whether to estimate the baseline hazards.

  • est_resid: Whether to estimate the martingale residuals.

  • firth: Whether to use Firth's penalized likelihood method.

  • plci: Whether to obtain the profile likelihood confidence interval.

  • alpha: The two-sided significance level.
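
As an illustration of how the summary statistics can be combined, the base R sketch below forms a likelihood ratio test from loglik0, loglik1, and p. The numbers are hypothetical stand-ins for one row of sumstat, and the sketch assumes firth = FALSE, so that both values are ordinary partial log-likelihoods.

```r
# Hypothetical values standing in for one row of sumstat
loglik0 <- -298.12   # log-likelihood under the null
loglik1 <- -293.47   # maximized log-likelihood
p       <- 2         # number of columns of the design matrix

# Likelihood ratio statistic, referred to a chi-square with p df
lr_stat <- 2 * (loglik1 - loglik0)
lr_pval <- pchisq(lr_stat, df = p, lower.tail = FALSE)

c(statistic = lr_stat, df = p, p.value = lr_pval)
```

The score test statistic reported in scoretest could be referred to the same chi-square distribution in the same way.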

Author(s)

Kaifeng Lu, kaifenglu@gmail.com

References

Per K. Andersen and Richard D. Gill. Cox's regression model for counting processes, a large sample study. Annals of Statistics 1982; 10:1100-1120.

Terry M. Therneau and Patricia M. Grambsch. Modeling Survival Data: Extending the Cox Model. Springer-Verlag, 2000.

Examples


library(lrstat)
library(dplyr)

# Example 1 with right-censored data
(fit1 <- phregr(
  data = rawdata %>% mutate(treat = 1*(treatmentGroup == 1)),
  rep = "iterationNumber", stratum = "stratum",
  time = "timeUnderObservation", event = "event",
  covariates = "treat", est_basehaz = FALSE, est_resid = FALSE))

# Example 2 with counting process data and robust variance estimate
(fit2 <- phregr(
  data = heart %>% mutate(rx = as.numeric(transplant) - 1),
  time = "start", time2 = "stop", event = "event",
  covariates = c("rx", "age"), id = "id",
  robust = TRUE, est_basehaz = TRUE, est_resid = TRUE))


lrstat documentation built on Oct. 18, 2024, 9:06 a.m.