knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

The sm_regression() function performs kernel weighted regression. After the the model is specified (data = cancertx, method = "coxph", formula = Surv(time) ~ age), an ancillary variable is specified (weighting_var = "marker") and the model is estimated weighted on this variable.

The following examples focus on time-to-event endpoints. The example data set is simulated from a Cox regression model.

$$ h(t) = h_0(t)e^{\textbf{X}\beta} $$

with $h_0(t) = 1$ and $\textbf{X}\beta = 0.02 * age - 0.2 * marker$. The independent variables age and marker are independent from one another. The example data set containts the following columns:

time    Years from cancer treatment to death
age     Age at treatment
marker  Marker level at treatment
survt1  True 1 year survival probability
library(sjosmooth)
library(dplyr)
# loading data from sjosmooth github page
load(url("https://github.com/ddsjoberg/sjosmooth/blob/master/examples/cancertx.rda?raw=true"))

cancertx %>% select(time, age, marker, survt1) %>% head() %>% knitr::kable()

The simualted data contains r nrow(cancertx) %>% format(big.mark = ",", scientific=FALSE) observations.

Example

Kernel smoothing can be computationally intense on large data sets. The following code was run to get the example object sm_regression_ex1 and the results saved to GitHub.

library(survival)
library(sjosmooth)
sm_regression_ex1 =
  sm_regression(
    data = cancertx,
    method = "coxph",
    formula = Surv(time) ~ age,
    weighting_var = "marker",
    newdata = dplyr::data_frame(marker = seq(5, 15, 0.5))
  )

The data in the cancertx data frame was simulated from a Cox regression model with linear predictor.

$$ \textbf{X}\beta = 0.02 * age - 0.2 * marker $$

In the model age and marker level are independent, and therefore, the univariate model the slope coefficient for age remains $\beta_{age} = 0.02$

The beta coefficient is the same across all marker levels. In the figure below, the solid black line is the estimates beta coefficient (with 95% CI), and the dashed line is the true value.

library(ggplot2)
library(survival)
# loading example results from sjosmooth github page
load(url("https://github.com/ddsjoberg/sjosmooth/blob/master/examples/sm_regression_ex1.rda?raw=true"))

# plotting regression model coefficient by marker level
plot_data =
  sm_regression_ex1 %>%
  mutate(
    age.coef = purrr::map_dbl(model_obj, ~coef(.x)),
    age.ci = purrr::map(model_obj, ~confint(.x) %>% as_data_frame())
  ) %>%
  select(newdata, age.coef, age.ci) %>%
  tidyr::unnest(age.ci, newdata)

ggplot(plot_data, aes(x = marker, y = age.coef)) +
  geom_line() +
  geom_ribbon(aes(ymin = `2.5 %`, ymax = `97.5 %`), alpha = 0.3) +
  geom_hline(aes(yintercept = 0.02), linetype = "dashed") +
  scale_y_continuous(limits = c(0, 0.04)) +
  labs(
    y = "slope for age regressed on time to death"
  )


ddsjoberg/sjosmooth documentation built on May 14, 2019, 5:16 p.m.