library("learnr")
library("survival")
load("Data.RData")
knitr::opts_chunk$set(echo = FALSE)
options(tutorial.exercise.timelimit = 1200)
The following questions test your knowledge of Chapters 1-3.
In a clinical study, interest lies in the survival of HIV-infected patients after seroconversion. The Kaplan-Meier estimate at year 1 equals 0.88.
quiz(
  question("Which of the following statements are correct (more than one may be correct)?",
    answer("Given that a patient is alive at year 1, the instantaneous risk of death just after year 1 is 0.88."),
    answer("For the target population and on average we expect 88% of the patients to live more than 1 year.", correct = TRUE),
    answer("The cumulative risk of death at year 1 equals 0.88."),
    answer("The estimated survival function at year 1 equals 0.88, meaning that we expect 12% of the patients to die within 1 year.", correct = TRUE),
    answer("The estimated survival function at year 1 equals 0.12, meaning that we expect 12% of the patients to live more than 1 year."),
    answer("The cumulative distribution function at year 1 equals 0.12.", correct = TRUE),
    answer("The cumulative distribution function at year 1 equals 0.88."),
    allow_retry = TRUE,
    random_answer_order = TRUE
  )
)
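The quantities named in the answers above are linked by two identities: the cumulative distribution function is F(t) = 1 - S(t), and the cumulative hazard is H(t) = -log S(t). A minimal sketch of the arithmetic in R, where the variable names are illustrative and not part of the tutorial:

```r
# Kaplan-Meier estimate of the survival function at year 1 (from the text)
S1 <- 0.88

# Cumulative distribution function: probability of dying within 1 year
F1 <- 1 - S1

# Cumulative hazard at year 1, using H(t) = -log S(t)
H1 <- -log(S1)

c(survival = S1, cdf = F1, cum_hazard = H1)
```

None of these equals the instantaneous hazard just after year 1, which cannot be read off a single value of the survival curve.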
A study has been designed to investigate whether a new therapy improves the survival rates of advanced cancer patients. You have at hand the survival times of the two groups of patients, namely, the patients taking the new treatment and the patients with the standard treatment.
quiz(
  question("Which of the following types of analysis would you follow to investigate if the new treatment works?",
    answer("Perform a two-sample t-test for the two groups of patients to test if the mean survival time in the new treatment group is greater than the mean survival time in the standard treatment group."),
    answer("First check if the data are normally distributed; if yes, perform a two-sample t-test, otherwise perform a two-sample Wilcoxon test to test differences in the medians between the two treatment groups."),
    answer("Perform a log-rank test to compare the survival distributions of the two groups.", correct = TRUE),
    answer("Perform a paired t-test for the two groups of patients."),
    allow_retry = TRUE,
    random_answer_order = TRUE
  )
)
A study has been designed to investigate whether a new therapy improves the survival rates of advanced cancer patients. You have at hand the survival times of the two groups of patients, namely, the patients taking the new treatment and the patients with the standard treatment.
quiz(
  question("Which of the following types of analysis would you follow to investigate if the new treatment works?",
    answer("Perform a log-rank test to compare the survival distributions of the two groups."),
    answer("Perform a Peto and Peto Gehan-Wilcoxon test to compare the survival distributions of the two groups."),
    answer("Check graphically if the proportional hazards assumption is satisfied. If it seems to be satisfied, then use the log rank test.", correct = TRUE),
    answer("Check graphically if the proportional hazards assumption is satisfied. If it seems to be satisfied, then use the Peto and Peto Gehan-Wilcoxon test."),
    allow_retry = TRUE,
    random_answer_order = TRUE
  )
)
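One common way to carry out the graphical check mentioned in the last two answers is a complementary log-log plot. Under proportional hazards, the curves log(-log S(t)) of the two groups differ by a constant, so they should look roughly parallel. The sketch below is only an illustration: it uses the aml dataset that ships with the survival package (an assumption made to keep the example self-contained), not the data of the study described above.

```r
library(survival)

# 'aml' ships with the survival package; x is the treatment group
fit <- survfit(Surv(time, status) ~ x, data = aml)

# Complementary log-log plot: log(-log(S(t))) against time on a log scale.
# Roughly parallel curves are consistent with proportional hazards.
plot(fit, fun = "cloglog", lty = 1:2, col = 1:2,
     xlab = "Time (log scale)", ylab = "log(-log(S(t)))")
legend("topleft", legend = names(fit$strata), lty = 1:2, col = 1:2, bty = "n")
```

With small samples the curves are noisy, so the plot is a rough guide, not a formal test.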
The purpose of this practical is to illustrate how standard statistical analysis of survival data can be performed in R.
The following questions are based on the AIDS dataset. This dataset is available as the object aids.id and is already loaded in this session. From this dataset we will use the following variables:
Time: the observed time-to-death in months.
death: the event indicator; '1' denotes death and '0' a censored observation.
drug: the treatment indicator with values 'ddC' and 'ddI'.
gender: the sex indicator with values 'male' and 'female'.
For the exercises below it will be useful to check the corresponding sections of the Survival Analysis in R Companion that are mentioned in the hints.
Calculate and plot the Kaplan-Meier estimator of the survival function based on all the data. What is the median survival time and its 95% confidence interval?
# Check the example in slides 73-74, and Section 2.1, Survival Analysis in R Companion
# Calculate the Kaplan-Meier estimator and check the output
fitKM <- survfit(Surv(Time, death) ~ 1, data = aids.id)
fitKM
# Plot the Kaplan-Meier estimator
plot(fitKM)
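To see what survfit() computes, the product-limit formula can be reproduced by hand on a small example. The sketch below uses a hypothetical toy dataset (not the AIDS data) and then extracts the median survival time with its 95% confidence limits via quantile().

```r
library(survival)

# Hypothetical toy data: follow-up times in months and event indicator (1 = death)
toy <- data.frame(
  Time  = c(2, 3, 3, 5, 6, 8, 9, 12),
  death = c(1, 1, 0, 1, 0, 1, 1, 0)
)

fit <- survfit(Surv(Time, death) ~ 1, data = toy)

# Product-limit estimate by hand: at each distinct time multiply by (1 - d / n),
# where d is the number of deaths and n the number at risk
km_by_hand <- cumprod(1 - fit$n.event / fit$n.risk)
all.equal(km_by_hand, fit$surv)

# Median survival time with its 95% confidence limits
quantile(fit, probs = 0.5)
```

The median is the first time at which the estimated survival curve reaches 0.5 or below; a confidence limit is reported as NA when the corresponding confidence band never crosses 0.5.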
Calculate and plot the Breslow estimator of the survival functions for ddC and ddI, separately. Calculate also the estimates of the 50%, 60% and 70% percentiles of the survival distribution with their 95% confidence intervals. Name the Breslow estimator object fitB.
# Check the example in slides 86 & 80, and Section 2.1, Survival Analysis in R Companion
# Calculate the Breslow estimator and check the output
fitB <- survfit(Surv(Time, death) ~ drug, data = aids.id, type = "fleming-harrington")
fitB
# Plot the Breslow estimator
plot(fitB, lty = 1:2, col = 1:2)
# Use the quantile() function
quantile(fitB, 1 - c(0.5, 0.6, 0.7))
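The Breslow estimator is the exponential of minus the Nelson-Aalen cumulative hazard, S(t) = exp(-H(t)). The sketch below checks this by hand on a hypothetical toy dataset (not the AIDS data). It assumes a recent version of the survival package (3.0 or later), in which the arguments stype and ctype are documented as the replacement for the older type argument used above.

```r
library(survival)

# Hypothetical toy data (not the AIDS data)
toy <- data.frame(
  Time  = c(2, 3, 3, 5, 6, 8, 9, 12),
  death = c(1, 1, 0, 1, 0, 1, 1, 0)
)

# stype = 2 requests S(t) = exp(-H(t)); ctype = 1 uses the Nelson-Aalen H(t)
fitB_toy <- survfit(Surv(Time, death) ~ 1, data = toy, stype = 2, ctype = 1)

# The same estimate by hand: cumulative sum of d / n, then exponentiate
nelson_aalen <- cumsum(fitB_toy$n.event / fitB_toy$n.risk)
all.equal(exp(-nelson_aalen), fitB_toy$surv)

# quantile() works on the distribution function F(t) = 1 - S(t), so
# probs = 1 - 0.6 gives the first time the survival curve is at or below 0.6
quantile(fitB_toy, probs = 1 - c(0.5, 0.6, 0.7))
```

This convention is why the solution passes 1 - c(0.5, 0.6, 0.7) to quantile() instead of the survival probabilities themselves.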
Using the Breslow estimator fitB from the previous question, calculate the 8- and 10-month survival probabilities with their corresponding 95% confidence intervals.
fitB <- survfit(Surv(Time, death) ~ drug, data = aids.id, type = "fleming-harrington")
# You will need to use the function summary() and its argument 'times'
# Check Section 2.1, Survival Analysis in R Companion
# The code is:
summary(fitB, times = c(8, 10))
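The object returned by summary() stores the estimates as components, which is convenient when the numbers are needed for further work instead of just printed. A minimal sketch, using the aml dataset that ships with the survival package (an assumption made to keep the example self-contained; its follow-up times are in weeks):

```r
library(survival)

# 'aml' ships with the survival package; x is the treatment group
fit <- survfit(Surv(time, status) ~ x, data = aml)

# Survival estimates at 8 and 10 weeks, per group
s <- summary(fit, times = c(8, 10))

# Collect the relevant components in one table
data.frame(
  group = s$strata,
  time  = s$time,
  surv  = s$surv,
  lower = s$lower,
  upper = s$upper
)
```

Requested times beyond the last observed follow-up time of a group are dropped from the output unless extend = TRUE is supplied.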
Use the log-rank and the Peto & Peto modified Gehan-Wilcoxon tests to assess whether the survival curves of the two treatment groups differ statistically significantly. Before doing the analysis, which of the two tests do you expect to yield the smaller p-value, and why?
# Check the example in slides 101 & 109, and Section 2.2, Survival Analysis in R Companion
# log-rank test
survdiff(Surv(Time, death) ~ drug, data = aids.id)
# Peto & Peto modified Gehan-Wilcoxon test
survdiff(Surv(Time, death) ~ drug, data = aids.id, rho = 1)
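Both tests belong to the same family: survdiff() weights the difference between observed and expected deaths at each event time by S(t)^rho. With rho = 0 all event times get equal weight (log-rank); with rho = 1 earlier event times get more weight (Peto & Peto modification of the Gehan-Wilcoxon test). The sketch below compares the two p-values on the aml dataset that ships with the survival package (an assumption made to keep the example self-contained); the helper p_value() is hypothetical and not part of the package.

```r
library(survival)

# 'aml' ships with the survival package; x is the treatment group
lr <- survdiff(Surv(time, status) ~ x, data = aml)           # rho = 0: log-rank
pp <- survdiff(Surv(time, status) ~ x, data = aml, rho = 1)  # rho = 1: Peto & Peto

# Hypothetical helper: p-value from the chi-squared statistic,
# with degrees of freedom equal to the number of groups minus 1
p_value <- function(test) {
  pchisq(test$chisq, df = length(test$n) - 1, lower.tail = FALSE)
}

c(log_rank = p_value(lr), peto_peto = p_value(pp))
```

Which test gives the smaller p-value depends on where the curves differ most: early differences favour rho = 1, while differences that persist or grow over time favour the log-rank test.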
Do the same for gender, i.e., calculate the Kaplan-Meier estimator of the survival functions for males and females, and compare the results from the log-rank and Peto & Peto modified Gehan-Wilcoxon tests. Which test should you trust more in this case, and why?
# first calculate the Kaplan-Meier estimator and do the graph
# The code is:
fitKM_gender <- survfit(Surv(Time, death) ~ gender, data = aids.id)
fitKM_gender
plot(fitKM_gender)
# Use survdiff() as in Question 4
# log-rank test
survdiff(Surv(Time, death) ~ gender, data = aids.id)
# Peto & Peto modified Gehan-Wilcoxon test
survdiff(Surv(Time, death) ~ gender, data = aids.id, rho = 1)