Biostatistics III in R

Exercise 25. Localised melanoma: generating and analysing a nested case-control (NCC) study


library('knitr')
read_chunk('../q25.R')
opts_chunk$set(cache=FALSE)

You may have to install the required packages the first time you use them. To install a package, run install.packages("package_of_interest") once for each package you need.


Load the melanoma data. Restrict the data to localised melanoma and to the first 10 years of follow-up. Use time-on-study as the timescale. Define the event as death from cancer.
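A minimal sketch of this preparation step follows. It assumes the melanoma data frame shipped with biostat3, with columns stage, status and surv_mm (survival time in months). The names mel, dc and surv_10y are chosen here for illustration, and treating a death at exactly 120 months as an event is also an assumption.

```r
library(biostat3)  # supplies the melanoma data frame

# Keep localised melanoma only
mel <- subset(biostat3::melanoma, stage == "Localised")

# dc:       1 = death from cancer within 120 months (10 years), 0 = censored
# surv_10y: time on study in months, truncated at 120
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

# Cross-check the new event indicator against the original status variable
with(mel, table(dc, status))
```

Because follow-up starts at 0 for everyone, surv_10y is already on the time-on-study scale.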


The Cox proportional hazards analysis is done with the "coxph" function in the "survival" package. It is often a good idea to check the structure of the variables first.
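The full-cohort analysis could look like the sketch below. The text does not name the covariates; sex, year8594 and agegrp are assumed here because they are available in the biostat3 melanoma data. The data preparation is repeated so the block runs on its own.

```r
library(biostat3)
library(survival)

# Data preparation (names mel, dc, surv_10y are illustrative)
mel <- subset(biostat3::melanoma, stage == "Localised")
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

# Inspect variable types and factor levels before modelling
str(mel[, c("sex", "year8594", "agegrp", "surv_10y", "dc")])

# Cox model on the time-on-study scale; the covariate choice is an assumption
out_coh <- coxph(Surv(surv_10y, dc) ~ sex + year8594 + agegrp, data = mel)
summary(out_coh)
```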


(a) ##

How many individuals are in the study?


(b) ##

How many experience the event?
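Both counts, the cohort size from (a) and the number of events, can be read directly off the prepared data. This sketch assumes one row per individual and reuses the illustrative names mel and dc.

```r
library(biostat3)

mel <- subset(biostat3::melanoma, stage == "Localised")
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

n_ind   <- nrow(mel)     # individuals in the cohort (one row each, assumed)
n_event <- sum(mel$dc)   # cancer deaths within the first 10 years
c(individuals = n_ind, events = n_event)
```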


(c1) ##

Generate a nested case-control study with 1 control per case. Cases and controls are matched on time since entry. This is done with the function "ccwc" in the Epi package. Note that in the code provided here, the variables "dc" and "id" are not necessary. They do, however, help to show how the data are handled, for example: how many individuals are sampled several times, or how many cases happened to be sampled as controls before their failure time.
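The sampling step might be sketched as follows with Epi's ccwc function. The seed value and the covariates passed through include are assumptions for illustration. ccwc returns the columns Set, Map, Time and Fail, followed by the included variables.

```r
library(biostat3)
library(Epi)

mel <- subset(biostat3::melanoma, stage == "Localised")
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

set.seed(12345)  # arbitrary seed so the sampled controls are reproducible

# One control per case, drawn from those still at risk at the case's failure time
nccdata <- ccwc(entry = 0, exit = surv_10y, fail = dc, origin = 0,
                controls = 1,
                include = list(sex, year8594, agegrp, dc, id),
                data = mel, silent = TRUE)
head(nccdata)

# How many times does each individual appear in the sampled data?
table(table(nccdata$id))

# Controls (Fail == 0) who die from cancer later in follow-up (dc == 1)
sum(nccdata$Fail == 0 & nccdata$dc == 1)
```

Carrying dc and id through include is what makes the last two tabulations possible.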


(c2) ##

Analyse the nested case-control data (with the survival package or the Epi package) using the function "clogit". As the NCC data were generated with the Epi package, we use the syntax of this package.
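A conditional logistic regression on the sampled data could be fitted as sketched below, using clogit from the survival package with the Fail and Set columns that ccwc creates. The covariates and the seed are the same assumptions as before.

```r
library(biostat3)
library(Epi)
library(survival)

mel <- subset(biostat3::melanoma, stage == "Localised")
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

set.seed(12345)  # arbitrary seed
nccdata <- ccwc(entry = 0, exit = surv_10y, fail = dc, origin = 0,
                controls = 1,
                include = list(sex, year8594, agegrp, dc, id),
                data = mel, silent = TRUE)

# Each matched set is a stratum; Fail marks the case within the set
out_ncc <- clogit(Fail ~ sex + year8594 + agegrp + strata(Set), data = nccdata)
summary(out_ncc)
```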


(d) ##

How many unique individuals are in our study?
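Since an individual can be sampled more than once, the number of rows overstates the number of people. A short sketch of the count, assuming id was carried through ccwc via include:

```r
library(biostat3)
library(Epi)

mel <- subset(biostat3::melanoma, stage == "Localised")
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

set.seed(12345)  # arbitrary seed
nccdata <- ccwc(entry = 0, exit = surv_10y, fail = dc, origin = 0,
                controls = 1, include = list(dc, id),
                data = mel, silent = TRUE)

n_rows   <- nrow(nccdata)              # records in the NCC data
n_unique <- length(unique(nccdata$id)) # distinct individuals among them
c(records = n_rows, unique_individuals = n_unique)
```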


(e) ##

Compare the estimated parameters and standard errors between the full cohort analysis and the nested case-control study. What is the relative efficiency (ratio of variances) of the nested case-control compared to the full cohort design?
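One way to set the two analyses side by side is sketched below. Relative efficiency is taken here as the cohort variance divided by the NCC variance for each coefficient, so values below 1 mean the NCC estimate is less precise. Covariates, seed and object names are illustrative assumptions.

```r
library(biostat3)
library(Epi)
library(survival)

mel <- subset(biostat3::melanoma, stage == "Localised")
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

# Full cohort
out_coh <- coxph(Surv(surv_10y, dc) ~ sex + year8594 + agegrp, data = mel)

# Nested case-control sample and its analysis
set.seed(12345)  # arbitrary seed
nccdata <- ccwc(entry = 0, exit = surv_10y, fail = dc, origin = 0,
                controls = 1, include = list(sex, year8594, agegrp),
                data = mel, silent = TRUE)
out_ncc <- clogit(Fail ~ sex + year8594 + agegrp + strata(Set), data = nccdata)

# Coefficients (log HR), standard errors and variance ratio per parameter
comp <- data.frame(coef_coh = coef(out_coh),
                   se_coh   = sqrt(diag(vcov(out_coh))),
                   coef_ncc = coef(out_ncc),
                   se_ncc   = sqrt(diag(vcov(out_ncc))))
comp$rel_eff <- comp$se_coh^2 / comp$se_ncc^2
round(comp, 3)
```

A commonly quoted rule of thumb for m controls per case is a relative efficiency of about m/(m+1), which is 0.5 for one control. The observed ratios will vary around that value from sample to sample.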


(f) ##

Generate several nested case-control studies and analyse them. A loop draws M NCC samples with 1 control per case. The code stores the estimated HR from each iteration in a data frame whose first row contains the cohort's HR. The code also provides a summary table with the cohort's HR and the mean and SD of the HRs from the M iterations. The histogram for each variable includes a green vertical line at the cohort's HR and a red line at the mean of the HRs across iterations.
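The repeated-sampling loop could be sketched as follows. M = 20 is an illustrative value kept small for speed, and the seed, covariates and object names are assumptions rather than the exercise's own code.

```r
library(biostat3)
library(Epi)
library(survival)

mel <- subset(biostat3::melanoma, stage == "Localised")
mel <- transform(mel,
                 dc = as.integer(status == "Dead: cancer" & surv_mm <= 120),
                 surv_10y = pmin(surv_mm, 120))

out_coh <- coxph(Surv(surv_10y, dc) ~ sex + year8594 + agegrp, data = mel)
hr_coh  <- exp(coef(out_coh))  # cohort hazard ratios

M <- 20          # number of NCC samples (illustrative)
set.seed(2024)   # arbitrary seed
hr_ncc <- matrix(NA_real_, nrow = M, ncol = length(hr_coh),
                 dimnames = list(paste0("ncc_", seq_len(M)), names(hr_coh)))

for (i in seq_len(M)) {
  ncc_i <- ccwc(entry = 0, exit = surv_10y, fail = dc, origin = 0,
                controls = 1, include = list(sex, year8594, agegrp),
                data = mel, silent = TRUE)
  fit_i <- clogit(Fail ~ sex + year8594 + agegrp + strata(Set), data = ncc_i)
  hr_ncc[i, ] <- exp(coef(fit_i))
}

# First row: cohort HR; remaining rows: one NCC sample each
all_hr <- as.data.frame(rbind(cohort = hr_coh, hr_ncc))

# Cohort HR next to the mean and SD of the M NCC estimates
summ <- data.frame(cohort_hr = hr_coh,
                   ncc_mean  = colMeans(hr_ncc),
                   ncc_sd    = apply(hr_ncc, 2, sd))
round(summ, 3)

# One histogram per parameter: green = cohort HR, red = mean of NCC HRs
op <- par(mfrow = c(2, 3))
for (nm in colnames(hr_ncc)) {
  hist(hr_ncc[, nm], main = nm, xlab = "HR")
  abline(v = hr_coh[nm], col = "green")
  abline(v = mean(hr_ncc[, nm]), col = "red")
}
par(op)
```

A larger M gives smoother histograms and a more stable estimate of the sampling variability, at the cost of run time.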





biostat3 documentation built on Oct. 29, 2024, 5:07 p.m.