ncc_sample generates a nested case-control study dataset from a cohort
study dataset. Given time of entry, time of exit, and exit status, risk sets
are computed at each failure time. Controls are randomly sampled from these
risk sets. If matching variables are specified, ncc_sample creates a matched
or stratified nested case-control study, in which risk sets are computed
separately within matching strata. ncc_sample is similar to ccwc from the
'Epi' package, but differs in several small but important ways (see Details).
entry |
time of entry to follow-up |
exit |
time of exit from follow-up |
fail |
indicator of status on exit from follow-up (censored=0, fail=1) |
origin |
the origin of the analysis time-scale. For instance, date of birth, if age is the desired time-scale. |
controls |
the number of controls to sample for each failure |
match |
a list of categorical variables for matching cases and controls |
include |
a list of variables from the cohort dataset to be carried across to the nested case-control dataset |
data |
a |
keep_all |
if |
silent |
if |
Given follow-up information from a cohort study, ncc_sample generates risk
sets at each observed failure time, and randomly samples controls from these
risk sets without replacement. Functionality is much the same as ccwc from
the 'Epi' package, with two minor differences. Firstly, ncc_sample also
computes and returns the total number of eligible controls for each risk
set, as well as the probability of selection into the sample for every
selected case and control. The latter is calculated according to the formula
given by Samuelsen (1997). Secondly, ncc_sample splits tied failure times at
random, whereas ccwc preserves the ties and returns a multi-case
case-control set.
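The risk-set sampling described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helper sample_risk_sets, the toy cohort, and the output column names are all hypothetical, and the sketch handles neither matching strata nor tied failure times, nor does it compute selection probabilities.

```r
# Toy cohort, one row per person (all values invented for illustration)
cohort <- data.frame(
  id    = 1:8,
  entry = c(0, 0, 0, 1, 2, 0, 3, 0),
  exit  = c(4, 9, 6, 8, 7, 10, 10, 5),
  fail  = c(1, 0, 1, 0, 0, 0, 0, 0)
)

# Hypothetical helper: build one sampled risk set per observed failure
sample_risk_sets <- function(d, controls = 2) {
  cases <- d[d$fail == 1, ]
  cases <- cases[order(cases$exit), ]
  sets <- lapply(seq_len(nrow(cases)), function(k) {
    t_fail <- cases$exit[k]
    # At risk at t_fail: entered before it, still under follow-up, not the case
    eligible <- d$id[d$entry < t_fail & d$exit >= t_fail & d$id != cases$id[k]]
    n_take <- min(controls, length(eligible))
    # Index into `eligible` so a single eligible id is not expanded to 1:id by sample()
    picked <- eligible[sample.int(length(eligible), n_take)]
    data.frame(
      set     = k,
      id      = c(cases$id[k], picked),
      fail    = c(1, rep(0, n_take)),
      elig_co = length(eligible),  # number of eligible controls in this risk set
      time    = t_fail
    )
  })
  do.call(rbind, sets)
}

set.seed(1)
ncc <- sample_risk_sets(cohort, controls = 2)
```

Note that person 3, who fails at time 6, is still eligible as a control for the failure at time 4; future cases remain in earlier risk sets.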
Random sampling of controls within risk sets is performed using R's
pseudo-random number facilities. It is therefore important to set the seed
(set.seed) to ensure reproducibility.
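As a small illustration of why the seed matters, the base-R sketch below draws two controls from a hypothetical risk set twice, resetting the seed before each draw. The ids and the seed value are arbitrary assumptions; the same principle would apply to a call to ncc_sample.

```r
# Hypothetical ids of the people at risk at one failure time
risk_set <- c(101, 102, 103, 104, 105)

set.seed(2024)
first_draw <- sample(risk_set, 2)

# Resetting the seed reproduces exactly the same draw
set.seed(2024)
second_draw <- sample(risk_set, 2)

identical(first_draw, second_draw)  # TRUE
```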
A data.frame comprising:
ncc_set |
a case-control set identifier |
ncc_id |
a unique individual identifier |
ncc_fail |
case identifier (0=control, 1=case) |
ncc_elig_co |
a count of the number of controls eligible for selection in the set |
ncc_time |
failure time of the case in the set |
ncc_pr |
the probability of being selected in the nested case-control sample |
followed by the variables specified in the match and include lists.
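One way the returned columns might be used is sketched below on a mock data.frame; the values are invented and are not output from ncc_sample. The sketch assumes that ncc_pr is constant within an individual, so that a person sampled into several sets can be reduced to one row, and it forms inverse probability weights of the kind used in weighted analyses following Samuelsen (1997).

```r
# Mock result with the columns described above (values invented)
ncc <- data.frame(
  ncc_set     = c(1, 1, 1, 2, 2, 2),
  ncc_id      = c(10, 11, 12, 13, 12, 14),
  ncc_fail    = c(1, 0, 0, 1, 0, 0),
  ncc_elig_co = c(7, 7, 7, 5, 5, 5),
  ncc_time    = c(4, 4, 4, 6, 6, 6),
  ncc_pr      = c(1, 0.5, 0.8, 1, 0.8, 0.4)
)

# Number of controls actually drawn in each case-control set
controls_per_set <- tapply(ncc$ncc_fail == 0, ncc$ncc_set, sum)

# One row per person: individual 12 was sampled as a control in both sets
people <- ncc[!duplicated(ncc$ncc_id), ]

# Inverse probability of selection weights (cases have ncc_pr = 1 here)
people$weight <- 1 / people$ncc_pr
```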
David C Muller
Samuelsen S. O. (1997). A pseudolikelihood approach to analysis of nested case-control studies. Biometrika, 84(2), 379-394. doi:10.1093/biomet/84.2.379