View source: R/km_type_weights.R
compute_km_weights | R Documentation |
Compute Kaplan-Meier type weights for (matched) nested case-control (NCC) sample
compute_km_weights(
  cohort = NULL,
  ncc = NULL,
  id_name = NULL,
  risk_table_manual = NULL,
  t_start_name = NULL,
  t_name = NULL,
  sample_stat = NULL,
  t_match_name = t_name,
  y_name = NULL,
  match_var_names = NULL,
  n_per_case,
  return_risk_table = FALSE,
  km_names = c("km_prob", "km_weight")
)
cohort |
Cohort data with at least the following information on each
subject: start time (if not 0 for all subjects) and end time of follow-up,
censoring status and matching variables (if any). A |
ncc |
(Matched) NCC data, if |
id_name |
Name of the column indicating subject ID in |
risk_table_manual |
Number of subjects at risk at the time of each case in
the NCC, if |
t_start_name |
Name of the variable in |
t_name |
Name of the variable in |
sample_stat |
A numeric vector containing sampling and status
information for each subject in |
t_match_name |
Name of the column of event time in each matched set in
|
y_name |
Name of the column of censoring status in |
match_var_names |
Name(s) of the match variable(s) in |
n_per_case |
Number of controls matched to each case. |
return_risk_table |
Whether the risk table should be returned. Default
is |
km_names |
Column names for the KM-type probability (the first element)
and weight (the second element) computed, if these two columns are to be
attached to each subject in the input data. Default is
|
When the full cohort is not available, computing the correct weights for
each sampled control in the NCC sample requires the actual time of event or
censoring of each subject in the NCC sample, which should be specified as
t_name in the input. Since the number of subjects in each risk set is
supplied separately (i.e., as n_at_risk) in such a scenario, t_match_name is
required to map each control to the appropriate risk set. t_match_name may
be the same as t_name if the exact risk set is available in n_at_risk, but
when the full cohort is not available the risk set is usually approximated
by using a coarsened version of t_name. For example, when controls were
drawn from a population registry by matching on the exact date of death of
cases, birth cohort and gender, the number at risk may be approximated by
the population size in the year of the event, in the same birth cohort and
of the same gender. In this scenario t_match_name would be the year of
t_name.
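The coarsening step described above can be sketched in base R. This is only an illustration under stated assumptions: the data frame ncc_toy, its columns (id, set_time, t, fail) and the new column t_match_year are hypothetical names, and times are assumed to be recorded as decimal calendar years. It shows how a year-level matching time could be derived, not how the risk table itself must be laid out.

```r
# Hypothetical 1:1 NCC sample with times in decimal calendar years.
# set_time is the event time of the case in each matched set;
# t is each subject's own time of event or censoring.
ncc_toy <- data.frame(
  id       = c(1, 2, 3, 4),
  set_time = c(2001.20, 2001.20, 2003.84, 2003.84),
  t        = c(2001.20, 2004.50, 2003.84, 2005.05),
  fail     = c(1, 0, 1, 0)
)

# Coarsen the matched-set event time to the calendar year, so that each
# matched set can be linked to a year-level count of subjects at risk.
ncc_toy$t_match_year <- floor(ncc_toy$set_time)

ncc_toy
```

Under these assumptions, "t_match_year" would play the role of t_match_name, while "t" would still be supplied as t_name so that the exact follow-up time of each subject is retained.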
compute_risk_table
library(SamplingDesignTools)
# Load mini cohort
data("mini_cohort")
mini_cohort
# Manually prepare 1:1 NCC data
ncc <- rbind(
  data.frame(Set = 1, Map = c(1, 5), Time = mini_cohort$t[1],
             Fail = c(1, 0), t = mini_cohort$t[c(1, 5)]),
  data.frame(Set = 2, Map = c(3, 4), Time = mini_cohort$t[3],
             Fail = c(1, 0), t = mini_cohort$t[c(3, 4)]),
  data.frame(Set = 3, Map = c(4, 10), Time = mini_cohort$t[4],
             Fail = c(1, 0), t = mini_cohort$t[c(4, 10)]),
  data.frame(Set = 4, Map = c(6, 7), Time = mini_cohort$t[6],
             Fail = c(1, 0), t = mini_cohort$t[c(6, 7)]),
  data.frame(Set = 5, Map = c(8, 9), Time = mini_cohort$t[8],
             Fail = c(1, 0), t = mini_cohort$t[c(8, 9)]),
  data.frame(Set = 6, Map = c(9, 10), Time = mini_cohort$t[9],
             Fail = c(1, 0), t = mini_cohort$t[c(9, 10)])
)
rownames(ncc) <- NULL
ncc
# Map the NCC sample to the original cohort, break the matching, identify the
# subjects selected into the NCC, and return this subset with KM type weights
# computed for them.
# First create the sampling and status indicator:
sample_stat <- numeric(nrow(mini_cohort))
sample_stat[unique(ncc$Map[ncc$Fail == 0])] <- 1
sample_stat[ncc$Map[ncc$Fail == 1]] <- 2
# Then find the sampled subset and compute weights:
ncc_nodup <- compute_km_weights(
  cohort = mini_cohort, t_name = "t", y_name = "status",
  sample_stat = sample_stat, n_per_case = 1
)
ncc_nodup
# Alternatively, if the cohort is not available, the weights can be computed
# as long as the number of subjects at risk at event times in each stratum is
# available elsewhere, and the actual time of event/censoring is available
# for each subject in the NCC.
# Compute the number of subjects at risk from mini_cohort:
risk_table <- compute_risk_table(cohort = mini_cohort, t_name = "t",
                                 y_name = "status")
risk_table
# The following command computes the same weights as in ncc_nodup:
ncc_nodup_v2 <- compute_km_weights(
  ncc = ncc[, -1], risk_table_manual = risk_table, id_name = "Map",
  t_match_name = "Time", t_name = "t", y_name = "Fail", n_per_case = 1
)
ncc_nodup_v2
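For intuition about what a KM-type inclusion probability and weight represent, the following base R sketch applies the commonly cited Kaplan-Meier type formula for unmatched NCC sampling: a non-case's probability of ever being sampled is one minus the product, over event times at which the subject was at risk, of 1 - m / (n - 1), where m is the number of controls per case and n is the number at risk. This is an assumption about the general technique, not a statement of how compute_km_weights is implemented; the data frame toy and the names km_prob_sketch and km_weight_sketch are hypothetical, and no late entry or matching variables are considered.

```r
# Hypothetical toy cohort: follow-up time and event indicator.
toy <- data.frame(t = c(2, 3, 5, 7, 8), status = c(1, 0, 1, 0, 0))
m <- 1  # assumed number of controls sampled per case

# Distinct event times and the number of subjects at risk at each of them
# (everyone is assumed to enter follow-up at time 0).
event_times <- sort(unique(toy$t[toy$status == 1]))
n_at_risk <- sapply(event_times, function(s) sum(toy$t >= s))

# Inclusion probability: 1 for cases; for non-cases, one minus the product
# over event times at which the subject was still at risk.
km_prob_sketch <- sapply(seq_len(nrow(toy)), function(i) {
  if (toy$status[i] == 1) return(1)
  eligible <- event_times <= toy$t[i]
  1 - prod(1 - m / (n_at_risk[eligible] - 1))
})

# Weight is the inverse of the inclusion probability.
km_weight_sketch <- 1 / km_prob_sketch

data.frame(toy, km_prob_sketch, km_weight_sketch)
```

In this sketch a subject censored before the first event time would have probability 0 and an infinite weight; such subjects can never be sampled as controls, so they would not appear in an NCC sample.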