SamplingDesignTools: Tools for Dealing with Complex Sampling Designs

compute_kmw_cohort

R Documentation

<Private function> Compute KM-type weights for NCC sample given full cohort

Description

<Private function> Compute KM-type weights for NCC sample given full cohort

Usage

compute_kmw_cohort(
  cohort,
  t_start_name = NULL,
  t_name,
  sample_stat,
  match_var_names = NULL,
  n_per_case,
  return_risk_table = FALSE,
  km_names = c(".km_prob", ".km_weight")
)

Arguments

`cohort`	Cohort data with at least the following information on each subject: start time (if not 0 for all subjects) and end time of follow-up, censoring status and matching variables (if any). A `data.frame` or a matrix with column names.
`t_start_name`	Name of the variable in `cohort` for the start time of follow-up. A `string`. Default is `NULL`, meaning all subjects start follow-up at time 0.
`t_name`	Name of the variable in `cohort` for the event or censoring time. A `string`.
`sample_stat`	A numeric vector containing sampling and status information for each subject in `cohort`: use 0 for non-sampled controls, 1 for sampled (and kept) controls, and integers >=2 for events. The length of this vector must be the same as the number of rows in `cohort`.
`match_var_names`	Name(s) of the match variable(s) in `cohort` used when drawing the NCC. A `string` vector. Default is `NULL`, i.e., the NCC was only time-matched.
`n_per_case`	Number of controls matched to each case.
`return_risk_table`	Whether the risk table should be returned. Default is `FALSE`.
`km_names`	Column names for the computed KM-type probability (first element) and weight (second element), used when these two columns are attached to each subject in the input data. Default is `c(".km_prob", ".km_weight")`.
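
The coding expected for `sample_stat` can be illustrated with a small sketch. The toy data frame and its column names (`id`, `t`, `event`, `sampled`) are hypothetical and chosen only for illustration; they are not part of the package.

```r
# Hypothetical toy cohort; column names are illustrative assumptions.
cohort <- data.frame(
  id      = 1:6,
  t       = c(2, 4, 5, 7, 8, 10),
  event   = c(0, 1, 0, 1, 0, 0),
  sampled = c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE)
)

# Coding described above: 0 = non-sampled control,
# 1 = sampled (and kept) control, 2 = event.
sample_stat <- ifelse(cohort$event == 1, 2, ifelse(cohort$sampled, 1, 0))

# The vector must have one entry per row of the cohort.
stopifnot(length(sample_stat) == nrow(cohort))
```

Here every event is coded as 2; the documentation allows any integer >= 2 for events.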

Value

If `return_risk_table = FALSE` (the default), returns the subcohort of sampled subjects with the appropriate KM-type probability and weight attached to each subject. If `return_risk_table = TRUE`, returns a list containing this subcohort (`dat`) and the risk table (`risk_table`), which is a `data.frame` containing the distinct event times (`t_event`), the matching variables (if any), and the number of subjects at risk at each event time in each stratum defined by the matching variables (`n_at_risk`).
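
To make the risk table and the KM-type quantities concrete, the following is a minimal base-R sketch of the commonly used KM-type inclusion probability for a time-matched NCC. It is not the package's implementation. It assumes no matching variables, follow-up starting at 0 for everyone, no tied event times, and that events receive probability 1. The toy data and all object names are hypothetical.

```r
# Hypothetical toy cohort: follow-up time and event indicator.
cohort <- data.frame(
  t     = c(2, 4, 5, 7, 8, 10),
  event = c(0, 1, 0, 1, 0, 0)
)
n_per_case <- 1

# Risk table: distinct event times and number of subjects still at risk.
t_event <- sort(unique(cohort$t[cohort$event == 1]))
risk_table <- data.frame(
  t_event   = t_event,
  n_at_risk = sapply(t_event, function(s) sum(cohort$t >= s))
)

# KM-type probability that a subject is ever sampled as a control:
# 1 - prod over event times at which the subject is at risk of
# (1 - n_per_case / (n_at_risk - 1)). Events are assigned probability 1.
km_prob <- sapply(seq_len(nrow(cohort)), function(i) {
  if (cohort$event[i] == 1) return(1)
  at_risk <- risk_table$t_event <= cohort$t[i]
  p_t <- pmin(1, n_per_case / (risk_table$n_at_risk[at_risk] - 1))
  1 - prod(1 - p_t)
})

# The weight is the inverse probability. A subject never at risk at any
# event time has probability 0 and can never appear in the NCC sample.
km_weight <- 1 / km_prob
```

In this sketch the weights are computed for the whole toy cohort; the function described here attaches them only to the sampled subjects.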