cchsData | R Documentation |
A case–cohort dataset where the subcohort was selected by stratified simple random sampling. This is an artificial dataset that was made from nwtco, a real dataset from the National Wilms Tumor Study (NWTS). It is designed for demonstrating the use of cchs.
id | An ID number. |
localHistol | Result of the histology from the local institution. |
centralLabHistol | Result of the histology from the central laboratory. |
stage | Stage of the cancer (I, II, III, or IV). |
study | The study (NWTS-3 or NWTS-4). For details see this NWTS webpage (archived copy). |
isCase | Indicator for whether this participant had a relapse or not. |
time | Number of days from diagnosis of Wilms tumor to relapse or censoring. |
ageAtDiagnosis | Age in years at diagnosis of Wilms tumor. |
inSubcohort | Indicator for whether this participant is in the subcohort or not. |
sampFrac | The sampling fraction for the stratum that contains this participant. |
The nwtco data is from two clinical trials but can be regarded as cohort data. cchsData can be created from it by running the code in the Source section below, which is partly based on the Examples section of the cch documentation.
Two strata are used for the subcohort-selection, corresponding to the two values of localHistol. The sampling fraction is 5% for the stratum defined by localHistol="favorable" and 20% for the stratum defined by localHistol="unfavorable". After the subcohort is selected, the sampling fractions are recalculated using the exact integer numbers of participants in the subcohort and the full cohort, and then stored in the data-frame.
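As a rough illustration of that recalculation, the following base-R sketch redoes the arithmetic for the two strata. It assumes the full-cohort stratum sizes are 3622 (favorable) and 406 (unfavorable), which are the values this documentation quotes for cohortStratumSizes. The variable names are illustrative only and are not part of the dataset.

```r
# Assumed full-cohort stratum sizes (the cohortStratumSizes values quoted
# in this documentation).
stratumCohortSizes <- c(favorable = 3622, unfavorable = 406)

# Intended (nominal) sampling fractions for the two strata.
intendedFractions <- c(favorable = 0.05, unfavorable = 0.2)

# The subcohort in each stratum must contain a whole number of participants,
# so the intended size is rounded to the nearest integer.
stratumSubcohortSizes <- round(intendedFractions * stratumCohortSizes)

# Exact sampling fractions: subcohort size divided by full-cohort size.
exactFractions <- stratumSubcohortSizes / stratumCohortSizes

print(stratumSubcohortSizes)  # 181 favorable, 81 unfavorable
print(exactFractions)
```

Under these assumptions the exact fractions (181/3622 and 81/406) are slightly below the intended 5% and 20%, because in both strata rounding to a whole number of participants rounds down.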
As an alternative to the sampling fractions, the stratum sizes in the full cohort could be used. A suitable value for the cohortStratumSizes argument to cchs would be c(favorable=3622, unfavorable=406). This can be worked out by entering table(nwtco$instit, useNA="always") and noting that for nwtco$instit and nwtco$histol, a value of 1 means “favorable histology result” and 2 means “unfavorable”. This coding is not stated in the nwtco documentation but can be deduced from the line in the cch examples that contains labels=c("FH","UH"), or by comparing the output of the table command with the numbers in Table 1 of Breslow & Chatterjee (1999).
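The check described above might look like the following sketch. It assumes the survival package, which provides nwtco, is installed. The coding 1 = favorable, 2 = unfavorable is the deduction explained above rather than something documented for nwtco, and localHistol here is just a local helper variable.

```r
library(survival)  # provides the nwtco data frame

# Count participants by local-institution histology code, including any NAs.
print(table(nwtco$instit, useNA = "always"))

# Relabel the codes (assumed: 1 = favorable, 2 = unfavorable) and build a
# named vector in the form suggested for the cohortStratumSizes argument.
localHistol <- factor(nwtco$instit, labels = c("favorable", "unfavorable"))
cohortStratumSizes <- c(table(localHistol))  # c() turns the table into a named vector

print(cohortStratumSizes)
```

If the deduction about the coding is right, the result should agree with c(favorable=3622, unfavorable=406).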
For information about the two clinical trials, NWTS-3 and NWTS-4, see D'Angio et al. (1989) and Green et al. (1998) respectively, or the National Wilms Tumor Study website (archived copy).
# Starting with nwtco, rename variables, convert some to factors, drop
# in.subcohort (which is used elsewhere for a different simulated dataset), etc.
library(survival, quietly=TRUE)
cchsData <- data.frame(
   id = nwtco$seqno,
   localHistol = factor(nwtco$instit, labels=c("favorable", "unfavorable")),
   centralLabHistol = factor(nwtco$histol, labels=c("favorable", "unfavorable")),
   stage = factor(nwtco$stage, labels=c("I", "II", "III", "IV")),
   study = factor(nwtco$study, labels=c("NWTS-3", "NWTS-4")),
   isCase = as.logical(nwtco$rel),
   time = nwtco$edrel,
   ageAtDiagnosis = nwtco$age / 12 # nwtco$age is in months
)

# Define the intended sampling fractions for the two strata.
samplingFractions <- c(favorable=0.05, unfavorable=0.2)

# Select participants/rows to be in the subcohort by stratified simple random
# sampling.
cchsData$inSubcohort <- rep(FALSE, nrow(cchsData))
set.seed(1)
for (stratumName in levels(cchsData$localHistol)) {
   inThisStratum <- cchsData$localHistol == stratumName
   stratumSubcohortSize <-
      round(samplingFractions[stratumName] * sum(inThisStratum))
   rowsToSetTrue <- sample(which(inThisStratum), size=stratumSubcohortSize)
   cchsData$inSubcohort[rowsToSetTrue] <- TRUE
}

# Change the sampling fractions to their exact values.
stratumSubcohortSizes <- table(cchsData$localHistol[cchsData$inSubcohort])
stratumCohortSizes <- table(cchsData$localHistol)
samplingFractions <- stratumSubcohortSizes / stratumCohortSizes
samplingFractions <- c(samplingFractions) # make it a vector, not a table

# Keep only the cases and the subcohort.
cchsData <- cchsData[cchsData$isCase | cchsData$inSubcohort,]

# Put the sampling fraction in each row of the data-frame.
cchsData$sampFrac <-
   samplingFractions[match(cchsData$localHistol, names(samplingFractions))]
Note: doi links are shown where these pass CRAN checks and appear correctly in the PDF reference manual. In other cases, URLs are shown.
Breslow, N.E., Chatterjee, N. (1999). Design and analysis of two-phase studies with binary outcome applied to Wilms tumour prognosis. Journal of the Royal Statistical Society: Series C (Applied Statistics) 48 (4), 457–468. https://doi.org/10.1111/1467-9876.00165
D'Angio, G.J., Breslow, N., Beckwith, J.B., Evans, A., Baum, E., Delorimier, A., Fernbach, D., Hrabovsky, E., Jones, B., Kelalis, P., Othersen, H.B., Tefft, M., Thomas, P.R.M. (1989). Treatment of Wilms' tumor: Results of the third National Wilms' Tumor Study. Cancer 64 (2), 349–360. https://doi.org/bc95fv
Green, D.M., Breslow, N.E., Beckwith, J.B., Finklestein, J.Z., Grundy, P.E., Thomas, P.R., Kim, T., Shochat, S.J., Haase, G.M., Ritchey, M.L., Kelalis, P.P., D'Angio, G.J. (1998). Comparison between single-dose and divided-dose administration of dactinomycin and doxorubicin for patients with Wilms' tumor: a report from the National Wilms' Tumor Study Group. Journal of Clinical Oncology 16 (1), 237–245. https://doi.org/10.1200/JCO.1998.16.1.237