Smooth Good-Toulmin estimate of Δ(t), the (expected) number of new variants in a future (test) cohort that is t times as large as the training cohort
counts: vector of counts or frequencies of the observed variants.
r: unique frequencies.
N_r: frequency of frequency r, i.e. the number of variants observed exactly r times, for each value in r.
m: training cohort size.
t: positive scalar. The ratio of the future (test) cohort size to the training cohort size.
adj: logical. Should the Orlitsky et al. adjustment be used? Defaults to
Computes the original Good-Toulmin (1956) estimate of Δ(t) if t <= 1. If t > 1, the Efron-Thisted estimate (if adj = FALSE) or the Efron-Thisted estimate with the Orlitsky et al. (2016) adjustment (if adj = TRUE) is computed. Also returns an approximate standard error ("se") of the estimate as an attribute, computed using the formula provided in Efron and Thisted (1976, equation 5.2).
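For orientation, the unsmoothed Good-Toulmin estimate for t <= 1 can be written as Δ(t) = -Σ_r (-t)^r N_r. The sketch below illustrates only that formula. The helper name gt_delta_sketch and the toy numbers are hypothetical, and it is not the package's sgt_Delta(): it covers neither the t > 1 smoothing nor the standard error.

```r
# Hypothetical helper (NOT the package's sgt_Delta()): unsmoothed
# Good-Toulmin estimate for 0 < t <= 1 from frequencies of frequencies.
gt_delta_sketch <- function(r, N_r, t) {
  stopifnot(length(r) == length(N_r), t > 0, t <= 1)
  # Delta(t) = -sum_r (-t)^r * N_r = t*N_1 - t^2*N_2 + t^3*N_3 - ...
  -sum((-t)^r * N_r)
}

# Made-up toy data: 10 variants seen once, 4 seen twice, 1 seen three times
r   <- c(1, 2, 3)
N_r <- c(10, 4, 1)

gt_delta_sketch(r, N_r, t = 1)    # 10 - 4 + 1 = 7
gt_delta_sketch(r, N_r, t = 0.5)  # 5 - 1 + 0.125 = 4.125
```

The alternating signs are why the raw series becomes unstable for t > 1, which is the case the smoothed estimates described above are meant to handle.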
Either (a) counts, or (b) r and N_r must be provided.
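The two input forms carry the same information. As a rough illustration, a counts vector can be tabulated into r and N_r with base R; the toy counts here are made up, and the sketch assumes all counts are positive integers (how the function itself treats zeros is not stated here).

```r
# Made-up per-variant counts (number of subjects carrying each variant)
counts <- c(1, 1, 2, 1, 3, 2, 1)

# Tabulate how many variants share each observed count
freq_tab <- table(counts)

r   <- as.integer(names(freq_tab))  # unique frequencies: 1, 2, 3
N_r <- as.integer(freq_tab)         # frequency of each frequency: 4, 2, 1
```

Passing r and N_r directly can be convenient when only the frequency-of-frequency summary of a cohort is available.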
Good, I. J., & Toulmin, G. H. (1956). The number of new species, and the increase in population coverage, when a sample is increased. Biometrika, 43(1–2), 45–63. https://doi.org/10.1093/biomet/43.1-2.45.
Efron, B., & Thisted, R. (1976). Estimating the Number of Unseen Species: How Many Words Did Shakespeare Know? Biometrika, 63(3), 435–447. Retrieved from http://www.jstor.org/stable/2335721.
Orlitsky, A., Suresh, A. T., & Wu, Y. (2016). Optimal prediction of the number of unseen species. Proceedings of the National Academy of Sciences, 113(47), 13283–13288. https://doi.org/10.1073/pnas.1607774113
## Not run:
# load tcga data
data("tcga")
tcga <- data.table::setDT(tcga)
# calculate variant frequencies
var_freq <- tcga[,
  .(v_f = length(unique(patient_id))),
  by = .(Hugo_Symbol, Variant)
]
# calculate cohort size
m <- length(unique(tcga$patient_id))
# SGT Delta(t) estimate for t = 0.5, 1, 10
sgt_Delta(counts = var_freq$v_f, m = m, t = 0.5)
sgt_Delta(counts = var_freq$v_f, m = m, t = 1)
sgt_Delta(counts = var_freq$v_f, m = m, t = 10)
## End(Not run)