View source: R/SAT.stage1.sampling.r
This function implements the stage 1 subsampling of SAT using the SGS method.
SAT.stage1.sampling(r1, n, S, Rpar = 0.5)
r1 | pilot subsample size.
n | total sample size.
S | a binary vector of length n. Surrogate observations for all samples.
Rpar | case proportion parameter. The recommended range is (0.3, 0.6), and the default is 0.5.
For a case prevalence (i.e., P(Y=1|X)) of around 4%, values of Rpar in (0.3, 0.5) correspond to lower MSEs. To avoid estimation failures when the case prevalence is low and r1 is small, a slightly larger Rpar in (0.5, 0.6) can be used without compromising the performance of SAT. Using Rpar=0.5 is a safe choice for most situations.
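To make the role of Rpar more concrete, the following base R sketch draws a pilot subsample stratified by the surrogate. It assumes that Rpar is the fraction of the r1 pilot patients taken from the surrogate-positive group (S == 1), with the remainder taken from the surrogate-negative group; this reading is an assumption, and `sgs_sketch` is a hypothetical helper, not the code used inside the SAT package.

```r
# Hypothetical helper, not part of the SAT package.
# Assumption: Rpar is the fraction of the r1 pilot subsample taken from
# surrogate-positive patients (S == 1); the remainder comes from S == 0.
sgs_sketch <- function(r1, S, Rpar = 0.5) {
  stopifnot(all(S %in% c(0, 1)), Rpar > 0, Rpar < 1)
  pos <- which(S == 1)
  neg <- which(S == 0)
  n_pos <- min(round(r1 * Rpar), length(pos))  # cap at available positives
  n_neg <- min(r1 - n_pos, length(neg))        # cap at available negatives
  # Index via sample.int() to avoid sample()'s special case for length-1 input
  c(pos[sample.int(length(pos), n_pos)],
    neg[sample.int(length(neg), n_neg)])
}

set.seed(1)
S_demo <- rbinom(10000, 1, 0.08)  # simulated surrogate, roughly 8% positive
idx <- sgs_sketch(r1 = 400, S = S_demo, Rpar = 0.5)
length(idx)        # number of patients selected
mean(S_demo[idx])  # share of surrogate positives in the pilot subsample
```

Under this assumption, a larger Rpar shifts the pilot subsample toward surrogate positives, which is one way to see why a larger value may help when true cases are rare and r1 is small.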
The function returns a vector of indices of the patients for whom manual chart reviews are to be collected.
Liu, X., Chubak, J., Hubbard, R. A. & Chen, Y. (2021). SAT: a Surrogate Assisted Two-wave case boosting sampling method, with application to EHR-based association studies.
library(SAT)
set.seed(0)

# Simulate covariates and a binary outcome from a logistic model
n <- 1e5
beta0 <- c(1/5, 0, 0, 1/2, rep(1/2, 4))
d <- length(beta0)
X <- rnorm(n*(d-1), -1.5, 1)
X <- matrix(X, nrow = n, ncol = d - 1)
X <- cbind(1, X)  # add the intercept column
P <- 1 - 1 / (1 + exp(X %*% beta0))
Y <- rbinom(n, 1, P)

# Simulate a surrogate S with the given sensitivity and specificity
a1 <- 0.85 # sensitivity
a2 <- 0.95 # specificity
pr_s <- a1*(Y==1) + (1-a2)*(Y==0)
S <- rbinom(n, 1, pr_s)

# Draw the stage 1 (pilot) subsample
stage1.index <- SAT.stage1.sampling(r1 = 400, n = n, S = S, Rpar = 0.5)
length(stage1.index)
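Before sampling, it can be useful to confirm that the simulated data behave as intended. The sketch below regenerates data with the same model as the example, using only base R, and reports the empirical case prevalence along with the sensitivity and specificity of the surrogate. The use of `plogis()` in place of the explicit logistic formula and the condensed variable setup are choices made for this illustration, not part of the package example.

```r
set.seed(0)
n <- 1e5
beta0 <- c(1/5, 0, 0, 1/2, rep(1/2, 4))

# Design matrix with an intercept column, as in the example
X <- cbind(1, matrix(rnorm(n * (length(beta0) - 1), -1.5, 1), nrow = n))

# plogis(x) equals 1 - 1 / (1 + exp(x)), the logistic function
Y <- rbinom(n, 1, plogis(drop(X %*% beta0)))

# Surrogate with sensitivity 0.85 and specificity 0.95
S <- rbinom(n, 1, 0.85 * (Y == 1) + (1 - 0.95) * (Y == 0))

mean(Y)               # empirical case prevalence
mean(S[Y == 1])       # empirical sensitivity, target 0.85
mean(S[Y == 0] == 0)  # empirical specificity, target 0.95
mean(S)               # share of surrogate positives among all patients
```

If the empirical sensitivity or specificity is far from its target, the surrogate simulation is worth rechecking before interpreting any sampling results.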