SAT.estimation: SAT estimation based on pooled subsample


View source: R/SAT.estimation.r

Description

This function computes the final SAT estimate from the data pooled across the two sampling stages (pilot and second stage). The estimate is obtained by fitting a weighted logistic regression to the pooled subsample.
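As a generic illustration of what a weighted logistic regression on an unequal-probability subsample looks like, the sketch below uses base R's `glm` on simulated data. It is not the package's internal code: the sampling probabilities, the inverse-probability weights, and all variable names are assumptions made for this example, and the page does not state how SAT constructs its weights.

```r
# Generic illustration only; not the SAT package's internal code.
set.seed(1)
n <- 5000
x <- rnorm(n)
# Simulate a binary outcome from a logistic model (intercept -2, slope 1)
y <- rbinom(n, 1, plogis(-2 + x))

# Assumed sampling scheme: cases are sampled more often than controls
p_sample <- ifelse(y == 1, 0.8, 0.2)
keep <- rbinom(n, 1, p_sample) == 1

# Assumed weights: inverse of each sampled unit's selection probability
sub <- data.frame(y = y[keep], x = x[keep], w = 1 / p_sample[keep])

# quasibinomial yields the same point estimates as binomial and avoids
# the warning that binomial emits for non-integer weights
fit <- glm(y ~ x, family = quasibinomial(), data = sub, weights = w)
coef(fit)
```

With weights of this kind, the fitted coefficients target the full-population model even though cases are over-represented in the subsample.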

Usage

SAT.estimation(S, X, beta.pilot, stage1.index, stage2.index, stage1.weights,
               stage1.y, stage2.y, method = "SAT-S")

Arguments

S

a binary vector of length n. Surrogate observations for all samples.

X

a matrix of dimension n times p (the first column needs to be 1). The covariate matrix contains observations for all n samples.

beta.pilot

the pilot estimator.

stage1.index

a vector of length r1. The indices of the patients selected in pilot (first-stage) sampling.

stage2.index

a vector of length r. The indices of the patients selected in second-stage sampling.

stage1.weights

a vector of weights for patients who are selected in pilot sampling.

stage1.y

a binary vector of length r1. The manual chart review results for patients in stage1.index.

stage2.y

a binary vector of length r. The manual chart review results for patients in stage2.index.

method

the estimation method, either "SAT-S" (the default) or "SAT-cY".
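The shape requirements listed above can be checked before calling the function. The following is a minimal sketch of such a check; `check_sat_inputs` is a hypothetical helper written for this page, not part of the SAT package, and it only tests what the argument descriptions state (it does not check `stage1.weights` or `beta.pilot`, whose lengths are not documented here).

```r
# Hypothetical helper, not part of the SAT package.
check_sat_inputs <- function(S, X, stage1.index, stage2.index,
                             stage1.y, stage2.y, method = "SAT-S") {
  X <- as.matrix(X)
  n <- length(S)
  stopifnot(
    all(S %in% c(0, 1)),                        # S is binary
    nrow(X) == n,                               # X has one row per sample
    all(X[, 1] == 1),                           # first column is the intercept
    length(stage1.y) == length(stage1.index),   # both of length r1
    length(stage2.y) == length(stage2.index),   # both of length r
    all(stage1.y %in% c(0, 1)),
    all(stage2.y %in% c(0, 1)),
    all(c(stage1.index, stage2.index) %in% seq_len(n)),
    method %in% c("SAT-S", "SAT-cY")
  )
  invisible(TRUE)
}

# Toy inputs, made up for illustration
S_toy <- c(1, 0, 0, 1, 0, 1)
X_toy <- cbind(1, c(0.2, 1.5, -0.3, 0.8, 1.1, -1.0))
ok <- check_sat_inputs(S_toy, X_toy,
                       stage1.index = c(1, 3), stage2.index = c(2, 4, 6),
                       stage1.y = c(1, 0), stage2.y = c(0, 1, 1))
```

The helper stops with an error at the first violated condition and returns `TRUE` invisibly otherwise.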

Value

The function returns the final SAT estimates of the association coefficients. In the example below, they are accessed as the SAT.estimate element of the returned object.

References

Liu, X., Chubak, J., Hubbard, R. A. & Chen, Y. (2021). SAT: a Surrogate Assisted Two-wave case boosting sampling method, with application to EHR-based association studies.

Examples

library(SAT)
set.seed(0)

colnames(lung_cancer)
X <- cbind(1, lung_cancer[,3:5])
Y <- lung_cancer[,1]
S <- lung_cancer[,2]

# pilot sampling
stage1.index <- SAT.stage1.sampling(r1 = 400, n = 1e5, S, Rpar = 0.5)
# true phenotype collection
stage1.y <- Y[stage1.index]

# second stage sampling
stage2 <- SAT.stage2.sampling(r1 = 400, n = 1e5, S, Rpar = 0.5, r = 800,
                              stage1.index, stage1.y, X, method = "SAT-cY")
# true phenotype collection
stage2.y <- Y[stage2$stage2.index]

# final estimation
SAT.est <- SAT.estimation(S, X,
                          beta.pilot = stage2$beta.pilot,
                          stage1.index = stage1.index,
                          stage2.index = stage2$stage2.index,
                          stage1.weights = stage2$stage1.weights,
                          stage1.y = stage1.y, stage2.y = stage2.y,
                          method = "SAT-cY")
SAT.est$SAT.estimate

Penncil/SAT documentation built on Dec. 18, 2021, 7:38 a.m.