Description Usage Arguments Value References Examples
View source: R/Rpackage_Tian.R
Constraint maximum likelihood (CML) method for logistic regression (binary outcome Y)
| 1 2 3 4 5 6 7 8 9 10 11 12 13 | fxnCC_LogReg(
  p,
  q,
  YInt,
  XInt,
  BInt,
  betaHatExt,
  gammaHatInt,
  n,
  tol,
  maxIter,
  factor
)
 | 
| p | total number of X covariates including the intercept (i.e. p=ncol(X)+1) | 
| q | total number of covariates including the intercept (i.e. q=ncol(X)+ncol(B)+1) | 
| YInt | Outcome vector | 
| XInt | X covariates that are used in the external models - Do not include intercept | 
| BInt | Newly added B covariates that are not included in the external models | 
| betaHatExt | External parameter estimates of the reduced model | 
| gammaHatInt | Full model parameter estimates using the internal data only | 
| n | internal data sample size | 
| tol | convergence criteria e.g. 1e-6 | 
| maxIter | Maximum number of iterations to reach convergence e.g. 400 | 
| factor | the step-halving factor between 0 and 1, if factor=1 then newton-raphson method; decrease if algorithm cannot converge given the maximum iterations | 
gammaHat, in the order (intercept, XInt, BInt)
Chatterjee, N., Chen, Y.-H., P.Maas and Carroll, R. J. (2016). Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources. Journal of the American Statistical Association 111, 107-117.
| 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 | # Full model: Y|X, B
# Reduced model: Y|X
# X,B follows normal distribution with mean zero, variance one and correlation 0.3
# Y|X, B follows N(-1-0.5X+0.5B, 1)
set.seed(2333)
n = 1000
data.n = data.frame(matrix(ncol = 3, nrow = n))
colnames(data.n) = c('Y', 'X', 'B')
data.n[,c('X', 'B')] = MASS::mvrnorm(n, rep(0,2), diag(0.7,2)+0.3)
data.n$Y = rbinom(n, 1, expit(-1 - 0.5*data.n$X + 0.5*data.n$B))
# Generate the beta estimates from the external reduced model:
# generate a data of size m from the full model first, then fit the reduced regression 
# to obtain the beta estiamtes and the corresponsing estimated variance 
m = 30000
data.m = data.frame(matrix(ncol = 3, nrow = m))
names(data.m) = c('Y', 'X', 'B')
data.m[,c('X', 'B')] = MASS::mvrnorm(m, rep(0,2), diag(0.7,2)+0.3)
data.m$Y = rbinom(m, 1, expit(-1 - 0.5*data.m$X + 0.5*data.m$B))
#fit Y|X to obtain the external beta estimates, save the beta estiamtes and the 
# corresponsing estimated variance 
fit.E = glm(Y ~ X, data = data.m, family = binomial(link='logit'))
beta.E = coef(fit.E)
names(beta.E) = c('int', 'X')
V.E = vcov(fit.E)
#get full model estimate from direct regression using the internal data only
fit.gamma.I = glm(Y ~ X + B, data = data.n, family = binomial(link='logit'))
gamma.I = coef(fit.gamma.I)
#Get CML estimates
gamma.CML = fxnCC_LogReg(p=2, 
                         q=3, 
                         YInt=data.n$Y, 
                         XInt=data.n$X, 
                         BInt=data.n[,'B'], 
                         betaHatExt=beta.E, 
                         gammaHatInt=gamma.I, 
                         n=nrow(data.n), 
                         tol=1e-8, 
                         maxIter=400,
                         factor=1)[["gammaHat"]]
 | 
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.