fxnCC_LinReg: Constraint maximum likelihood (CML) method for linear...

Description Usage Arguments Value References Examples

View source: R/Rpackage_Tian.R

Description

Constraint maximum likelihood (CML) method for linear regression (continuous outcome Y)

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
fxnCC_LinReg(
  p,
  q,
  YInt,
  XInt,
  BInt,
  betaHatExt,
  gammaHatInt,
  n,
  tol,
  maxIter,
  factor
)

Arguments

p

total number of X covariates including the intercept (i.e. p=ncol(X)+1)

q

total number of covariates including the intercept (i.e. q=ncol(X)+ncol(B)+1)

YInt

Outcome vector

XInt

X covariates that are used in the external models - Do not include intercept

BInt

Newly added B covariates that are not included in the external models

betaHatExt

External parameter estimates of the reduced model

gammaHatInt

Full model parameter estimates using the internal data only

n

internal data sample size

tol

convergence criteria e.g. 1e-6

maxIter

Maximum number of iterations to reach convergence e.g. 400

factor

the step-halving factor between 0 and 1, if factor=1 then newton-raphson method; decrease if algorithm cannot converge given the maximum iterations

Value

gammaHat, in the order (intercept, XInt, BInt)

References

Chatterjee, N., Chen, Y.-H., P.Maas and Carroll, R. J. (2016). Constrained maximum likelihood estimation for model calibration using summary-level information from external big data sources. Journal of the American Statistical Association 111, 107-117.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
# Full model: Y|X, B
# Reduced model: Y|X
# X,B follows normal distribution with mean zero, variance one and correlation 0.3
# Y|X, B follows N(-1-0.5X+0.5B, 1)
set.seed(2333)
n = 800
data.n = data.frame(matrix(ncol = 3, nrow = n))
colnames(data.n) = c('Y', 'X', 'B')
data.n[,c('X', 'B')] = MASS::mvrnorm(n, rep(0,2), diag(0.7,2)+0.3)
data.n$Y = rnorm(n, -1 - 0.5*data.n$X + 0.5*data.n$B, 1)

# Generate the beta estimates from the external reduced model:
# generate a data of size m from the full model first, then fit the reduced regression 
# to obtain the beta estiamtes and the corresponsing estimated variance 
m = 30000
data.m = data.frame(matrix(ncol = 3, nrow = m))
names(data.m) = c('Y', 'X', 'B')
data.m[,c('X', 'B')] = MASS::mvrnorm(m, rep(0,2), diag(0.7,2)+0.3)
data.m$Y = rnorm(m, -1 - 0.5*data.m$X + 0.5*data.m$B, 1)

#fit Y|X to obtain the external beta estimates, save the beta estiamtes and the 
# corresponsing estimated variance 
fit.E = lm(Y ~ X, data = data.m)
beta.E = coef(fit.E)
names(beta.E) = c('int', 'X')
V.E = vcov(fit.E)

#get full model estimate from direct regression using the internal data only
fit.gamma.I = lm(Y ~ X + B, data = data.n)
gamma.I = coef(fit.gamma.I)

#Get CML estimates
gamma.CML = fxnCC_LinReg(p=2, 
                         q=3, 
                         YInt=data.n$Y, 
                         XInt=data.n[,'X'], 
                         BInt=data.n[,'B'], 
                         betaHatExt=beta.E, 
                         gammaHatInt=gamma.I, 
                         n=nrow(data.n), 
                         tol=1e-8, 
                         maxIter=400,
                         factor=1)[["gammaHat"]]

MetaIntegration documentation built on March 18, 2021, 1:06 a.m.