calreg | R Documentation |
Perform a linear regression on an unobserved exposure (X) using a proxy (Z) whose relationship with the exposure has been studied using an external dataset.
calreg(
formula,
data,
fitter = "lm",
calibration,
method = "delta",
n.impute = 50
)
formula |
formula for the linear model. |
data |
[data.frame] dataset used to fit the linear model relating the outcome (Y) to the (unobserved) exposure (X). |
calibration |
a |
method |
[character] Can be "delta" or "MI". |
n.impute |
[integer, >0] Number of imputed datasets to be used. Only relevant when method="MI". |
Consider a first sample (X_i,Z_i)_{i \in \{1,\ldots,m\}}
that is used to estimate \alpha
in:
X = f(\alpha,Z) + \varepsilon_{\alpha}
This is the model to give to the argument calibration.
The aim is to use a second sample (Y_j,Z_j)_{j \in \{1,\ldots,n\}}
to estimate \beta_1
in:
Y = \beta_0 + \beta_1 X + \varepsilon_{\beta}
The formula of this model should be given to the argument formula and the dataset to the argument data.
The exposure \(X\) in the second sample is computed either:
based on the conditional expectation of the exposure given the proxy from the first model (method="delta").
based on multiple sampling of the coefficients from the first model (method="MI"). For each sampled set of coefficients, an exposure is computed and a linear model is fitted using this exposure. The results are then pooled using mice::pool.
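The two strategies above can be sketched in base R for a linear calibration model. This is only an illustration under stated assumptions, not the internal code of calreg: the data are simulated, all object names (calib, Xhat, est, within, ...) are hypothetical, and the pooling step applies Rubin's rules by hand as a stand-in for mice::pool.

```r
## Sketch only: base R illustration of the two strategies, not calreg's internals.
set.seed(1)
m <- 200  # size of the first sample (X, Z)
n <- 200  # size of the second sample (Y, Z)

## First sample: calibration model X = a0 + a1 * Z + error
Z1 <- runif(m, -1, 2)
X1 <- 0.5 + 1.0 * Z1 + rnorm(m, sd = 0.5)
calib <- lm(X1 ~ Z1)

## Second sample: X2 is simulated but treated as unobserved (true beta_1 = 2)
Z2 <- runif(n, -1, 2)
X2 <- 0.5 + 1.0 * Z2 + rnorm(n, sd = 0.5)
Y2 <- 1 + 2 * X2 + rnorm(n)

## (a) Plug-in: replace X by its conditional expectation given Z
Xhat <- predict(calib, newdata = data.frame(Z1 = Z2))
fit.plugin <- lm(Y2 ~ Xhat)
beta.plugin <- coef(fit.plugin)[["Xhat"]]

## (b) Multiple sampling of the calibration coefficients
n.impute <- 50
alpha.hat <- coef(calib)
U <- chol(vcov(calib))        # upper triangular, t(U) %*% U equals vcov(calib)
est <- numeric(n.impute)      # slope estimate for each draw
within <- numeric(n.impute)   # variance of the slope within each draw
for (k in seq_len(n.impute)) {
  ## draw coefficients from N(alpha.hat, vcov(calib))
  alpha.k <- alpha.hat + drop(t(U) %*% rnorm(length(alpha.hat)))
  X.k <- unname(alpha.k[1] + alpha.k[2] * Z2)
  fit.k <- lm(Y2 ~ X.k)
  est[k] <- coef(fit.k)[["X.k"]]
  within[k] <- vcov(fit.k)["X.k", "X.k"]
}

## Rubin's rules: pooled estimate and total variance (within + between)
beta.mi <- mean(est)
var.mi <- mean(within) + (1 + 1 / n.impute) * var(est)
```

In this sketch the between-draw term of var.mi plays the role of the calibration uncertainty, while mean(within) reflects the finite size of the second sample.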
When using the delta method, the uncertainty is decomposed into two parts:
one related to the finite number of observations in the second sample.
one related to the estimation of the parameters in the calibration model, to account for the fact that \(X\) is estimated and not observed.
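A decomposition of this kind can be illustrated numerically. The sketch below is an assumption about the general technique (a first-order delta method with a finite-difference Jacobian), not a description of how calreg computes its output; the simulated data and the names beta.of.alpha, V.sample, V.calib and J are hypothetical.

```r
## Sketch only: numerical delta method for a two-stage linear fit (base R).
set.seed(2)
m <- 200
n <- 200
Z1 <- runif(m, -1, 2)
X1 <- 0.5 + Z1 + rnorm(m, sd = 0.5)
Z2 <- runif(n, -1, 2)
Y2 <- 1 + 2 * (0.5 + Z2 + rnorm(n, sd = 0.5)) + rnorm(n)

calib <- lm(X1 ~ Z1)
alpha.hat <- coef(calib)

## Second-stage coefficients seen as a function of the calibration parameters
beta.of.alpha <- function(alpha) {
  Xhat <- alpha[[1]] + alpha[[2]] * Z2
  unname(coef(lm(Y2 ~ Xhat)))
}

## Part 1: usual lm variance, treating the computed exposure as if observed
Xhat <- alpha.hat[[1]] + alpha.hat[[2]] * Z2
V.sample <- unname(vcov(lm(Y2 ~ Xhat)))

## Part 2: J %*% vcov(calib) %*% t(J), with J obtained by central differences
## (row i, column k of J approximates d beta_i / d alpha_k)
h <- 1e-5
J <- sapply(seq_along(alpha.hat), function(k) {
  step <- replace(numeric(length(alpha.hat)), k, h)
  (beta.of.alpha(alpha.hat + step) - beta.of.alpha(alpha.hat - step)) / (2 * h)
})
V.calib <- J %*% vcov(calib) %*% t(J)

## Total variance and standard errors accounting for both sources
V.total <- V.sample + V.calib
se <- sqrt(diag(V.total))
```

Here V.sample corresponds to the first source of uncertainty and V.calib to the second; comparing sqrt(diag(V.sample)) with se shows how much the calibration step inflates the standard errors.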
A data.frame containing the estimates, standard errors, confidence intervals and p-values for each regression coefficient.
The output has an attribute "regression" containing the fitted linear model (ignoring the uncertainty related to the calibration) and an attribute "var.add" containing the additional variance-covariance matrix due to the calibration.
library(lava)
n <- 1e2
## linear case
mSim.lin <- lvm(fMRI ~ occ, occ ~ blood)
distribution(mSim.lin, ~blood) <- uniform.lvm(-0.9,2)
set.seed(10)
d1.lin <- sim(mSim.lin, n = n)[,c("occ","blood"),drop=FALSE]
d2.lin <- sim(mSim.lin, n = n)[,c("fMRI","blood"),drop=FALSE]
e1.lin <- lm(occ ~ blood, data = d1.lin)
res.lin <- calreg(fMRI ~ occ, data = d2.lin, calibration = e1.lin)
res.lin
summary(attr(res.lin, "regression"))$coef
## non-linear case
mSim.nlin <- lvm(fMRI ~ occ, occ[mu:0.1] ~ 0*blood)
distribution(mSim.nlin, ~blood) <- uniform.lvm(-0.9,3)
constrain(mSim.nlin, mu~blood) <- function(x){2*x/(1+x)}
set.seed(10)
d1.nlin <- sim(mSim.nlin, n = n)[,c("occ","blood"),drop=FALSE]
d2.nlin <- sim(mSim.nlin, n = n)[,c("fMRI","blood"),drop=FALSE]
## gg <- ggplot(d1.nlin, aes(x = blood)) + geom_point(aes(y = occ))
e1.nlin <- nls(occ ~ (occmax * blood)/(EC + blood), data = d1.nlin,
start = list(occmax = 1, EC = 1))
d1.nlin$fit <- fitted(e1.nlin)
## gg + geom_line(data = d1.nlin, aes(y = fit), color = "red")
res.nlin <- calreg(fMRI ~ occ, data = d2.nlin, calibration = e1.nlin)
res.nlin
summary(attr(res.nlin, "regression"))$coef