eivreg: Errors-in-variables (EIV) linear regression
In eivtools: Measurement Error Modeling Tools


Fits errors-in-variables (EIV) linear regression given either specified reliabilities or a specified variance/covariance matrix for the measurement errors. In either case, it computes robust standard error estimates that allow for weighting and/or clustering.

eivreg(formula, data, subset, weights, na.action, method = "qr",
model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = FALSE,
contrasts = NULL, reliability = NULL, Sigma_error = NULL,
cluster_varname = NULL, df_adj = FALSE, stderr = TRUE, offset,
...)

`formula, data, subset, weights, na.action, method, model, x, y, qr`	See documentation for `lm`.
`singular.ok, contrasts, offset, ...`	See documentation for `lm`.
`reliability`	Named numeric vector giving the reliability for each error-prone covariate. If left `NULL`, `Sigma_error` must be specified.
`Sigma_error`	Named numeric matrix giving the variance/covariance matrix of the measurement errors for the error-prone covariate(s). If left `NULL`, `reliability` must be specified.
`cluster_varname`	A character variable providing the name of a variable in `data` that will be used as a clustering variable for robust standard error computation.
`df_adj`	Logical (default FALSE); if TRUE, the estimated variance/covariance matrix of the regression parameters is multiplied by `N/(N-p)`, where `N` is the number of observations used in the model fit and `p` is the number of regression parameters (including an intercept, if any).
`stderr`	Logical (default TRUE); if FALSE, does not compute estimated variance/covariance matrix of the regression parameters.
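
To make the relationship between `reliability` and `Sigma_error` concrete: under the classical model W = X + U, the reliability of W is var(X)/var(W), so a measurement error variance can be recovered as (1 - reliability) * var(W). The base-R sketch below illustrates that identity on simulated data. All variable names are illustrative, and this is not necessarily how `eivreg` performs the computation internally.

```r
## Sketch: converting between a reliability and a measurement error variance
## under the classical model W = X + U (all names here are illustrative).
set.seed(1)
n        <- 1e5
x        <- rnorm(n)                          # latent covariate, var(X) = 1
sigma2_u <- 0.25                              # true measurement error variance
w        <- x + rnorm(n, sd = sqrt(sigma2_u)) # observed error-prone covariate

## reliability implied by the simulation design: var(X) / (var(X) + var(U))
lambda <- 1 / (1 + sigma2_u)

## error variance implied by the reliability and the observed variance of W
sigma2_u_hat <- (1 - lambda) * var(w)
print(c(lambda = lambda, sigma2_u_hat = sigma2_u_hat))
```

Because var(W) is estimated from the sample, the recovered error variance is itself an estimate, which is why the two ways of specifying the measurement error can give slightly different fits.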

Theory

The EIV estimator applies when one wishes to estimate the parameters of a linear regression of Y on (X,Z), but covariates (W,Z) are instead observed, where W = X + U for mean zero measurement error U. Additional assumptions are required about U for consistent estimation; see references for details.

The standard EIV estimator of the regression coefficients is (Q'Q - S)^{-1}Q'Y, where Q is the design matrix formed from (W,Z) and S is a matrix that adjusts Q'Q to account for elements that are distorted due to measurement error. The value of S depends on whether reliability or Sigma_error is specified. When Sigma_error is specified, S is known. When reliability is specified, S must be estimated using the marginal variances of the observed error-prone covariates.
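
The estimator above can be sketched in a few lines of base R for the unweighted case with one error-prone covariate and a known error variance. The sketch assumes that S is zero everywhere except the entry for the error-prone column, where it equals N times the measurement error variance (the usual method-of-moments correction). That assumption, and all variable names, are illustrative rather than taken from the `eivreg` source.

```r
## Minimal base-R sketch of the moment-corrected estimator (Q'Q - S)^{-1} Q'Y.
set.seed(42)
n        <- 5000
x        <- rnorm(n)                          # latent covariate X
z        <- rbinom(n, 1, 0.5)                 # error-free covariate Z
y        <- 2 + 1 * x - 1 * z + rnorm(n)      # outcome from the latent regression
sigma2_u <- 0.25                              # assumed known error variance
w        <- x + rnorm(n, sd = sqrt(sigma2_u)) # observed error-prone covariate W

## design matrix built from the observed covariates (W, Z)
Q <- cbind("(Intercept)" = 1, w = w, z = z)

## correction matrix: assumed to be N * error variance in the (w, w) position
S <- matrix(0, ncol(Q), ncol(Q), dimnames = list(colnames(Q), colnames(Q)))
S["w", "w"] <- n * sigma2_u

## naive least squares versus the corrected estimator
beta_naive <- drop(solve(crossprod(Q), crossprod(Q, y)))
beta_eiv   <- drop(solve(crossprod(Q) - S, crossprod(Q, y)))
print(round(cbind(naive = beta_naive, eiv = beta_eiv), 3))
```

Setting S to zero reproduces ordinary least squares, so the correction is confined to the entries of Q'Q that involve the error-prone covariate.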

The estimated regression coefficients are solutions to a system of estimating equations, and both the system of equations and the solutions depend on whether reliability or Sigma_error is specified. For each of these two cases, standard errors for the estimated regression coefficients are computed using standard results from M-estimation; see references. For either case, adjustments for clustering are provided if specified.
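
For orientation, the generic M-estimation (sandwich) result can be written as follows. The symbols \psi, A and B are the generic notation of the M-estimation literature, not notation taken from the `eivreg` source, and the specific form of \psi for each of the two cases is not reproduced here.

```latex
\sum_{i=1}^{N} \psi(Y_i, W_i, Z_i; \hat\theta) = 0,
\qquad
\widehat{\mathrm{Var}}(\hat\theta)
  = \frac{1}{N}\, \hat A^{-1} \hat B \,\{\hat A^{-1}\}^{\top},
\qquad
\hat A = \frac{1}{N} \sum_{i=1}^{N}
  \left\{ -\frac{\partial \psi(Y_i, W_i, Z_i; \theta)}{\partial \theta^{\top}} \right\}_{\theta = \hat\theta},
\qquad
\hat B = \frac{1}{N} \sum_{i=1}^{N}
  \psi(Y_i, W_i, Z_i; \hat\theta)\, \psi(Y_i, W_i, Z_i; \hat\theta)^{\top}.
```

In cluster-robust versions of this formula, \hat B is typically built from within-cluster sums of \psi rather than from individual observations.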

Syntax Details

Exactly one of reliability or Sigma_error must be specified in the call. Sigma_error need not be diagonal in the case of correlated measurement error across multiple error-prone covariates.
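
As an illustration of a non-diagonal `Sigma_error`, the sketch below builds a named covariance matrix for two error-prone covariates with correlated measurement errors and simulates data consistent with it. The data frame `d` and the variable names are hypothetical. The call to `eivreg` runs only if eivtools is installed and uses only arguments documented on this page.

```r
## Sketch: a non-diagonal measurement error covariance matrix for two
## error-prone covariates named w1 and w2 (names are illustrative).
Sigma_error <- matrix(c(0.20, 0.05,
                        0.05, 0.30),
                      nrow = 2, byrow = TRUE,
                      dimnames = list(c("w1", "w2"), c("w1", "w2")))

## basic sanity checks: symmetric and positive definite
stopifnot(isSymmetric(Sigma_error),
          all(eigen(Sigma_error, symmetric = TRUE)$values > 0))

## simulate data whose measurement errors have covariance Sigma_error
set.seed(2)
n <- 2000
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))
U <- matrix(rnorm(n * 2), n, 2) %*% chol(Sigma_error)
d <- data.frame(w1 = X[, "x1"] + U[, 1],
                w2 = X[, "x2"] + U[, 2])
d$y <- 1 + X[, "x1"] + X[, "x2"] + rnorm(n)

## supply exactly one of Sigma_error or reliability (runs only if installed)
if (requireNamespace("eivtools", quietly = TRUE)) {
  fit <- eivtools::eivreg(y ~ w1 + w2, data = d, Sigma_error = Sigma_error)
  print(fit$coefficients)
}
```

The row and column names of the matrix match the names of the error-prone covariates in the formula, in line with the "Named numeric matrix" requirement in the argument description.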

Error-prone variables must be included as linear main effects only; the current version of the code does not allow interactions among error-prone covariates, interactions of error-prone covariates with error-free covariates, or nonlinear functions of error-prone covariates. The error-prone covariates cannot be specified with any construction involving I().
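
One way to screen a formula against these restrictions before fitting is sketched below. The helper `flag_unsupported_terms` is hypothetical and not part of eivtools. It uses only base R's `terms()` and `all.vars()` to list terms that involve an error-prone covariate in any form other than a plain main effect.

```r
## Hypothetical helper (not part of eivtools): list formula terms that use an
## error-prone covariate in anything other than a linear main effect.
flag_unsupported_terms <- function(formula, error_prone) {
  labs <- attr(terms(formula), "term.labels")
  bad  <- vapply(labs, function(lab) {
    vars <- all.vars(parse(text = lab)[[1]])   # variables used in this term
    any(vars %in% error_prone) && !(lab %in% error_prone)
  }, logical(1))
  unname(labs[bad])
}

## main effects only: nothing is flagged
print(flag_unsupported_terms(y ~ w1 + w2 + z, c("w1", "w2")))

## an interaction with z and an I() construction: both are flagged
print(flag_unsupported_terms(y ~ w1 * z + I(w2^2), c("w1", "w2")))
```

Interactions among error-free covariates are not flagged, since the restriction described above concerns only terms that involve error-prone covariates.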

The current version does not allow singular.ok=TRUE.

Users are strongly encouraged to use the data argument to pass a data frame containing all variables used in the regression, rather than placing a matrix on the right-hand side of the regression formula. In addition, if cluster_varname is specified, all variables, including the clustering variable, must be passed via data.

If weights is specified, a weighted version of the EIV estimator is computed using operations analogous to weighted least squares in linear regression, and a standard error for this weighted estimator is computed. Weights must be positive and will be normalized inside the function to sum to the number of observations used to fit the model. Cases with missing weights will get dropped just like cases with missing covariates.
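
The weight handling described above amounts to a simple rescaling, sketched here in base R with made-up weights. This illustrates the stated behavior (drop cases with missing weights, then scale so the weights sum to the number of observations used) and is not code from the package.

```r
## Sketch: rescaling positive weights to sum to the number of observations used
raw_weights <- c(2, 0.5, 1, 4, NA, 2.5)   # illustrative; one weight is missing
keep        <- !is.na(raw_weights)        # cases with missing weights are dropped
wts         <- raw_weights[keep]
stopifnot(all(wts > 0))                   # weights must be positive

N        <- length(wts)                   # observations used in the fit
norm_wts <- wts * N / sum(wts)            # normalized weights
print(sum(norm_wts))                      # equals N
```

Because only the relative sizes of the weights matter after this rescaling, multiplying all weights by a constant leaves the normalized weights unchanged.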

Different software packages that compute robust standard errors make different choices about degrees-of-freedom adjustments intended to improve small-sample coverage properties. The df_adj argument will inflate the estimated variance/covariance matrix of the estimated regression coefficients by N/(N-p); see Wooldridge (2002, p. 57). In addition, if cluster_varname is specified, the estimated variance/covariance matrix will be inflated by M/(M-1) where M is the number of unique clusters present in the estimation sample.
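
The size of these inflation factors is easy to compute directly. The sketch below uses illustrative values of N, p and M and a made-up diagonal variance matrix. Whether and how the two factors are combined inside `eivreg` is not shown here, so they are applied separately.

```r
## Sketch: the two inflation factors described above, for illustrative values
N <- 1000   # observations used in the fit
p <- 4      # regression parameters, including the intercept
M <- 50     # unique clusters in the estimation sample

df_factor      <- N / (N - p)   # applied when df_adj = TRUE
cluster_factor <- M / (M - 1)   # applied when cluster_varname is specified
print(round(c(df_factor = df_factor, cluster_factor = cluster_factor), 4))

## effect on standard errors for a made-up unadjusted variance matrix
V     <- diag(c(0.010, 0.008, 0.009, 0.025))
se    <- sqrt(diag(V))
se_df <- sqrt(diag(V * df_factor))
print(round(cbind(unadjusted = se, df_adjusted = se_df), 5))
```

With N large relative to p the first factor is close to 1, while the cluster factor matters most when the number of clusters is small.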

A list object of class eivlm with the following components:

`coefficients`	Estimated regression coefficients from EIV model.
`residuals`	Residuals from fitted EIV model.
`rank`	Column rank of regression design matrix.
`fitted.values`	Fitted values from EIV model.
`N`	Number of observations used in fitted model.
`Sigma_error`	The measurement error covariance matrix, if supplied.
`reliability`	The vector of reliabilities, if supplied.
`relnames`	The names of the error-prone covariates.
`XpX_adj`	The cross-product matrix of the regression, adjusted for measurement error.
`varYXZ`	The maximum likelihood estimate of the covariance matrix of the outcome Y, the latent covariates X and the observed, error-free covariates Z.
`latent_resvar`	A degrees-of-freedom adjusted estimate of the residual variance of the latent regression. NOTE: this is not an estimate of the residual variance of the regression on the observed covariates (W,Z), but rather an estimate of the residual variance of the regression on (X,Z).
`vcov`	The estimated variance/covariance matrix of the regression coefficients.
`cluster_varname, cluster_values, cluster_num`	If `cluster_varname` is specified, it is returned in the object, along with `cluster_values` providing the actual values of the clustering variable for the cases used in the fitted model, and `cluster_num`, the number of unique such clusters.
`OTHER`	The object also includes components `assign`, `df.residual`, `xlevels`, `call`, `terms`, `model` and other optional components such as `weights`, depending on the call; see `lm`. In addition, the object includes components `unadj_coefficients`, `unadj_fitted.values`, `unadj_residuals`, `unadj_effects`, and `unadj_qr` that are computed from the unadjusted regression model that ignores measurement error; see `lm`.

J.R. Lockwood jrlockwood@ets.org modified the lm function to adapt it for EIV regression.

Carroll R.J., Ruppert D., Stefanski L.A. and Crainiceanu C.M. (2006). Measurement Error in Nonlinear Models: A Modern Perspective (2nd edition). London: Chapman & Hall.

Fuller W. (2006). Measurement Error Models (2nd edition). New York: John Wiley & Sons.

Stefanski L.A. and Boos D.D. (2002). “The calculus of M-estimation,” The American Statistician 56(1):29-38.

Wooldridge J. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, MA: MIT Press.

lm, summary.eivlm, deconv_npmle

set.seed(1001)
## simulate data with covariates x1, x2 and z.
.n    <- 1000
.d    <- data.frame(x1 = rnorm(.n))
.d$x2 <- sqrt(0.5)*.d$x1 + rnorm(.n, sd=sqrt(0.5))
.d$z  <- as.numeric(.d$x1 + .d$x2 > 0)

## generate outcome
## true regression parameters are c(2,1,1,-1)
.d$y  <- 2.0 + 1.0*.d$x1 + 1.0*.d$x2 - 1.0*.d$z + rnorm(.n)

## generate error-prone covariates w1 and w2
Sigma_error <- diag(c(0.20, 0.30))
dimnames(Sigma_error) <- list(c("w1","w2"), c("w1","w2"))
.d$w1 <- .d$x1 + rnorm(.n, sd = sqrt(Sigma_error["w1","w1"]))
.d$w2 <- .d$x2 + rnorm(.n, sd = sqrt(Sigma_error["w2","w2"]))

## fit EIV regression specifying known measurement error covariance matrix
.mod1 <- eivreg(y ~ w1 + w2 + z, data = .d, Sigma_error = Sigma_error)
print(class(.mod1))
.tmp <- summary(.mod1)
print(class(.tmp))
print(.tmp)

## fit EIV regression specifying known reliabilities.  Note that the
## point estimator is slightly different from .mod1 because
## the correction matrix S must be estimated when reliability,
## rather than Sigma_error, is specified.
.lambda <- c(1,1) / (c(1,1) + diag(Sigma_error))
.mod2 <- eivreg(y ~ w1 + w2 + z, data = .d, reliability = .lambda)
print(summary(.mod2))

Loading required package: R2jags
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.3.0
Loaded modules: basemod,bugs

Attaching package: ‘R2jags’

The following object is masked from ‘package:coda’:

    traceplot

[1] "eivlm"
[1] "summary.eivlm"

Call:
eivreg(formula = y ~ w1 + w2 + z, data = .d, Sigma_error = Sigma_error)

Error Covariance Matrix
    w1  w2
w1 0.2 0.0
w2 0.0 0.3

Residuals:
    Min      1Q  Median      3Q     Max 
-4.1245 -0.8115  0.0029  0.8321  3.4119 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  2.01952    0.09076  22.250  < 2e-16 ***
w1           1.17119    0.08219  14.250  < 2e-16 ***
w2           1.02401    0.08792  11.647  < 2e-16 ***
z           -1.16054    0.15803  -7.344 4.32e-13 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Number of observations used: 1000 
Latent residual standard deviation: 0.9547 
Latent R-squared: 0.7006, (df-adjusted: 0.6994)

EIV-Adjusted vs Unadjusted Coefficients:
            Adjusted Unadjusted
(Intercept)    2.020     1.5446
w1             1.171     0.8532
w2             1.024     0.6447
z             -1.161    -0.2111


Call:
eivreg(formula = y ~ w1 + w2 + z, data = .d, reliability = .lambda)

Reliability:
    w1     w2 
0.8333 0.7692 

Residuals:
    Min      1Q  Median      3Q     Max 
-4.0520 -0.7997  0.0259  0.8106  3.2725 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  1.96106    0.08870  22.110  < 2e-16 ***
w1           1.16843    0.07692  15.189  < 2e-16 ***
w2           0.93516    0.07853  11.909  < 2e-16 ***
z           -1.04324    0.15482  -6.738 2.71e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Number of observations used: 1000 
Latent residual standard deviation: 0.9811 
Latent R-squared: 0.6838, (df-adjusted: 0.6825)

EIV-Adjusted vs Unadjusted Coefficients:
            Adjusted Unadjusted
(Intercept)   1.9611     1.5446
w1            1.1684     0.8532
w2            0.9352     0.6447
z            -1.0432    -0.2111

eivtools documentation built on May 1, 2019, 9:52 p.m.
