robust.coef — R Documentation
This function computes heteroscedasticity-consistent standard errors and
significance values for linear models estimated by using the lm()
function and generalized linear models estimated by using the glm()
function. For linear models the heteroscedasticity-robust F-test is computed
as well. By default, the function uses the HC4 estimator.
robust.coef(model, type = c("HC0", "HC1", "HC2", "HC3", "HC4", "HC4m", "HC5"),
digits = 3, p.digits = 4, write = NULL, append = TRUE, check = TRUE,
output = TRUE)
model
a fitted model of class "lm" or "glm".

type
a character string specifying the estimation type, i.e., one of "HC0", "HC1", "HC2", "HC3", "HC4", "HC4m", or "HC5" (see Details). By default, the HC4 estimator is used.

digits
an integer value indicating the number of decimal places to be used for displaying results. Note that information criteria and the chi-square test statistic are printed with digits minus 1 decimal places.

p.digits
an integer value indicating the number of decimal places to be used for displaying p-values.

write
a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or an Excel file with file extension ".xlsx" (e.g., "Output.xlsx").

append
logical: if TRUE (default), output is appended to an existing text file specified in the write argument; if FALSE, an existing file is overwritten.

check
logical: if TRUE (default), argument specification is checked.

output
logical: if TRUE (default), output is shown on the console.
The family of heteroscedasticity-consistent (HC) standard errors estimator for the model parameters of a regression model is based on an HC covariance matrix of the parameter estimates and does not require the assumption of homoscedasticity. HC estimators approach the correct value with increasing sample size, even in the presence of heteroscedasticity. On the other hand, the OLS standard error estimator is biased and does not converge to the proper value when the assumption of homoscedasticity is violated (Darlington & Hayes, 2017).
White (1980) introduced the idea of the HC covariance matrix to econometricians and derived the asymptotically justified form known as HC0 (Long & Ervin, 2000). Simulation studies have shown that the HC0 estimator tends to underestimate the true variance in small to moderately large samples (N ≤ 250) and in the presence of leverage observations, which leads to an inflated type I error risk (e.g., Cribari-Neto & Lima, 2014). The alternative estimators HC1 to HC5 are asymptotically equivalent to HC0 but include finite-sample corrections, which gives them superior small-sample properties; the corresponding weighting schemes are summarized below.
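As a reference, this summary follows the cited literature (White, 1980; Long & Ervin, 2000; Cribari-Neto, 2004) rather than the original help page. Let \hat{u}_i denote the OLS residuals, h_{ii} the hat values, n the number of observations, and k the number of model parameters. The HC covariance matrix has the sandwich form

\widehat{V} = (X^\top X)^{-1} X^\top \mathrm{diag}(\omega_i)\, X\, (X^\top X)^{-1}

where the estimators differ only in the weights \omega_i:

HC0: \omega_i = \hat{u}_i^2
HC1: \omega_i = \frac{n}{n - k}\, \hat{u}_i^2
HC2: \omega_i = \frac{\hat{u}_i^2}{1 - h_{ii}}
HC3: \omega_i = \frac{\hat{u}_i^2}{(1 - h_{ii})^2}
HC4: \omega_i = \frac{\hat{u}_i^2}{(1 - h_{ii})^{\delta_i}}, \quad \delta_i = \min\{4,\; n h_{ii} / k\}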
Long and Ervin (2000) recommended using the HC3 estimator routinely, regardless of the outcome of a heteroscedasticity test. However, the HC3 estimator can be unreliable when the data contain leverage observations. The HC4 estimator, in contrast, performs well in small samples, in the presence of high-leverage observations, and when errors are not normally distributed (Cribari-Neto, 2004). In summary, the HC4 estimator appears to perform best in terms of controlling the type I and type II error risk (Rosopa et al., 2013). Contrary to the findings of Cribari-Neto et al. (2007), however, the HC5 estimator showed no substantial advantage over HC4; both estimators performed similarly across all simulation conditions considered by Ng and Wilcox (2009).
Note that the F-test of significance on the multiple correlation coefficient R also assumes homoscedasticity of the errors. Violations of this assumption can result in a hypothesis test that is either liberal or conservative, depending on the form and severity of the heteroscedasticity.
Hayes and Cai (2007) argued that using an HC estimator instead of assuming homoscedasticity provides researchers with more confidence in the validity and statistical power of inferential tests in regression analysis. Hence, the HC3 or HC4 estimator should be used routinely when estimating regression models. If an HC estimator is not used as the default method of standard error estimation, researchers are advised to at least double-check the results with an HC estimator to ensure that conclusions are not compromised by heteroscedasticity. Note, however, that the presence of heteroscedasticity suggests that the data are not adequately described by a statistical model of estimated conditional means. Unless heteroscedasticity is believed to be caused solely by measurement error in the predictor variable(s), it should serve as a warning to the researcher regarding the adequacy of the estimated model.
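To make the mechanics concrete, the following is a minimal R sketch (not part of the misty package) that builds the HC4 covariance matrix by hand and checks it against sandwich::vcovHC(). The model and data (mtcars) are purely illustrative.

# illustrative lm() fit
mod <- lm(mpg ~ wt + hp, data = mtcars)

X <- model.matrix(mod)          # design matrix
u <- residuals(mod)             # OLS residuals
h <- hatvalues(mod)             # leverage values h_ii
n <- nrow(X); k <- ncol(X)

delta <- pmin(4, n * h / k)     # HC4 discount exponent (Cribari-Neto, 2004)
omega <- u^2 / (1 - h)^delta    # HC4 weights

XtX.inv <- solve(crossprod(X))  # (X'X)^-1
V.hc4 <- XtX.inv %*% t(X) %*% diag(omega) %*% X %*% XtX.inv

sqrt(diag(V.hc4))                               # manual HC4 standard errors
sqrt(diag(sandwich::vcovHC(mod, type = "HC4"))) # should match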
Returns an object of class misty.object, which is a list with the following entries:

call
function call

type
type of analysis

model
model specified in the model argument

args
specification of function arguments

result
list with result tables, i.e., the regression coefficients with heteroscedasticity-consistent standard errors and, for linear models, the heteroscedasticity-robust F-test
This function is based on the vcovHC function from the sandwich package (Zeileis, Köll, & Graham, 2020) and the functions coeftest and waldtest from the lmtest package (Zeileis & Hothorn, 2002).
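Accordingly, results comparable to those of robust.coef() can be obtained directly from these building blocks. A hedged sketch, again using an illustrative mtcars model:

library(sandwich)
library(lmtest)

mod <- lm(mpg ~ wt + hp, data = mtcars)

# coefficient t-tests with HC4 standard errors
coeftest(mod, vcov. = vcovHC(mod, type = "HC4"))

# heteroscedasticity-robust F-test against the intercept-only model
waldtest(mod, update(mod, . ~ 1),
         vcov = function(m) vcovHC(m, type = "HC4"), test = "F")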
Takuya Yanagida takuya.yanagida@univie.ac.at
Cribari-Neto, F. (2004). Asymptotic inference under heteroskedasticity of unknown form. Computational Statistics & Data Analysis, 45, 215-233. https://doi.org/10.1016/S0167-9473(02)00366-3
Cribari-Neto, F., & Lima, M. G. (2014). New heteroskedasticity-robust standard errors for the linear regression model. Brazilian Journal of Probability and Statistics, 28, 83-95.
Cribari-Neto, F., Souza, T., & Vasconcellos, K. L. P. (2007). Inference under heteroskedasticity and leveraged data. Communications in Statistics - Theory and Methods, 36, 1877-1888. https://doi.org/10.1080/03610920601126589
Darlington, R. B., & Hayes, A. F. (2017). Regression analysis and linear models: Concepts, applications, and implementation. The Guilford Press.
Hayes, A. F., & Cai, L. (2007). Using heteroscedasticity-consistent standard error estimators in OLS regression: An introduction and software implementation. Behavior Research Methods, 39, 709-722. https://doi.org/10.3758/BF03192961
Long, J. S., & Ervin, L. H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. The American Statistician, 54, 217-224. https://doi.org/10.1080/00031305.2000.10474549
Ng, M., & Wilcox, R. R. (2009). Level robust methods based on the least squares regression estimator. Journal of Modern Applied Statistical Methods, 8, 384-395. https://doi.org/10.22237/jmasm/1257033840
Rosopa, P. J., Schaffer, M. M., & Schroeder, A. N. (2013). Managing heteroscedasticity in general linear models. Psychological Methods, 18(3), 335-351. https://doi.org/10.1037/a0032553
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817-838. https://doi.org/10.2307/1912934
Zeileis, A., & Hothorn, T. (2002). Diagnostic checking in regression relationships. R News, 2(3), 7–10. http://CRAN.R-project.org/doc/Rnews/
Zeileis, A., Köll, S., & Graham, N. (2020). Various versatile variances: An object-oriented implementation of clustered covariances in R. Journal of Statistical Software, 95(1), 1-36. https://doi.org/10.18637/jss.v095.i01
std.coef, write.result
dat <- data.frame(x1 = c(3, 2, 4, 9, 5, 3, 6, 4, 5, 6, 3, 5),
x2 = c(1, 4, 3, 1, 2, 4, 3, 5, 1, 7, 8, 7),
x3 = c(0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1),
y1 = c(2, 7, 4, 4, 7, 8, 4, 2, 5, 1, 3, 8),
y2 = c(0, 1, 0, 2, 0, 1, 0, 0, 1, 2, 1, 0))
#-------------------------------------------------------------------------------
# Example 1: Linear model
mod1 <- lm(y1 ~ x1 + x2 + x3, data = dat)
robust.coef(mod1)
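# Example 1b: Linear model, requesting a different estimator
# via the 'type' argument
robust.coef(mod1, type = "HC3")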
#-------------------------------------------------------------------------------
# Example 2: Generalized linear model
mod2 <- glm(y2 ~ x1 + x2 + x3, data = dat, family = poisson())
robust.coef(mod2)
## Not run:
#----------------------------------------------------------------------------
# Write Results
# Example 3a: Write Results into a text file
robust.coef(mod1, write = "Robust_Coef.txt", output = FALSE)
# Example 3b: Write Results into an Excel file
robust.coef(mod1, write = "Robust_Coef.xlsx", output = FALSE)
result <- robust.coef(mod1, output = FALSE)
write.result(result, "Robust_Coef.xlsx")
## End(Not run)