eba | R Documentation |
eba is used to perform extreme bounds analysis (EBA), a global sensitivity test that examines the robustness of the association between a dependent variable and a variety of possible determinants. The eba function performs a demanding version of EBA, proposed by Leamer (1985), that focuses on the upper and lower extreme bounds of regression estimates, as well as a more flexible version proposed by Sala-i-Martin (1997). Sala-i-Martin's EBA considers the entire distribution of regression coefficients. For Sala-i-Martin's version of extreme bounds analysis, eba estimates results for both the normal model (in which regression coefficients are assumed to be normally distributed across models) and the generic model (where no such assumption is made).
eba(formula = NULL, data,
    y = NULL, free = NULL, doubtful = NULL, focus = NULL,
    k = 0:3, mu = 0, level = 0.95, vif = NULL, exclusive = NULL,
    draws = NULL, reg.fun = lm, se.fun = NULL, include.fun = NULL,
    weights = NULL, ...)
formula |
a formula that specifies the EBA model that the function will run. Most generally, the formula is of the following format: |
data |
a data frame containing the variables used in the extreme bounds analysis. |
y |
a character string that specifies the dependent variable. |
free |
a character vector that specifies the 'free' variables to be used in the analysis. These variables are included in each regression model. |
doubtful |
a character vector that specifies the 'doubtful' variables to be used in the analysis. These variables will be included, in various combinations, in the estimated regression models. |
focus |
a character vector that specifies the 'focus' variables of the extreme bounds analysis. These are the variables whose robustness the user wants to test. The focus variables must be a subset of the variables included in the argument |
k |
a vector of integers that specifies the number of doubtful variables that will be included in each estimated regression model in addition to the focus variable. Following Levine and Renelt (1992), the default is set to |
mu |
a named vector of numeric values that specifies regression coefficients under the null hypothesis. The names of the vector's elements indicate which variable the null hypothesis coefficients belong to. These null hypothesis coefficient values will be used in all hypothesis testing. Alternatively, the argument |
level |
a numeric value between 0 and 1 that indicates the confidence level to be used in determining the robustness/fragility of determinants. |
vif |
a numeric value that sets the maximum limit on a coefficient's variance inflation factor (VIF), a rule-of-thumb indicator of multicollinearity. Only coefficient estimates whose VIF does not exceed the limit will be considered in the analysis. If |
exclusive |
a list of character vectors, or a formula with sets of mutually exclusive variables separated by |
draws |
a positive integer value that specifies how many regressions |
reg.fun |
a function that estimates the desired regression model. The function must accept arguments |
se.fun |
a function that calculates the standard errors for regression coefficient estimates. The function must accept the regression model object as its first argument, and must return a numeric vector with element names that identify the corresponding regressors. |
include.fun |
a function that determines whether the results from a particular regression model will be included in the analysis. The function must accept the regression model object as its first argument, and must return a logical value. Only regression models for which the function returns a value of TRUE will be included in the extreme bounds analysis. |
weights |
a character string or a function that specifies what weights will be applied to the results from each estimated regression model. The default value of |
... |
additional arguments that will be passed on to the regression function specified by |
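The contracts described for se.fun and include.fun can be sketched in base R. The helper names robust_se and keep_good_fit below are hypothetical, the HC0 estimator and the 0.5 R-squared cutoff are illustrative choices rather than anything the package prescribes, and the final eba call only runs if the ExtremeBounds package is installed.

```r
# Hypothetical se.fun: HC0 heteroskedasticity-robust standard errors for an lm fit.
# Takes the model object as its first argument and returns a named numeric vector.
robust_se <- function(model, ...) {
  X <- model.matrix(model)          # n x p design matrix
  e <- residuals(model)             # length-n residual vector
  bread <- solve(crossprod(X))      # (X'X)^{-1}
  meat <- crossprod(X * e)          # sum of e_i^2 * x_i x_i'
  se <- sqrt(diag(bread %*% meat %*% bread))
  names(se) <- colnames(X)          # names identify the regressors
  se
}

# Hypothetical include.fun: keep only models whose R-squared is at least 0.5.
# Takes the model object as its first argument and returns a single logical value.
keep_good_fit <- function(model, ...) {
  isTRUE(summary(model)$r.squared >= 0.5)
}

# Check both helpers on an ordinary lm fit
fit <- lm(mpg ~ wt + hp, data = mtcars)
print(robust_se(fit))
print(keep_good_fit(fit))

# Pass them to eba (assumes the ExtremeBounds package is available)
if (requireNamespace("ExtremeBounds", quietly = TRUE)) {
  res <- ExtremeBounds::eba(formula = mpg ~ wt | hp + gear | cyl + disp + drat + qsec,
                            data = mtcars, k = 0:2,
                            se.fun = robust_se, include.fun = keep_good_fit)
  print(res)
}
```

Testing the helpers on a single fitted model first, as above, makes it easier to confirm that the return values have the required shape before running many regressions.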
If the argument focus is NULL, it is populated by the content of doubtful. Conversely, if doubtful is NULL, it will be filled in with values from focus. It is thus sufficient to specify only one of doubtful or focus to test the robustness of all doubtful variables.
The character strings in arguments y, free, doubtful, focus and exclusive can contain model formula operators described in formula (such as :, *, ^, %in%), as well as the function I. In addition, the variables in character strings can be enclosed within other functions: "log(x)", for instance, represents the natural logarithm of x.
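To illustrate what such character strings mean, the sketch below assembles a few of them into an ordinary model formula and fits it with lm. This shows how base R interprets the terms. It is not a description of how eba processes them internally, and the variable choices are arbitrary.

```r
# Character-string terms of the kind that y, free, doubtful and focus accept
response  <- "log(mpg)"
rhs_terms <- c("wt", "log(hp)", "I(disp / 100)")

# Assemble the strings into a formula and fit a single regression
model_formula <- as.formula(paste(response, "~", paste(rhs_terms, collapse = " + ")))
fit <- lm(model_formula, data = mtcars)

print(model_formula)
print(names(coef(fit)))
```

Wrapping arithmetic in I(), as in "I(disp / 100)", keeps the division from being read as a formula operator.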
The summary object obtained from the regression function specified in argument reg.fun should contain a coefficients matrix component. eba will collect the coefficient estimates, standard errors, test statistics and p-values from the first, second, third and fourth columns of the coefficients matrix, respectively. The number of observations is equal to length(x$residuals), where x is the regression model object.
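For the default reg.fun = lm, the layout described above can be inspected directly. The following minimal sketch uses only base R and an arbitrary model on mtcars. A custom regression function would need to produce a summary object with the same four-column structure.

```r
fit <- lm(mpg ~ wt + hp, data = mtcars)

# The coefficients matrix of the summary object: one row per regressor,
# columns holding estimate, standard error, test statistic and p-value
coef_matrix <- summary(fit)$coefficients
print(colnames(coef_matrix))

estimates  <- coef_matrix[, 1]
std_errors <- coef_matrix[, 2]
statistics <- coef_matrix[, 3]
p_values   <- coef_matrix[, 4]

# Number of observations, computed as described in the text
n_obs <- length(fit$residuals)
print(n_obs)
```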
The calculation of weights based on McFadden's likelihood ratio index (see argument weights) relies on the generic accessor function logLik. If weights are based on the regression's R-squared and adjusted R-squared, eba obtains the values of these statistics from the model object's components r.squared and adj.r.squared, respectively.
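The quantities mentioned here can be computed by hand in base R. The sketch below assumes the usual definition of McFadden's likelihood ratio index, one minus the ratio of the fitted model's log-likelihood to that of an intercept-only model. It is an illustration rather than the package's internal code. For lm fits, base R exposes r.squared and adj.r.squared through the summary object.

```r
# McFadden's likelihood ratio index for a logit model, via logLik()
logit_fit  <- glm(am ~ hp + wt, data = mtcars, family = binomial)
logit_null <- glm(am ~ 1, data = mtcars, family = binomial)
lri <- 1 - as.numeric(logLik(logit_fit)) / as.numeric(logLik(logit_null))
print(lri)

# R-squared and adjusted R-squared for a linear model
ols_summary <- summary(lm(mpg ~ wt + hp, data = mtcars))
r2     <- ols_summary$r.squared
adj_r2 <- ols_summary$adj.r.squared
print(c(r.squared = r2, adj.r.squared = adj_r2))
```

A regression function whose model objects do not support logLik would not be usable with likelihood-based weights.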
eba returns an object of class "eba". The corresponding summary function (i.e., summary.eba) returns the same object.
An object of class "eba" is a list containing the following components:
bounds |
a data frame with the results of the extreme bounds analysis. The data frame
|
call |
the matched call. |
coefficients |
a list that contains data frames with selected quantities of interest that emerge from the extreme bounds analysis. This list can also be extracted by calling the generic accessor function
|
mu |
a named vector of regression coefficients under the null hypothesis for each variable. |
level |
a number between 0 and 1 that indicates the confidence level for hypothesis testing. |
ncomb |
total number of doubtful variable combinations that include at least one focus variable. |
nreg |
total number of regressions that were estimated as part of the extreme bounds analysis. When |
nreg.variable |
a named vector containing the number of estimated regressions that included each variable. |
ncoef.variable |
a named vector containing the number of estimated coefficients that were used in the extreme bounds analysis. This number can differ from |
regressions |
a list that contains estimation results for each regression that was run as part of the extreme bounds analysis. This list contains several components which store quantities such as coefficient or standard error estimates for each of the estimated regressions. Each of these components is a matrix whose number of rows corresponds to the total number of regressions (equal to
|
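Because the returned object is a list, its components can be inspected with the usual $ operator. The sketch below reuses the call from this page's example and only runs if the ExtremeBounds package is installed. The vector component_names is a hypothetical helper that simply lists the component names documented above.

```r
# Component names as documented above (hypothetical helper vector)
component_names <- c("bounds", "call", "coefficients", "mu", "level",
                     "ncomb", "nreg", "nreg.variable", "ncoef.variable",
                     "regressions")

if (requireNamespace("ExtremeBounds", quietly = TRUE)) {
  eba.results <- ExtremeBounds::eba(
    formula = mpg ~ wt | hp + gear | cyl + disp + drat + qsec + vs + am + carb,
    data = mtcars[1:10, ], exclusive = ~ cyl + disp + hp | am + gear)

  print(class(eba.results))                          # class of the returned object
  print(component_names %in% names(eba.results))     # which documented components exist
  print(eba.results$nreg)                            # number of estimated regressions
  print(eba.results$level)                           # confidence level used
  print(head(eba.results$bounds))                    # first rows of the results data frame
}
```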
Hlavac, Marek (2016). ExtremeBounds: Extreme Bounds Analysis in R. Journal of Statistical Software, 72(9), 1-22. doi: 10.18637/jss.v072.i09.
Marek Hlavac < mhlavac at alumni.princeton.edu >
Research Fellow, Central European Labour Studies Institute (CELSI), Bratislava, Slovakia
McFadden, Daniel L. (1974). Conditional Logit Analysis of Qualitative Choice Behavior. In: P. Zarembka (Ed.), Frontiers in Econometrics, Academic Press: New York, 105-142.
Leamer, Edward E. (1985). Sensitivity Analyses Would Help. American Economic Review, 75(3), 308-313.
Levine, Ross, and David Renelt. (1992). A Sensitivity Analysis of Cross-Country Growth Regressions. American Economic Review, 82(4), 942-963.
Sala-i-Martin, Xavier. (1997). I Just Ran Two Million Regressions. American Economic Review, 87(2), 178-183. doi:10.3386/w6252.
hist.eba, print.eba
# perform Extreme Bounds Analysis
eba.results <- eba(formula = mpg ~ wt | hp + gear | cyl + disp + drat + qsec + vs + am + carb,
                   data = mtcars[1:10, ], exclusive = ~ cyl + disp + hp | am + gear)
# The same result can be achieved by running:
# eba.results <- eba(data = mtcars[1:10, ], y = "mpg", free = "wt",
#                    doubtful = c("cyl", "disp", "hp", "drat", "qsec",
#                                 "vs", "am", "gear", "carb"),
#                    focus = c("hp", "gear"),
#                    exclusive = list(c("cyl", "disp", "hp"),
#                                     c("am", "gear")))
# print out results
print(eba.results)
# create histograms
hist(eba.results, variables = c("hp", "gear"),
     main = c("hp" = "Gross horsepower", "gear" = "Number of forward gears"))