sfacross | R Documentation |
sfacross
is a symbolic formula-based function for the
estimation of stochastic frontier models in the case of cross-sectional or
pooled cross-sectional data, using maximum (simulated) likelihood - M(S)L.
The function accounts for heteroscedasticity in both the one-sided and two-sided error terms as in Reifschneider and Stevenson (1991), Caudill and Ford (1993), Caudill et al. (1995) and Hadri (1999), and for heterogeneity in the mean of the pre-truncated distribution as in Kumbhakar et al. (1991), Huang and Liu (1994) and Battese and Coelli (1995).
Ten distributions are possible for the one-sided error term and eleven optimization algorithms are available.
The truncated normal - normal distribution with scaling property as in Wang and Schmidt (2002) is also implemented.
sfacross(
formula,
muhet,
uhet,
vhet,
logDepVar = TRUE,
data,
subset,
weights,
wscale = TRUE,
S = 1L,
udist = "hnormal",
scaling = FALSE,
start = NULL,
method = "bfgs",
hessianType = 1L,
simType = "halton",
Nsim = 100,
prime = 2L,
burn = 10,
antithetics = FALSE,
seed = 12345,
itermax = 2000,
printInfo = FALSE,
tol = 1e-12,
gradtol = 1e-06,
stepmax = 0.1,
qac = "marquardt"
)
## S3 method for class 'sfacross'
print(x, ...)
## S3 method for class 'sfacross'
bread(x, ...)
## S3 method for class 'sfacross'
estfun(x, ...)
formula |
A symbolic description of the model to be estimated based on
the generic function formula. |
muhet |
A one-part formula to consider heterogeneity in the mean of the pre-truncated distribution (see section ‘Details’). |
uhet |
A one-part formula to consider heteroscedasticity in the one-sided error variance (see section ‘Details’). |
vhet |
A one-part formula to consider heteroscedasticity in the two-sided error variance (see section ‘Details’). |
logDepVar |
Logical. Informs whether the dependent variable is logged
(TRUE) or not (FALSE). Default = TRUE. |
data |
The data frame containing the data. |
subset |
An optional vector specifying a subset of observations to be used in the optimization process. |
weights |
An optional vector of weights to be used for weighted
log-likelihood. Should be |
wscale |
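Logical. When weights is specified and wscale = TRUE (default), the weights are rescaled so that they sum to the sample size (see section ‘Details’). |
S |
If S = 1 (default), a production (profit) frontier is estimated. If S = -1, a cost frontier is estimated. |
udist |
Character string. Distribution of the one-sided error term. Default =
"hnormal". |
scaling |
Logical. Only when udist = "tnormal" and scaling = TRUE is the scaling property model estimated. Default = FALSE. |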
start |
Numeric vector. Optional starting values for the maximum likelihood (ML) estimation. |
method |
Optimization algorithm used for the estimation. Default =
"bfgs". |
hessianType |
Integer. If |
simType |
Character string. If |
Nsim |
Number of draws for MSL. Default 100. |
prime |
Prime number considered for Halton and Generalized-Halton
draws. Default = 2. |
burn |
Number of the first observations discarded in the case of Halton
draws. Default = 10. |
antithetics |
Logical. Default = FALSE. If TRUE, antithetic counterparts of the uniform draws are computed (see section ‘Details’). |
seed |
Numeric. Seed for the random draws. Default = 12345. |
itermax |
Maximum number of iterations allowed for optimization.
Default = 2000. |
printInfo |
Logical. Print information during optimization. Default =
FALSE. |
tol |
Numeric. Convergence tolerance. Default = 1e-12. |
gradtol |
Numeric. Convergence tolerance for gradient. Default =
1e-06. |
stepmax |
Numeric. Step max for |
qac |
Character. Quadratic Approximation Correction for |
x |
an object of class sfacross (returned by the function
sfacross). |
... |
additional arguments of frontier are passed to sfacross; additional arguments of the print, bread, estfun, nobs methods are currently ignored. |
The stochastic frontier model for the cross-sectional data is defined as:
y_i = \alpha + \mathbf{x_i^{\prime}}\bm{\beta} + v_i - Su_i
with
\epsilon_i = v_i -Su_i
where i is the observation, y is the output (cost, revenue, profit), \mathbf{x} is the vector of main explanatory variables (inputs and other control variables), u is the one-sided error term with variance \sigma_{u}^2, and v is the two-sided error term with variance \sigma_{v}^2.
S = 1 in the case of production (profit) frontier function and S = -1 in the case of cost frontier function.
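The role of S in the composed error can be illustrated with a short simulation. This is only a sketch in base R: the sample size, coefficients and standard deviations are hypothetical values chosen for illustration, and u is drawn from a half normal distribution.

```r
# Simulate y_i = alpha + x_i' beta + v_i - S * u_i (base R only).
# Every numeric value below is a hypothetical choice, not a package default.
set.seed(1)
n     <- 500
alpha <- 1
beta  <- c(0.6, 0.3)
X     <- cbind(x1 = rnorm(n), x2 = rnorm(n))

sigma_u <- 0.4
sigma_v <- 0.2
u <- abs(rnorm(n, sd = sigma_u))  # one-sided error (half normal), u >= 0
v <- rnorm(n, sd = sigma_v)       # two-sided symmetric noise

frontier <- drop(alpha + X %*% beta)

simulate_y <- function(S) {
  eps <- v - S * u                # composed error epsilon_i = v_i - S * u_i
  frontier + eps
}

y_prod <- simulate_y(S = 1)       # production/profit: inefficiency lowers y
y_cost <- simulate_y(S = -1)      # cost: inefficiency raises y
```

Net of the noise v, the production outcome lies on or below the frontier and the cost outcome lies on or above it.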
The model is estimated using maximum likelihood (ML) for most distributions. The exceptions are the Gamma, Weibull and log-normal distributions, for which maximum simulated likelihood (MSL) is used. For MSL, several types of draws are available, namely Halton, Generalized Halton, Sobol and uniform. In the case of uniform draws, antithetics can also be computed: the first Nsim/2 draws are obtained, then the remaining Nsim/2 draws are obtained as their complements to one (1 - draw).
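The antithetic construction for uniform draws can be sketched in a few lines of base R. This only illustrates the idea described above; it is not the package's internal code, and the object names are hypothetical.

```r
# Antithetic uniform draws: half the draws are random, the other half
# are their complements to one.
set.seed(12345)              # same value as the default seed argument
Nsim <- 100                  # same value as the default Nsim argument

first_half  <- runif(Nsim / 2)
second_half <- 1 - first_half          # antithetic counterparts
draws <- c(first_half, second_half)
```

Because each draw is paired with its complement, the sample mean of the draws equals 0.5 up to floating point error, which is the variance-reduction motivation for antithetics.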
To account for heteroscedasticity in the variance parameters of the error terms, a one-part (right-hand side) formula can also be specified. To impose positivity on these parameters, the variances are modelled as \sigma^2_u = \exp{(\bm{\delta}'\mathbf{Z}_u)} or \sigma^2_v = \exp{(\bm{\phi}'\mathbf{Z}_v)}, where \mathbf{Z}_u and \mathbf{Z}_v are the heteroscedasticity variables (inefficiency drivers in the case of \mathbf{Z}_u) and \bm{\delta} and \bm{\phi} the coefficients. In the case of heterogeneity in the truncated mean \mu, it is modelled as \mu=\bm{\omega}'\mathbf{Z}_{\mu}. The scaling property can be applied for the truncated normal distribution: u \sim h(\mathbf{Z}_u, \delta)u where u follows a truncated normal distribution N^+(\tau, \exp{(cu)}).
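The exponential link for the variance and the linear form of the pre-truncation mean can be sketched as follows. The design matrices and coefficient values are hypothetical and serve only to show how each observation gets its own variance and mean.

```r
# Exponential link for the one-sided error variance (base R only).
# All numbers are hypothetical and chosen for illustration.
Z_u   <- cbind(intercept = 1, regu = c(0, 1, 0, 1))  # heteroscedasticity variables
delta <- c(-2, 0.5)                                  # coefficients on Z_u
sigma2_u <- exp(drop(Z_u %*% delta))                 # positive by construction

# The pre-truncation mean is linear in its covariates, so it may be negative.
Z_mu  <- Z_u
omega <- c(0.1, -0.4)                                # coefficients on Z_mu
mu    <- drop(Z_mu %*% omega)
```

The same construction applies to \sigma^2_v with \mathbf{Z}_v and \bm{\phi}.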
In the case of the truncated normal distribution, the convolution of u_i and v_i is:
f(\epsilon_i)=\frac{1}{\sqrt{\sigma_u^2 + \sigma_v^2}}\phi\left(\frac{S\epsilon_i + \mu}{\sqrt{\sigma_u^2 + \sigma_v^2}}\right)\Phi\left(\frac{\mu_{i*}}{\sigma_*}\right)\Big/\Phi\left(\frac{\mu}{\sigma_u}\right)
where
\mu_{i*}=\frac{\mu\sigma_v^2 - S\epsilon_i\sigma_u^2}{\sigma_u^2 + \sigma_v^2}
and
\sigma_*^2 = \frac{\sigma_u^2\sigma_v^2}{\sigma_u^2 + \sigma_v^2}
In the case of the half normal distribution the convolution is obtained by setting \mu=0.
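As a numerical check, the density above can be coded directly in base R. The function name dcomposed and the parameter values are hypothetical. The sketch verifies that the density integrates to one and that setting \mu=0 reproduces the familiar normal - half normal form (2/\sigma)\phi(\epsilon/\sigma)\Phi(-S\epsilon\lambda/\sigma), with \sigma^2=\sigma_u^2+\sigma_v^2 and \lambda=\sigma_u/\sigma_v.

```r
# Density of epsilon = v - S * u for the normal - truncated normal model.
# dcomposed is a hypothetical helper name, not a function of the package.
dcomposed <- function(eps, mu, sigma_u, sigma_v, S = 1) {
  s2         <- sigma_u^2 + sigma_v^2
  mu_star    <- (mu * sigma_v^2 - S * eps * sigma_u^2) / s2
  sigma_star <- sqrt(sigma_u^2 * sigma_v^2 / s2)
  dnorm((S * eps + mu) / sqrt(s2)) / sqrt(s2) *
    pnorm(mu_star / sigma_star) / pnorm(mu / sigma_u)
}

sigma_u <- 1      # hypothetical values
sigma_v <- 0.5

# The density should integrate to one over the real line.
area <- integrate(dcomposed, -Inf, Inf, mu = 0.3,
                  sigma_u = sigma_u, sigma_v = sigma_v, S = 1)$value

# With mu = 0 the formula reduces to the normal - half normal density.
eps    <- seq(-2, 2, by = 0.5)
sigma  <- sqrt(sigma_u^2 + sigma_v^2)
lambda <- sigma_u / sigma_v
half_normal <- 2 / sigma * dnorm(eps / sigma) * pnorm(-eps * lambda / sigma)
from_tnorm  <- dcomposed(eps, mu = 0, sigma_u = sigma_u, sigma_v = sigma_v, S = 1)
```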
sfacross allows for the maximization of weighted log-likelihood. When option weights is specified and wscale = TRUE, the weights are scaled as:
new_{weights} = sample_{size} \times
\frac{old_{weights}}{\sum(old_{weights})}
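The rescaling amounts to normalising the weights so that they sum to the sample size. A minimal sketch with hypothetical weights:

```r
# Rescale weights so that they sum to the number of observations.
old_weights <- c(2, 1, 1, 4)          # hypothetical raw weights
sample_size <- length(old_weights)
new_weights <- sample_size * old_weights / sum(old_weights)
```

The relative weights are unchanged; only their overall scale is fixed, which keeps the weighted log-likelihood on the same scale as the unweighted one.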
For complex problems, non-gradient methods (e.g. nm or sann) can be used to warm start the optimization and zoom in on the neighborhood of the solution. A gradient-based method is then recommended in the second step. In the case of sann, we recommend significantly increasing the iteration limit (e.g. itermax = 20000). The Conjugate Gradient (cg) can also be used in the first stage.
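The two-step strategy could look like the sketch below. It assumes that sfacross and the ricephil data are provided by the sfaR package, and it passes the mlParam element of the first fit as start for the second. The choice of model, distribution and methods is only illustrative, and the block is skipped when the package is not installed.

```r
# Two-step estimation: derivative-free warm start, then a gradient-based method.
# Assumption: sfacross() and the ricephil data come from the sfaR package.
if (requireNamespace("sfaR", quietly = TRUE)) {
  library(sfaR)

  # Step 1: Nelder-Mead ('nm') to get close to the solution.
  step1 <- sfacross(formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK) +
    log(OTHER), udist = 'tnormal', data = ricephil, S = 1, method = 'nm')

  # Step 2: BFGS started from the parameters found in step 1.
  step2 <- sfacross(formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK) +
    log(OTHER), udist = 'tnormal', data = ricephil, S = 1, method = 'bfgs',
    start = step1$mlParam)

  summary(step2)
}
```

Comparing startLoglik and mlLoglik of the second fit shows how much the gradient-based step improved on the warm start.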
A set of extractor functions for fitted model objects is available for objects of class 'sfacross', including methods for the generic functions print, summary, coef, fitted, logLik, residuals, vcov, efficiencies, ic, marginal, skewnessTest, estfun and bread (from the sandwich package), and lmtest::coeftest() (from the lmtest package).
sfacross returns a list of class 'sfacross' containing the following elements:
call |
The matched call. |
formula |
The estimated model. |
S |
The argument S. |
typeSfa |
Character string. 'Stochastic Production/Profit Frontier, e =
v - u' when |
Nobs |
Number of observations used for optimization. |
nXvar |
Number of explanatory variables in the production or cost frontier. |
nmuZUvar |
Number of variables explaining heterogeneity in the
truncated mean, only if |
scaling |
The argument scaling. |
logDepVar |
The argument logDepVar. |
nuZUvar |
Number of variables explaining heteroscedasticity in the one-sided error term. |
nvZVvar |
Number of variables explaining heteroscedasticity in the two-sided error term. |
nParm |
Total number of parameters estimated. |
udist |
The argument udist. |
startVal |
Numeric vector. Starting value for M(S)L estimation. |
dataTable |
A data frame (tibble format) containing information on data
used for optimization along with residuals and fitted values of the OLS and
M(S)L estimations, and the individual observation log-likelihood. When
|
olsParam |
Numeric vector. OLS estimates. |
olsStder |
Numeric vector. Standard errors of OLS estimates. |
olsSigmasq |
Numeric. Estimated variance of OLS random error. |
olsLoglik |
Numeric. Log-likelihood value of OLS estimation. |
olsSkew |
Numeric. Skewness of the residuals of the OLS estimation. |
olsM3Okay |
Logical. Indicating whether the residuals of the OLS estimation have the expected skewness. |
CoelliM3Test |
Coelli's test for OLS residuals skewness. (See Coelli, 1995). |
AgostinoTest |
D'Agostino's test for OLS residuals skewness. (See D'Agostino and Pearson, 1973). |
isWeights |
Logical. If |
optType |
Optimization algorithm used. |
nIter |
Number of iterations of the ML estimation. |
optStatus |
Optimization algorithm termination message. |
startLoglik |
Log-likelihood at the starting values. |
mlLoglik |
Log-likelihood value of the M(S)L estimation. |
mlParam |
Parameters obtained from M(S)L estimation. |
gradient |
Each variable gradient of the M(S)L estimation. |
gradL_OBS |
Matrix. Each variable individual observation gradient of the M(S)L estimation. |
gradientNorm |
Gradient norm of the M(S)L estimation. |
invHessian |
Covariance matrix of the parameters obtained from the M(S)L estimation. |
hessianType |
The argument hessianType. |
mlDate |
Date and time of the estimated model. |
simDist |
The argument |
Nsim |
The argument Nsim. |
FiMat |
Matrix of random draws used for MSL, only if |
For the Halton draws, the code is adapted from the mlogit package.
Aigner, D., Lovell, C. A. K., and Schmidt, P. 1977. Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6(1), 21–37.
Battese, G. E., and Coelli, T. J. 1995. A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empirical Economics, 20(2), 325–332.
Caudill, S. B., and Ford, J. M. 1993. Biases in frontier estimation due to heteroscedasticity. Economics Letters, 41(1), 17–20.
Caudill, S. B., Ford, J. M., and Gropper, D. M. 1995. Frontier estimation and firm-specific inefficiency measures in the presence of heteroscedasticity. Journal of Business & Economic Statistics, 13(1), 105–111.
Coelli, T. 1995. Estimators and hypothesis tests for a stochastic frontier function - a Monte-Carlo analysis. Journal of Productivity Analysis, 6:247–268.
D'Agostino, R., and E.S. Pearson. 1973. Tests for departure from normality. Empirical results for the distributions of b_2 and \sqrt{b_1}. Biometrika, 60:613–622.
Greene, W. H. 2003. Simulated likelihood estimation of the normal-Gamma stochastic frontier function. Journal of Productivity Analysis, 19(2-3), 179–190.
Hadri, K. 1999. Estimation of a doubly heteroscedastic stochastic frontier cost function. Journal of Business & Economic Statistics, 17(3), 359–363.
Hajargasht, G. 2015. Stochastic frontiers with a Rayleigh distribution. Journal of Productivity Analysis, 44(2), 199–208.
Huang, C. J., and Liu, J.-T. 1994. Estimation of a non-neutral stochastic frontier production function. Journal of Productivity Analysis, 5(2), 171–180.
Kumbhakar, S. C., Ghosh, S., and McGuckin, J. T. 1991. A generalized production frontier approach for estimating determinants of inefficiency in U.S. dairy farms. Journal of Business & Economic Statistics, 9(3), 279–286.
Li, Q. 1996. Estimating a stochastic production frontier when the adjusted error is symmetric. Economics Letters, 52(3), 221–228.
Meeusen, W., and Vandenbroeck, J. 1977. Efficiency estimation from Cobb-Douglas production functions with composed error. International Economic Review, 18(2), 435–445.
Migon, H. S., and Medici, E. V. 2001. Bayesian hierarchical models for stochastic production frontier. Lacea, Montevideo, Uruguay.
Nguyen, N. B. 2010. Estimation of technical efficiency in stochastic frontier analysis. PhD dissertation, Bowling Green State University, August.
Papadopoulos, A. 2021. Stochastic frontier models using the generalized exponential distribution. Journal of Productivity Analysis, 55:15–29.
Reifschneider, D., and Stevenson, R. 1991. Systematic departures from the frontier: A framework for the analysis of firm inefficiency. International Economic Review, 32(3), 715–723.
Stevenson, R. E. 1980. Likelihood Functions for Generalized Stochastic Frontier Estimation. Journal of Econometrics, 13(1), 57–66.
Tsionas, E. G. 2007. Efficiency measurement with the Weibull stochastic frontier. Oxford Bulletin of Economics and Statistics, 69(5), 693–706.
Wang, K., and Ye, X. 2020. Development of alternative stochastic frontier models for estimating time-space prism vertices. Transportation.
Wang, H.J., and Schmidt, P. 2002. One-step and two-step estimation of the effects of exogenous variables on technical efficiency levels. Journal of Productivity Analysis, 18:129–144.
Wang, J. 2012. A normal truncated skewed-Laplace model in stochastic frontier analysis. Master thesis, Western Kentucky University, May.
print for printing sfacross object.
summary for creating and printing summary results.
coef for extracting coefficients of the estimation.
efficiencies for computing (in-)efficiency estimates.
fitted for extracting the fitted frontier values.
ic for extracting information criteria.
logLik for extracting log-likelihood value(s) of the estimation.
marginal for computing marginal effects of inefficiency drivers.
residuals for extracting residuals of the estimation.
skewnessTest for conducting residuals skewness test.
vcov for computing the variance-covariance matrix of the coefficients.
bread for bread for sandwich estimator.
estfun for gradient extraction for each observation.
library(sfaR)  # assumed to provide sfacross() and the example data sets
## Using data on fossil fuel fired steam electric power generation plants in the U.S.
# Translog (cost function) half normal with heteroscedasticity
tl_u_h <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')
summary(tl_u_h)
# Translog (cost function) truncated normal with heteroscedasticity
tl_u_t <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, data = utility, S = -1, method = 'bhhh')
summary(tl_u_t)
# Translog (cost function) truncated normal with scaling property
tl_u_ts <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, S = -1,
scaling = TRUE, method = 'mla')
summary(tl_u_ts)
## Using data on Philippine rice producers
# Cobb Douglas (production function) generalized exponential distribution
cb_p_ge <- sfacross(formula = log(PROD) ~ log(AREA) + log(LABOR) + log(NPK) +
log(OTHER), udist = 'genexponential', data = ricephil, S = 1, method = 'bfgs')
summary(cb_p_ge)
## Using data on U.S. electric utility industry
# Cost frontier Gamma distribution
tl_u_g <- sfacross(formula = log(cost/fprice) ~ log(output) + I(log(output)^2) +
I(log(lprice/fprice)) + I(log(cprice/fprice)), udist = 'gamma', uhet = ~ 1,
data = electricity, S = -1, method = 'bfgs', simType = 'halton', Nsim = 200,
hessianType = 2)
summary(tl_u_g)