Stochastic Frontier Analysis
Description
Maximum Likelihood Estimation of Stochastic Frontier Production and Cost Functions. Two specifications are available: the error components specification with time-varying efficiencies (Battese and Coelli 1992) and a model specification in which the firm effects are directly influenced by a number of variables (Battese and Coelli 1995). This R package uses the Fortran source code of Frontier 4.1 (Coelli 1996).
Usage
sfa( formula, data = sys.frame( sys.parent() ),
   ineffDecrease = TRUE, truncNorm = FALSE,
   timeEffect = FALSE, startVal = NULL,
   tol = 0.00001, maxit = 1000, muBound = 2, bignum = 1.0E+16,
   searchStep = 0.00001, searchTol = 0.001, searchScale = NA,
   gridSize = 0.1, gridDouble = TRUE,
   restartMax = 10, restartFactor = 0.999, printIter = 0 )
frontier( yName, xNames = NULL, zNames = NULL, data,
   zIntercept = FALSE, ... )
## S3 method for class 'frontier'
print( x, digits = NULL, ... )

Arguments
formula 
a symbolic description of the model to be estimated; it can be either a (usual) one-part or a two-part formula (see section ‘Details’). 
data 
a (panel) data frame that contains the data;
if 
ineffDecrease 
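logical. If TRUE, inefficiency decreases the endogenous variable (as in a production function); if FALSE, inefficiency increases the endogenous variable (as in a cost function). 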
truncNorm 
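logical. If TRUE, the inefficiency terms are assumed to follow a truncated normal distribution with mean mu (see section ‘Details’). 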
timeEffect 
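logical. If FALSE, the efficiency of each firm is time-invariant; if TRUE, time is allowed to have an effect on efficiency (see section ‘Examples’). 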
startVal 
numeric vector. Optional starting values for the ML estimation. 
tol 
numeric. Convergence tolerance (proportional). 
maxit 
numeric. Maximum number of iterations permitted. 
muBound 
numeric. Bounds on the parameter mu (see section ‘Details’). 
bignum 
numeric. Used to set bounds on densities and distributions. 
searchStep 
numeric. Size of the first step in the Coggin uni-dimensional search procedure that is performed at each iteration to determine the optimal step length for the next iteration (see Himmelblau 1972). 
searchTol 
numeric. Tolerance used in the Coggin uni-dimensional search procedure that is performed at each iteration to determine the optimal step length for the next iteration (see Himmelblau 1972). 
searchScale 
logical or 
gridSize 
numeric. The size of the increment in the first phase grid search on gamma. 
gridDouble 
logical. If 
restartMax 
integer: maximum number of restarts of the search procedure when it cannot find a parameter vector that results in a log-likelihood value larger than the log-likelihood value of the initial parameters. 
restartFactor 
numeric scalar: if the search procedure
cannot find a parameter vector that results in a log-likelihood value
larger than the log-likelihood value of the initial parameters,
the initial values
(provided by argument 
printIter 
numeric. Print info every 
yName 
string: name of the endogenous variable. 
xNames 
a vector of strings containing the names of the X variables (exogenous variables of the production or cost function). 
zNames 
a vector of strings containing the names of the Z variables (variables explaining the efficiency level). 
zIntercept 
logical. If 
x 
an object of class frontier. 
digits 
a non-null value for ‘digits’ specifies
the minimum number of significant digits to be printed in values.
The default, 
... 
additional arguments of 
Details
Function frontier is a wrapper function that calls sfa for the estimation. The two functions differ only in the user interface; function frontier has the “old” user interface and is kept to maintain compatibility with older versions of the frontier package.
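The relation between the two interfaces can be illustrated with a minimal sketch. It assumes that the frontier package is installed and uses the front41Data data set from the ‘Examples’ section; the column names logOutput, logCapital, and logLabour are chosen here for illustration only and are not part of the package. Because frontier takes variable names as strings, the logged variables are first stored as columns.

```r
if ( requireNamespace( "frontier", quietly = TRUE ) ) {
   library( "frontier" )
   data( "front41Data", package = "frontier" )

   # the "old" interface refers to variables by name, so the logged
   # variables must exist as columns (names chosen for this sketch)
   front41Data$logOutput  <- log( front41Data$output )
   front41Data$logCapital <- log( front41Data$capital )
   front41Data$logLabour  <- log( front41Data$labour )

   # same model, specified through the two user interfaces
   viaSfa <- sfa( logOutput ~ logCapital + logLabour, data = front41Data )
   viaFrontier <- frontier( yName = "logOutput",
      xNames = c( "logCapital", "logLabour" ), data = front41Data )

   # both calls are expected to give the same estimates
   print( all.equal( coef( viaSfa ), coef( viaFrontier ) ) )
}
```

The guard with requireNamespace only makes the sketch safe to source on systems without the package.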
One can use functions sfa and frontier to calculate the log likelihood value for a given model, a given data set, and given parameters by using the argument startVal to specify the parameters and using the other arguments to specify the model and the data. The log likelihood value can then be retrieved by the logLik method with argument which set to "start". Setting argument maxit to 0 avoids the (possibly time-consuming) ML estimation and allows the log likelihood value to be retrieved with the logLik method without further arguments.
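A minimal sketch of this use, assuming that the frontier package is installed: the parameter values in myParam are hypothetical and only serve as an illustration, and their order (intercept, slope coefficients, sigmaSq, gamma) is an assumption that should be checked against the coefficient names of a fitted model.

```r
if ( requireNamespace( "frontier", quietly = TRUE ) ) {
   library( "frontier" )
   data( "front41Data", package = "frontier" )

   # hypothetical parameter values; assumed order:
   # intercept, log( capital ), log( labour ), sigmaSq, gamma
   myParam <- c( 0.5, 0.3, 0.5, 0.2, 0.8 )

   # maxit = 0 skips the iterative ML estimation
   llOnly <- sfa( log( output ) ~ log( capital ) + log( labour ),
      data = front41Data, startVal = myParam, maxit = 0 )

   # log likelihood value at the supplied parameters
   llStart <- logLik( llOnly, which = "start" )
   print( llStart )
}
```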
The frontier function uses the Fortran source code of Tim Coelli's software FRONTIER 4.1 (http://www.uq.edu.au/economics/cepa/frontier.htm) and hence provides the same features as FRONTIER 4.1. Comprehensive documentation of FRONTIER 4.1 is available in the file Front41.pdf that is included in the archive FRONT41xp1.zip, which is available at http://www.uq.edu.au/economics/cepa/frontier.htm. Reading this documentation is recommended, because the frontier function is based on the FRONTIER 4.1 software.
If argument formula of sfa is a (usual) one-part formula (or argument zNames of frontier is NULL), an ‘Error Components Frontier’ (ECF, see Battese and Coelli 1992) is estimated. If argument formula is a two-part formula (or zNames is not NULL), an ‘Efficiency Effects Frontier’ (EEF, see Battese and Coelli 1995) is estimated. In this case, the first part of the formula (i.e. the part before the “|” symbol) is used to explain the endogenous variable directly (X variables), while the second part of the formula (i.e. the part after the “|” symbol) is used to explain the efficiency levels (Z variables). Generally, there should be no reason for estimating an EEF without Z variables, but this can be done by setting the second part of argument formula to 1 (with Z intercept) or - 1 (without Z intercept) (or by setting argument zNames to NA).
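The split of a two-part formula into X and Z variables can be illustrated in base R. This is only a sketch of the idea and not necessarily how the package parses formulas internally; the object names rhs, isTwoPart, xVars, and zVars are chosen here for illustration.

```r
# two-part formula: X variables before "|", Z variables after it
f <- log( PROD ) ~ log( AREA ) + log( LABOR ) | EDYRS + BANRAT

# right-hand side of the formula
rhs <- f[[ 3 ]]

# a two-part formula has "|" as the top-level operator of its right-hand side
isTwoPart <- is.call( rhs ) && identical( rhs[[ 1 ]], as.name( "|" ) )

if ( isTwoPart ) {
   xVars <- all.vars( rhs[[ 2 ]] )   # variables of the first part
   zVars <- all.vars( rhs[[ 3 ]] )   # variables of the second part
} else {
   xVars <- all.vars( rhs )
   zVars <- character( 0 )
}

print( xVars )
print( zVars )
```

For a one-part formula, the same code leaves zVars empty, which corresponds to the ECF case described above.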
In case of an Error Components Frontier (ECF) with the inefficiency terms u following a truncated normal distribution with mean mu, argument muBound can be used to restrict mu to be in the interval +/- muBound * sigma_u, where sigma_u is the standard deviation of u. If muBound is infinity, zero, or negative, no bounds on mu are imposed.
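The rule for the bounds on mu can be written down as a small helper. The function muInterval is hypothetical and not part of the package; it only restates the rule above in code.

```r
# hypothetical helper: interval to which mu is restricted
muInterval <- function( muBound, sigmaU ) {
   # infinite, zero, or negative muBound: no bounds on mu
   if ( is.infinite( muBound ) || muBound <= 0 ) {
      return( c( lower = -Inf, upper = Inf ) )
   }
   # otherwise mu is restricted to +/- muBound * sigma_u
   c( lower = -muBound * sigmaU, upper = muBound * sigmaU )
}

muInterval( muBound = 2, sigmaU = 0.5 )    # lower = -1, upper = 1
muInterval( muBound = 0, sigmaU = 0.5 )    # lower = -Inf, upper = Inf
```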
Value
sfa and frontier return a list of class frontier containing the following elements:
modelType 
integer. A ‘1’ denotes an ‘Error Components Frontier’ (ECF); a ‘2’ denotes an ‘Efficiency Effects Frontier’ (EEF). 
ineffDecrease 
logical. Argument ineffDecrease (see above). 
nn 
number of cross-sections. 
nt 
number of time periods. 
nob 
number of observations in total. 
nb 
number of regressor variables (Xs). 
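truncNorm 
logical. Argument truncNorm (see above). 
zIntercept 
logical. Argument zIntercept (see above). 
timeEffect 
logical. Argument timeEffect (see above). 
printIter 
numeric. Argument printIter (see above). 
searchScale 
numeric. Argument searchScale (see above). 
tol 
numeric. Argument tol (see above). 
searchTol 
numeric. Argument searchTol (see above). 
bignum 
numeric. Argument bignum (see above). 
searchStep 
numeric. Argument searchStep (see above). 
gridDouble 
logical. Argument gridDouble (see above). 
gridSize 
numeric. Argument gridSize (see above). 
maxit 
numeric. Argument maxit (see above). 
muBound 
numeric. Argument muBound (see above). 
restartMax 
numeric. Argument restartMax (see above). 
restartFactor 
numeric. Argument restartFactor (see above). 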
nRestart 
numeric. Number of restarts of the search procedure when it cannot find a parameter vector that results in a log-likelihood value larger than the log-likelihood value of the initial parameters. 
startVal 
numeric vector. Argument startVal (see above). 
call 
the matched call. 
dataTable 
matrix. Data matrix sent to Frontier 4.1. 
olsParam 
numeric vector. OLS estimates. 
olsStdEr 
numeric vector. Standard errors of OLS estimates. 
olsLogl 
numeric. Log likelihood value of OLS estimation. 
olsResid 
numeric vector. Residuals of the OLS estimation. 
olsSkewness 
numeric. Skewness of the residuals of the OLS estimation. 
olsSkewnessOkay 
logical. Indicating if the residuals of the OLS estimation have the expected skewness. 
gridParam 
numeric vector. Parameters obtained from the grid search (if no starting values were specified). 
gridLogl 
numeric. Log likelihood value of the parameters obtained from the grid search (only if no starting values were specified). 
startLogl 
numeric. Log likelihood value of the starting values for the parameters (only if starting values were specified). 
mleParam 
numeric vector. Parameters obtained from ML estimation. 
mleCov 
matrix. Covariance matrix of the parameters obtained from the ML estimation. 
mleLogl 
numeric. Log likelihood value of the ML estimation. 
nIter 
numeric. Number of iterations of the ML estimation. 
code 
integer indicating the reason for termination.

nFuncEval 
Number of evaluations of the log likelihood function during the grid search and the iterative ML estimation. 
fitted 
matrix. Fitted “frontier” values of the dependent variable: each row corresponds to a cross-section; each column corresponds to a time period. 
resid 
matrix. Residuals: each row corresponds to a cross-section; each column corresponds to a time period. 
validObs 
vector of logical values indicating which observations of the provided data were used for the estimation, i.e. which observations do not have values that are not available. 
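The layout of the matrix-valued elements (one row per cross-section, one column per time period) and the meaning of validObs can be illustrated in base R. The data frame panel and all values in it are hypothetical, the quantity y - x is only a placeholder for a per-observation result such as a residual, and treating NA as the only kind of unavailable value is an assumption of this sketch.

```r
# hypothetical long-format panel: 3 firms, 2 periods, two incomplete rows
panel <- data.frame(
   firm = c( 1, 1, 2, 2, 3, 3 ),
   year = c( 1, 2, 1, 2, 1, 2 ),
   y = c( 2.0, 2.1, 1.5, NA, 3.0, 3.2 ),
   x = c( 1.0, 1.1, 0.8, 0.9, NA, 1.6 ) )

# observations without unavailable values
validObs <- complete.cases( panel )

# placeholder per-observation quantity, arranged as
# cross-sections (rows) x time periods (columns);
# cells without a valid observation are NA
used <- panel[ validObs, ]
valueMat <- tapply( used$y - used$x, list( used$firm, used$year ), sum )

print( validObs )
print( valueMat )
```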
Author(s)
Tim Coelli and Arne Henningsen
References
Battese, G.E. and T. Coelli (1992), Frontier production functions, technical efficiency and panel data: with application to paddy farmers in India. Journal of Productivity Analysis, 3, 153-169.
Battese, G.E. and T. Coelli (1995), A model for technical inefficiency effects in a stochastic frontier production function for panel data. Empirical Economics, 20, 325-332.
Coelli, T. (1996) A Guide to FRONTIER Version 4.1: A Computer Program for Stochastic Frontier Production and Cost Function Estimation, CEPA Working Paper 96/08, http://www.uq.edu.au/economics/cepa/frontier.htm, University of New England.
Himmelblau, D.M. (1972), Applied Nonlinear Programming, McGraw-Hill, New York.
See Also
frontierQuad for quadratic/translog frontiers, summary.frontier for creating and printing summary results, efficiencies.frontier for calculating efficiency estimates, lrtest.frontier for comparing models by LR tests, fitted.frontier for obtaining the fitted “frontier” values, and residuals.frontier for obtaining the residuals.
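As a minimal sketch of how some of these methods are typically called on a fitted model, assuming that the frontier package is installed (the object names eff, fit, and res are chosen here for illustration):

```r
if ( requireNamespace( "frontier", quietly = TRUE ) ) {
   library( "frontier" )
   data( "front41Data", package = "frontier" )

   # Cobb-Douglas production frontier (as in section 'Examples')
   cobbDouglas <- sfa( log( output ) ~ log( capital ) + log( labour ),
      data = front41Data )

   eff <- efficiencies( cobbDouglas )   # efficiency estimates
   fit <- fitted( cobbDouglas )         # fitted "frontier" values
   res <- residuals( cobbDouglas )      # residuals

   print( summary( as.vector( eff ) ) )
}
```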
Examples
# example included in FRONTIER 4.1 (cross-section data)
data( front41Data )
# Cobb-Douglas production frontier
cobbDouglas <- sfa( log( output ) ~ log( capital ) + log( labour ),
   data = front41Data )
summary( cobbDouglas )
# load data about rice producers in the Philippines (panel data)
data( riceProdPhil )
# Error Components Frontier (Battese & Coelli 1992)
# with observation-specific efficiencies (ignoring the panel structure)
rice <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ),
   data = riceProdPhil )
summary( rice )
# Error Components Frontier (Battese & Coelli 1992)
# with "true" fixed individual effects and observation-specific efficiencies
riceTrue <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) +
   factor( FMERCODE ), data = riceProdPhil )
summary( riceTrue )
# add data set with information about its panel structure
library( "plm" )
ricePanel <- plm.data( riceProdPhil, c( "FMERCODE", "YEARDUM" ) )
# Error Components Frontier (Battese & Coelli 1992)
# with time-invariant efficiencies
riceTimeInv <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ),
   data = ricePanel )
summary( riceTimeInv )
# Error Components Frontier (Battese & Coelli 1992)
# with time-variant efficiencies
riceTimeVar <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ),
   data = ricePanel, timeEffect = TRUE )
summary( riceTimeVar )
# Technical Efficiency Effects Frontier (Battese & Coelli 1995)
# (efficiency effects model with intercept)
riceZ <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) |
   EDYRS + BANRAT, data = riceProdPhil )
summary( riceZ )
# Technical Efficiency Effects Frontier (Battese & Coelli 1995)
# (efficiency effects model without intercept)
riceZ2 <- sfa( log( PROD ) ~ log( AREA ) + log( LABOR ) + log( NPK ) |
   EDYRS + BANRAT - 1, data = riceProdPhil )
summary( riceZ2 )
# Cost Frontier (with land as quasi-fixed input)
riceProdPhil$cost <- riceProdPhil$LABOR * riceProdPhil$LABORP +
   riceProdPhil$NPK * riceProdPhil$NPKP
riceCost <- sfa( log( cost ) ~ log( PROD ) + log( AREA ) + log( LABORP )
   + log( NPKP ), data = riceProdPhil, ineffDecrease = FALSE )
summary( riceCost )
