SFA: Estimate stochastic frontier analysis (SFA) models.
In vh-d/SFAt: SFAt: Stochastic frontier analysis on cross-section and panel data


Estimate stochastic frontier analysis (SFA) models.

sfa.fit(y, X, CM = NULL, CV_u = NULL, CV_v = NULL, ineff = -1L,
  dist = c("tnorm", "hnorm", "exp", "tnormp"), spec = NULL,
  intercept = list(f = TRUE, cm = TRUE, cv_u = TRUE, cv_v = TRUE),
  sv = list(f = NULL, cm = NULL, cv_u = NULL, cv_v = NULL),
  opt_strategy = 1, grad = "fd", ll = NULL, optim_method = "BFGS",
  optim_control = NULL, maxLik_method = "NR", maxLik_control = NULL,
  nlopt_opts = NULL, nlopt_bounds = NULL, deb = F, debll = F)

## S3 method for class 'formula'
SFA(formula, data = NULL, cm = ~1, cv_u = ~1,
  cv_v = ~1, form = c("production", "cost"), deb = FALSE, ...)

## S3 method for class 'list'
SFA(formulas, data = NULL, ...)

SFA(object, ...)

`y`	dependent (production/cost) variable.
`X`	variables of the production/cost function.
`CM`	data for conditional mean model of the inefficiency (asymmetric error) term.
`CV_u`	data for conditional variance model of the inefficiency (asymmetric error) term.
`CV_v`	data for conditional variance model of the symmetric error term.
`ineff`	-1 (or 1) for production (or cost) function, where inefficiency decreases (or increases) the total output (or costs).
`dist`	distribution of the inefficiency term (one of "hnorm", "exp" or "tnorm"; the usage above also lists "tnormp").
`spec`	specifies which model of the endogenous inefficiency term should be used (currently only bc95 for cross-section data is implemented).
`intercept`	list of logical values: f: TRUE if an intercept should be added to the main (frontier) formula; cm: TRUE if an intercept should be added to the conditional mean equation of the asymmetric term; cv_u: TRUE if an intercept should be added to the conditional variance formula of the inefficiency term; cv_v: TRUE if an intercept should be added to the conditional variance formula of the symmetric error term.
`sv`	numeric vector of all the necessary parameters, or a list of optional starting values: f: frontier model coefficients; cm: starting values for the conditional mean model parameters; cv_u: starting values for the parameters of the conditional variance model of the inefficiency term; cv_v: starting values for the parameters of the conditional variance model of the symmetric term.
`opt_strategy`	integer from 1 to 4 selecting the optimization strategy; see Details.
`ll`	optional custom log-likelihood function; note that it will be MINIMIZED.
`optim_method`	algorithm used by `optim()` (the second step in strategy 3).
`optim_control`	list of options for `optim()` routine.
`maxLik_method`	algorithm used by `maxLik()` (the second step in strategy 4).
`maxLik_control`	list of options for `maxLik()` routine.
`nlopt_opts`	list of nloptr options.
`deb`	debug mode (TRUE/FALSE).
`debll`	debug mode of log likelihood functions (TRUE/FALSE).

sfa.fit() is the main workhorse function that estimates the SFA model. The SFA.formula() and SFA.list() methods provide a more convenient user interface.
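To make the relationship between the formula interface and the matrix arguments of sfa.fit() concrete, here is a minimal base R sketch of how a formula can be turned into a response vector and regressor matrices. It is not the package's own code: the data frame, the variable names, and the choice to drop the intercept column (on the assumption that sfa.fit() adds intercepts itself according to `intercept`) are all assumptions made for this illustration.

```r
# Hypothetical data; the variable names are illustrative only.
set.seed(1)
d <- data.frame(log_q = rnorm(50), log_k = rnorm(50),
                log_l = rnorm(50), z = runif(50))

# Frontier part: response vector and regressor matrix.
# The intercept column is dropped here (assumption: the fitting
# function adds it back when intercept$f is TRUE).
mf <- model.frame(log_q ~ log_k + log_l, data = d)
y  <- model.response(mf)
X  <- model.matrix(attr(mf, "terms"), data = mf)[, -1, drop = FALSE]

# A one-sided formula such as cv_u = ~ z becomes a matrix the same way.
CV_u <- model.matrix(~ z, data = d)[, -1, drop = FALSE]

dim(X)
dim(CV_u)
```

Objects built this way have the shapes that the `y`, `X` and `CV_u` arguments describe, so they could in principle be handed to sfa.fit() directly.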

For cross-section data model, the following distributions are currently supported:

normal/half-normal model
normal/truncated-normal model
normal/exponential model
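As an illustration of the normal/half-normal case, the following base R sketch writes the log-likelihood by hand and fits it with optim(). It is not the SFAt implementation: the function name negll_hnorm, the log-scale parameterization of the two standard deviations, and the simulated data are assumptions made for this example. The `ineff` argument plays the same role as in sfa.fit(): -1 for a production frontier, 1 for a cost frontier.

```r
# Negative log-likelihood of the normal/half-normal model (to be minimized).
# theta = (beta, log(sigma_u), log(sigma_v)); this layout is an assumption.
negll_hnorm <- function(theta, y, X, ineff = -1L) {
  k       <- ncol(X)
  eps     <- as.vector(y - X %*% theta[1:k])  # composed residual v + ineff * u
  sigma_u <- exp(theta[k + 1])
  sigma_v <- exp(theta[k + 2])
  sigma   <- sqrt(sigma_u^2 + sigma_v^2)
  lambda  <- sigma_u / sigma_v
  ll <- log(2) - log(sigma) + dnorm(eps / sigma, log = TRUE) +
    pnorm(ineff * eps * lambda / sigma, log.p = TRUE)
  -sum(ll)
}

# Simulated production data: y = X beta - u + v.
set.seed(42)
n <- 500
X <- cbind(1, rnorm(n))
u <- abs(rnorm(n, sd = 0.5))   # half-normal inefficiency
v <- rnorm(n, sd = 0.3)        # symmetric noise
y <- as.vector(X %*% c(1, 0.5)) - u + v

# OLS coefficients as starting values for the frontier part.
start <- c(lm.fit(X, y)$coefficients, log(0.5), log(0.5))
fit <- optim(start, negll_hnorm, y = y, X = X, ineff = -1L, method = "BFGS")

fit$par[1:2]       # frontier coefficients
exp(fit$par[3:4])  # sigma_u and sigma_v
```

Because the function returns the negative log-likelihood, it has the "to be minimized" form that the `ll` argument describes.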

Returns an object of class SFA, which is a list consisting of:

coeff

coefficients for stochastic frontier model

coeff_cm

coefficients for conditional mean of the inefficiency term model

coeff_cv_u

coefficients for conditional variance of the inefficiency term model (heteroskedasticity in the inefficiency)

coeff_cv_v

coefficients for conditional variance of the symmetric error term model (heteroskedasticity in the frontier model error)

residuals

total residuals (= both u + v terms)

parameters

vector of all parameters returned from maximization of the log-likelihood

N

total number of observations

ineff

-1 for a production function, 1 for a cost function

ineff_name

either "production" or "cost" string

data

list of all data used for estimation (including unit vectors as intercepts if appropriate)

call

a list of:

intercept
dist
spec
structure
sv

loglik

Total log-likelihood.

hessian

A hessian matrix as returned by optim()

lmfit

lm object result of fitted linear model.

optim

object returned by optim()

nlopt

object returned by nloptr()
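A Hessian returned by optim() is commonly used to obtain standard errors. The toy example below shows the general recipe on a simple normal model rather than on an SFA object. It assumes that the Hessian belongs to a minimized negative log-likelihood, in which case its inverse approximates the covariance matrix of the estimates. Whether the `hessian` element of an SFA object follows exactly this convention is not stated here, so treat this as a sketch.

```r
# Toy model: estimate mean and log(sd) of a normal sample by
# minimizing the negative log-likelihood.
set.seed(7)
x <- rnorm(200, mean = 2, sd = 1.5)

negll <- function(theta) {
  -sum(dnorm(x, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

fit <- optim(c(mean(x), log(sd(x))), negll, method = "BFGS", hessian = TRUE)

# Inverse Hessian of the negative log-likelihood approximates the
# covariance matrix of the estimates.
vc <- solve(fit$hessian)
se <- sqrt(diag(vc))

cbind(estimate = fit$par, std_error = se)
```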

Within all these models, heteroskedasticity in both the symmetric and the asymmetric error terms can be modeled explicitly by providing matrices of explanatory variables (CV_v for the symmetric error and CV_u for the inefficiency term). The conditional mean of the inefficiency term can be modeled only within the normal/truncated-normal model. Models are estimated by maximum likelihood, following the established literature on the topic.
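The sketch below shows one way such heteroskedasticity can enter a normal/half-normal likelihood. It assumes an exponential link, so that each observation's variance is exp() of a linear index in the CV_u or CV_v columns. That link is a common convention and an assumption here, not a documented detail of the package. The function name negll_hnorm_het and the simulated data are hypothetical.

```r
# Heteroskedastic normal/half-normal negative log-likelihood.
# Assumed parameterization: sigma_u_i^2 = exp(CV_u %*% gamma_u),
#                           sigma_v_i^2 = exp(CV_v %*% gamma_v).
negll_hnorm_het <- function(theta, y, X, CV_u, CV_v, ineff = -1L) {
  k <- ncol(X); ku <- ncol(CV_u); kv <- ncol(CV_v)
  beta    <- theta[seq_len(k)]
  gamma_u <- theta[k + seq_len(ku)]
  gamma_v <- theta[k + ku + seq_len(kv)]
  eps     <- as.vector(y - X %*% beta)
  sigma_u <- sqrt(exp(as.vector(CV_u %*% gamma_u)))
  sigma_v <- sqrt(exp(as.vector(CV_v %*% gamma_v)))
  sigma   <- sqrt(sigma_u^2 + sigma_v^2)
  lambda  <- sigma_u / sigma_v
  -sum(log(2) - log(sigma) + dnorm(eps / sigma, log = TRUE) +
         pnorm(ineff * eps * lambda / sigma, log.p = TRUE))
}

# Simulated production data where the inefficiency variance depends on z.
set.seed(3)
n    <- 400
z    <- runif(n)
X    <- cbind(1, rnorm(n))
CV_u <- cbind(1, z)          # unit vector as intercept plus one covariate
CV_v <- matrix(1, n, 1)      # homoskedastic symmetric error
u    <- abs(rnorm(n)) * sqrt(exp(-2 + 1 * z))
y    <- as.vector(X %*% c(1, 0.5)) - u + rnorm(n, sd = 0.3)

# OLS coefficients and an even split of the residual variance as a start.
ols_b <- lm.fit(X, y)
s2    <- mean(ols_b$residuals^2)
start <- c(ols_b$coefficients, log(s2 / 2), 0, log(s2 / 2))
fit <- optim(start, negll_hnorm_het, y = y, X = X,
             CV_u = CV_u, CV_v = CV_v, method = "BFGS")
fit$par
```

With intercept-only CV_u and CV_v matrices this reduces to the homoskedastic model.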

By default, starting values are the coefficients of a linear (OLS) model estimated within the sfa.fit() function. Alternatively, they can be supplied by the user as a list of vectors.
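A minimal sketch of building OLS-based starting values by hand, arranged in a list with the element names that `sv` uses (f, cv_u, cv_v). The even split of the residual variance between the two error terms and the log-variance scale of cv_u and cv_v are assumptions made for illustration. The scale the package expects for these elements is not confirmed here.

```r
# Simulated production data (hypothetical).
set.seed(11)
n <- 200
x <- rnorm(n)
y <- 1 + 0.5 * x - abs(rnorm(n, sd = 0.4)) + rnorm(n, sd = 0.2)

# OLS gives consistent slopes; for a production frontier the OLS
# intercept is shifted downward by the mean inefficiency.
ols <- lm(y ~ x)
s2  <- mean(residuals(ols)^2)

# Assumption: split the residual variance evenly and use a log scale.
sv <- list(
  f    = coef(ols),
  cv_u = log(s2 / 2),
  cv_v = log(s2 / 2)
)
str(sv)
```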

Several optimization strategies are available for maximizing the log-likelihood function:

1 optim() function,
2 maxLik() from the maxLik package (if available),
3 two-step strategy using any of the wide selection of algorithms from the nloptr package (if available) for the first-step optimization and optim() in the second step,
4 two-step strategy using the nloptr package (if available) for the first-step optimization and maxLik() in the second step.
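The idea behind the two-step strategies can be sketched with base R alone: a robust first pass moves the parameters into a good region, and a gradient-based second pass refines them. In this sketch optim() with method "Nelder-Mead" is a stand-in for the nloptr first step, which is an assumption made to keep the example dependency-free. The likelihood function and data are hypothetical as well.

```r
# Normal/half-normal negative log-likelihood for a production frontier.
# theta = (intercept, slope, log(sigma_u), log(sigma_v)).
negll <- function(theta, y, X) {
  eps <- as.vector(y - X %*% theta[1:2])
  su  <- exp(theta[3])
  sv  <- exp(theta[4])
  s   <- sqrt(su^2 + sv^2)
  -sum(log(2) - log(s) + dnorm(eps / s, log = TRUE) +
         pnorm(-eps * (su / sv) / s, log.p = TRUE))
}

set.seed(5)
n <- 300
X <- cbind(1, rnorm(n))
y <- as.vector(X %*% c(1, 0.5)) - abs(rnorm(n, sd = 0.5)) + rnorm(n, sd = 0.3)

# Step 1: derivative-free search from a deliberately crude start
# (stand-in for an nloptr algorithm).
step1 <- optim(c(0, 0, 0, 0), negll, y = y, X = X,
               method = "Nelder-Mead", control = list(maxit = 2000))

# Step 2: gradient-based refinement started at the step-1 solution.
step2 <- optim(step1$par, negll, y = y, X = X, method = "BFGS")

c(step1 = step1$value, step2 = step2$value)
```

Comparing the two objective values shows how much the second step gained over the first.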

Note that the choice of optimization method may have a significant impact on the results, and it is highly recommended to experiment with different optimization algorithms. See the relevant packages and methods for more details.