fit.sf: Non-parametric stochastic frontier
In tkmckenzie/snfa: Smooth Non-Parametric Frontier Analysis

Fits a stochastic frontier to data using kernel smoothing, optionally imposing monotonicity and/or concavity constraints.

fit.sf(X, y, X.constrained = NA, H.inv = NA, H.mult = 1, method = "u", scale.constraints = TRUE)

`X`	Matrix of inputs
`y`	Vector of outputs
`X.constrained`	Matrix of inputs where constraints apply
`H.inv`	Inverse of the smoothing matrix (must be positive definite); defaults to rule of thumb
`H.mult`	Scaling factor for rule of thumb smoothing matrix
`method`	Constraints to apply; "u" for unconstrained, "m" for monotonically increasing, and "mc" for monotonically increasing and concave
`scale.constraints`	Boolean; whether to scale constraints by their average value, which can help with convergence

This method fits non-parametric stochastic frontier models. The data-generating process is assumed to be of the form

\ln y_i = \ln f(x_i) + v_i - u_i,

where y_i is the ith observation of output, f is a continuous function, x_i is the ith observation of input, v_i is a normally-distributed error term (v_i\sim N(0, σ_v^2)), and u_i is a normally-distributed error term truncated below at zero (u_i\sim N^+(0, σ_u^2)). Aigner et al. (1977) developed methods to decompose \varepsilon_i = v_i - u_i into its basic components.
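As a rough illustration of this data-generating process, the following base R sketch simulates one input and one output. The frontier `f`, the sample size, and the values of `sigma.v` and `sigma.u` are assumptions chosen for the example; none of them come from the package.

```r
# Simulate ln y = ln f(x) + v - u under assumed parameter values.
set.seed(1)
n <- 200
sigma.v <- 0.1                      # assumed sd of the two-sided noise
sigma.u <- 0.2                      # assumed scale of the one-sided term

x <- runif(n, 1, 10)                # a single input
f <- function(x) sqrt(x)            # hypothetical frontier function
v <- rnorm(n, 0, sigma.v)           # v_i ~ N(0, sigma_v^2)
u <- abs(rnorm(n, 0, sigma.u))      # u_i ~ N^+(0, sigma_u^2), i.e. half-normal

log.y <- log(f(x)) + v - u
y <- exp(log.y)
```

Taking the absolute value of a zero-mean normal draw is one simple way to obtain a normal variate truncated below at zero.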

This procedure first fits the mean of the data using fit.mean, producing estimates of output \hat{y}. Log-proportional errors are calculated as

\varepsilon_i = \ln(y_i / \hat{y}_i).

Following Aigner et al. (1977), parameters of one- and two-sided error distributions are estimated via maximum likelihood. First,

\hat{σ}^2 = \frac1N ∑_{i=1}^N \varepsilon_i^2.
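The residual and variance calculations above amount to two lines of R. In this minimal sketch the vectors `y` and `y.hat` are made-up numbers standing in for observed output and the fitted mean; they are not produced by the package.

```r
# Hypothetical observed outputs and fitted means (illustrative values only).
y     <- c(10.2, 11.5, 9.8, 12.1, 10.9)
y.hat <- c(10.0, 11.0, 10.5, 12.0, 11.2)

# Log-proportional errors: eps_i = ln(y_i / y.hat_i)
eps <- log(y / y.hat)

# Variance estimate: mean of squared errors (divides by N, not N - 1)
sigma.sq.hat <- mean(eps^2)
```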

Then, \hat{λ} is estimated by solving

\frac1{\hat{σ}^2} ∑_{i=1}^N \varepsilon_i\hat{y}_i + \frac{\hat{λ}}{\hat{σ}} ∑_{i=1}^N \frac{f_i^*}{1 - F_i^*}y_i = 0,

where f_i^* and F_i^* are standard normal density and distribution function, respectively, evaluated at \varepsilon_i\hat{λ}\hat{σ}^{-1}. Parameters of the one- and two-sided distributions are found by solving the identities

σ^2 = σ_u^2 + σ_v^2

λ = \frac{σ_u}{σ_v}.
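The sketch below shows one way these steps could be carried out in base R. It does not reproduce the package's internals: instead of solving the first-order condition, it maximizes the normal/half-normal log-likelihood of Aigner et al. (1977) over λ with `optimize`, holding σ fixed at its estimate. The function names `sf.loglik` and `sigma.components`, the search interval, and the simulated errors are all assumptions made for illustration.

```r
# Log-likelihood of eps = v - u with v ~ N(0, sigma_v^2), u ~ N^+(0, sigma_u^2),
# written in terms of sigma and lambda.
sf.loglik <- function(lambda, eps, sigma) {
  n <- length(eps)
  n * log(sqrt(2 / pi)) - n * log(sigma) +
    sum(pnorm(eps * lambda / sigma, lower.tail = FALSE, log.p = TRUE)) -
    sum(eps^2) / (2 * sigma^2)
}

# Solve sigma^2 = sigma_u^2 + sigma_v^2 and lambda = sigma_u / sigma_v.
sigma.components <- function(sigma.sq, lambda) {
  sigma.v.sq <- sigma.sq / (1 + lambda^2)
  sigma.u.sq <- sigma.sq - sigma.v.sq
  list(sigma.u = sqrt(sigma.u.sq), sigma.v = sqrt(sigma.v.sq))
}

# Simulated composed errors (assumed sigma_v = 0.1, sigma_u = 0.3).
set.seed(42)
eps <- rnorm(500, 0, 0.1) - abs(rnorm(500, 0, 0.3))

sigma.hat <- sqrt(mean(eps^2))
lambda.hat <- optimize(sf.loglik, interval = c(0, 20),
                       eps = eps, sigma = sigma.hat,
                       maximum = TRUE)$maximum
parts <- sigma.components(sigma.hat^2, lambda.hat)
```

The two identities give σ_v^2 = σ^2 / (1 + λ^2) and σ_u^2 = σ^2 λ^2 / (1 + λ^2), which is what `sigma.components` computes.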

Mean efficiency over the sample is given by

\exp\left(-\frac{\sqrt{2}}{\sqrt{π}}σ_u\right),

and modal efficiency for each observation is given by

-\varepsilon_i(σ_u^2/σ^2).
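A small numeric sketch of these two summaries follows. It assumes the mean-efficiency expression is read as exp(-sqrt(2/π) σ_u), using the half-normal mean E[u] = σ_u sqrt(2/π), and it evaluates the per-observation modal expression exactly as written above. The parameter values and the error vector are made up for the example.

```r
# Assumed parameter values (illustrative only).
sigma.u <- 0.2
sigma.v <- 0.1
sigma.sq <- sigma.u^2 + sigma.v^2

# Hypothetical log-proportional errors for three observations.
eps <- c(-0.30, -0.05, 0.10)

# Mean efficiency over the sample, assuming exp(-E[u]) with
# E[u] = sigma_u * sqrt(2 / pi) for a half-normal u.
mean.efficiency <- exp(-sqrt(2 / pi) * sigma.u)

# Per-observation modal expression: -eps_i * (sigma_u^2 / sigma^2).
mode.term <- -eps * (sigma.u^2 / sigma.sq)
```

Observations with more negative errors receive larger values of the modal expression, since a larger share of their shortfall is attributed to the one-sided term.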

Returns a list with the following elements

`y.fit`	Estimated value of the frontier at X.fit
`gradient.fit`	Estimated gradient of the frontier at X.fit
`mean.efficiency`	Average efficiency for X, y as a whole
`mode.efficiency`	Modal efficiencies for each observation in X, y
`X.eval`	Matrix of inputs used for fitting
`X.constrained`	Matrix of inputs where constraints apply
`X.fit`	Matrix of inputs where curve is fit
`H.inv`	Inverse smoothing matrix used in fitting
`method`	Method used to fit frontier
`scaling.factor`	Factor by which constraints are multiplied before quadratic programming

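Aigner, D., Lovell, C. A. K., and Schmidt, P. (1977). Formulation and estimation of stochastic frontier production function models. Journal of Econometrics, 6(1), 21-37.

Parmeter, C. F. and Racine, J. S. (2013). Smooth constrained frontier analysis. In Recent Advances and Future Directions in Causality, Prediction, and Specification Analysis. Springer.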

library(snfa)

data(USMacro)

USMacro <- USMacro[complete.cases(USMacro),]

# Extract data
X <- as.matrix(USMacro[,c("K", "L")])
y <- USMacro$Y

# Fit frontier
# Store the result under a name that does not shadow the fit.sf function
frontier.fit <- fit.sf(X, y,
                       X.constrained = X,
                       method = "mc")

print(frontier.fit$mean.efficiency)
# [1] 0.9772484

# Plot efficiency over time
library(ggplot2)

plot.df <- data.frame(Year = USMacro$Year,
                      Efficiency = frontier.fit$mode.efficiency)

ggplot(plot.df, aes(Year, Efficiency)) +
  geom_line()