knitr::opts_chunk$set(collapse = T, comment = "#>", fig.align = "center", fig.width = 5) options(tibble.print_min = 4L, tibble.print_max = 4L) library(ggplot2) library(snfa) set.seed(100)
Economic efficiency analyses consider two separate measures of efficiency:
Technical efficiency: How effectively inputs are transformed into outputs
Allocative inefficiency: How effectively the production plan maximizes profit
Traditionally, data envelopment analysis (DEA) and stochastic frontier analysis (SFA) have been used to estimate technical efficiency. Both methods estimate a production frontier that quantifies the maximum quantity of output producible by a given bundle of inputs. The observed output of a decision making unit (DMU) can be compared to the frontier to arrive at an estimate of efficiency. SFA assumes efficiency is randomly determined and follows a distribution common across DMUs (Aigner et al. 1977). A standard stochastic frontier model takes the form $$y_i = f(X_i) - \delta_i + \varepsilon_i,$$ where $i$ indexes the DMU, $y_i$ is log-output, $f$ is a transformation function, $X_i$ are inputs, $\delta_i > 0$ is log-efficiency, and $\varepsilon_i$ is observational error. The log-efficiency term $\delta_i$ may be assumed to follow a half-normal, exponential, or other positively-defined distribution. Traditionally, SFA uses a parametric approximation for $f$.
DEA, on the other hand, assumes the production frontier is deterministic and defined by highest observed output (conditional on inputs used). The estimated frontier is piecewise-linear connecting maximum observed output conditional on inputs used while enforcing monotonicity and concavity constraints (as well as constraints on returns to scale, if those are specified). An example DEA frontier using variable returns to scale is shown below:
library(ggplot2) # library(rDEA) #rDEA not installing with travis-ci, dea method below is included in snfa data(univariate) dea.fit <- dea(univariate$x, univariate$y, univariate$x, univariate$y, model = "output", RTS = "variable") univariate$frontier <- univariate$y / dea.fit$thetaOpt ggplot(univariate, aes(x, y)) + geom_point() + geom_line(aes(y = frontier), color = "red")
Allocative efficiency is evaluated by checking whether DMUs' first-order condition for profit-maximization is satisifed. Specifically, if DMUs are price-takers in input and output markets, first-order conditions are given by $$\frac{\partial f(X_i)}{\partial x^j} = \frac{w_i^j}{p_i},$$ where superscripts index inputs, $w_i^j$ is the cost of input $j$ to DMU $i$, and $p_i$ is the price of output for DMU $i$. If efficiency is multiplicative, so that $f(X_i) = p(X_i)\delta_i$ for a frontier function $p$, then the first-order condition can be expressed as $$\delta_i\frac{\partial p(X_i)}{\partial x^j} = \frac{w_i^j}{p_i}.$$ Empircally, log-overallocation of input $j$ by DMU $i$ can be estimated by $$\log\left(w_i^j\right) + \log\left(p_i\right) - \log\left(\delta_i\frac{\partial p(X_i)}{\partial x^j}\right).$$ If this quantity is positive, DMU $i$ used more of input $j$ than would be profit maximizing.
As detailed above, to estimate allocative inefficiency one needs to be able to estimate marginal input productivities, which are derivatives of the production function/frontier. Since DEA is piecewise-linear, there can be points on the estimated frontier where the derivative is undefined. Using the previous example, the points in blue in the following plot have undefined derivatives.
tol = 1e-3 univariate$undefined <- abs(dea.fit$thetaOpt - 1) < tol ggplot(univariate, aes(x, y)) + geom_line(aes(y = frontier), color = "red") + geom_point(aes(color = undefined)) + scale_color_manual(values = c("black", "dodgerblue1")) + theme(legend.position = "none")
Thus, DEA is inappropriate for estimating allocative inefficiency.
Smooth non-parametric frontier analysis (SNFA) is a smooth analogue of DEA (Racine et al. 2009). It uses constrained kernel smoothing that ensures the estimated frontier lies above observed output for each observation, as well as imposing monotonicity or concavity constraints, if specified. An increasing, concave boundary is fit below:
X <- as.matrix(univariate$x) y <- univariate$y N.fit <- 100 X.fit <- as.matrix(seq(min(X), max(X), length.out = N.fit)) #Reflect data for fitting reflected.data <- reflect.data(X, y) X.eval <- reflected.data$X y.eval <- reflected.data$y frontier.mc <- fit.boundary(X.eval, y.eval, X.bounded = X, y.bounded = y, X.constrained = X.fit, X.fit = X.fit, method = "mc") frontier.df <- data.frame(x = X.fit, y = frontier.mc$y.fit) ggplot(univariate, aes(x, y)) + geom_point() + geom_line(data = frontier.df, color = "red") slope.df <- data.frame(x = X.fit, slope = frontier.mc$gradient.fit) ggplot(slope.df, aes(x, slope)) + geom_line()
A more in-depth example examining different constraints can be found in the example for fit.boundary
:
example("fit.boundary")
The function allocative.efficiency
uses SNFA to fit a production frontier, then uses the frontier to derive estimates of marginal input productivities. Those marginal productivities are compared to ratio of input to output prices to arrive at an estimate of overallocation. The example in allocative.efficiency
estimates overallocation of labor and capital in the U.S. using macroeconomic data. First, data is loaded and cleaned:
data(USMacro) USMacro <- USMacro[complete.cases(USMacro),] #Extract data X <- as.matrix(USMacro[,c("K", "L")]) y <- USMacro$Y X.price <- as.matrix(USMacro[,c("K.price", "L.price")]) y.price <- rep(1e9, nrow(USMacro)) #Price of $1 billion of output is $1 billion
Then, the model is fit with allocative.efficiency
:
#Run model efficiency.model <- allocative.efficiency(X, y, X.price, y.price, X.constrained = X, model = "br", method = "mc")
Finally, results are plotted and average overallocation is estimated:
#Plot technical/allocative efficiency over time library(ggplot2) technical.df <- data.frame(Year = USMacro$Year, Efficiency = efficiency.model$technical.efficiency) ggplot(technical.df, aes(Year, Efficiency)) + geom_line() allocative.df <- data.frame(Year = rep(USMacro$Year, times = 2), log.overallocation = c(efficiency.model$log.overallocation[,1], efficiency.model$log.overallocation[,2]), Variable = rep(c("K", "L"), each = nrow(USMacro))) ggplot(allocative.df, aes(Year, log.overallocation)) + geom_line(aes(color = Variable)) #Estimate average overallocation across sample period lm.model <- lm(log.overallocation ~ 0 + Variable, allocative.df) summary(lm.model)
Aigner D, Lovell CK, Schmidt P (1977). "Formulation and estimation of stochastic frontier production function models." Journal of Econometrics, 6(1), 21-37.
Racine JS, Parmeter CF, Du P (2009). "Constrained nonparametric kernel regression: Estimation and inference." Working paper.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.