dea.boot: Bootstrap DEA models

The function dea.boot bootstraps DEA models and returns bootstrap estimates of Farrell efficiencies. It is slower than boot.sw98 from the package FEAR. The faster function boot.fear is a wrapper for boot.sw98 that returns the results directly as Farrell measures.


dea.boot(X, Y, NREP = 200, EFF = NULL, RTS = "vrs", ORIENTATION="in", 
         alpha = 0.05, XREF = NULL, YREF = NULL, FRONT.IDX=NULL, 

boot.fear(X, Y, NREP = 200, EFF = NULL, RTS = "vrs", ORIENTATION = "in", 
         alpha = 0.05, XREF = NULL, YREF = NULL, EREF = NULL)



Inputs of firms to be evaluated, a K x m matrix of observations of K firms with m inputs (firm x input)


Outputs of firms to be evaluated, a K x n matrix of observations of K firms with n outputs (firm x output).


Number of bootstrap replications


Efficiencies for (X,Y) relative to the technology generated from (XREF,YREF).


The returns to scale assumption, as in dea. Only "vrs", "drs", and "crs" are supported; more to come.


Input efficiency "in" (1), output efficiency "out" (2), and graph efficiency "graph" (3).


One minus the confidence level of the interval for the bias-corrected efficiencies; the default 0.05 gives 95% intervals.


Inputs of the firms determining the technology, defaults to X.


Outputs of the firms determining the technology, defaults to Y.


Index for firms determining the technology.


Efficiencies for the firms in XREF, YREF.


Does not yet work and is therefore not used.


Input and output matrices are K x m and K x n for the default value TRANSPOSE=FALSE; this is standard in R for statistical models. When TRANSPOSE=TRUE data matrices are m x K and n x K.


Farrell input efficiencies are bootstrapped via the Shephard input distance function, the inverse of the Farrell input efficiency. The option is only relevant for the input and graph orientations.


Only for debugging purposes.


Possible controls to lpSolveAPI, see the documentation for that package. For examples of use see the function dea.
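The firm-by-variable layout described for the data matrices can be checked with a small base R sketch. The numbers are made up for illustration, and the object names Xt and Yt are hypothetical.

```r
# Hypothetical data: K = 4 firms, m = 2 inputs, n = 1 output
X <- matrix(c(100, 200, 300, 500,
               10,  15,  30,  40), ncol = 2)   # K x m, one row per firm
Y <- matrix(c(75, 100, 300, 400), ncol = 1)    # K x n, one row per firm

dim(X)   # 4 2
dim(Y)   # 4 1

# Layout expected when TRANSPOSE=TRUE: one column per firm
Xt <- t(X)   # m x K
Yt <- t(Y)   # n x K
dim(Xt)  # 2 4
dim(Yt)  # 1 4
```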


The details are lightly explained in Bogetoft and Otto (2011) Chap. 6, and with more mathematical details in Daraio and Simar (2007) Sect. 3.4 and in Simar and Wilson (1998).

The bootstrap at the moment does not work for any kind of directional efficiency.

The returned confidence intervals are for the bias corrected efficiencies; to get confidence intervals for the uncorrected efficiencies add the biases to both upper and lower values for the intervals.
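The shift described above amounts to adding each firm's bias to both bounds of its interval. A minimal base R sketch with made-up numbers follows; the variable names conf.int, bias, and conf.int.unc are illustrative assumptions, and the values are not output from a real run.

```r
# Hypothetical values for K = 3 firms (not output from a real run)
bias     <- c(0.05, 0.03, 0.08)                  # bootstrap bias estimates
conf.int <- cbind(lower = c(0.62, 0.79, 0.50),
                  upper = c(0.74, 0.88, 0.66))   # intervals for the
                                                 # bias-corrected efficiencies

# Shift both bounds by the firm's bias to get intervals for the
# uncorrected efficiencies; the vector is recycled down each column,
# so row i is shifted by bias[i] and the interval widths are unchanged
conf.int.unc <- conf.int + bias
conf.int.unc
```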

Under the default option SHEPHARD.INPUT=TRUE, bias and bias-corrected efficiencies are calculated for the Shephard input distance function and then transformed to Farrell input efficiencies, to avoid possible negative bias-corrected input efficiencies. If this is not wanted, use the option SHEPHARD.INPUT=FALSE. This option is only relevant for the input and graph orientations.
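Why a correction on the Farrell scale can turn negative, and why the Shephard scale avoids it, can be seen in a small sketch. It assumes the usual bootstrap bias estimate (mean of the replicas minus the original estimate); the package's internal computation may differ in detail, and all numbers and variable names are hypothetical.

```r
# Hypothetical numbers for a single firm (illustration only)
eff  <- 0.10                    # estimated Farrell input efficiency
boot <- c(0.20, 0.25, 0.30)     # bootstrap replicas of that efficiency

# Bias correction applied directly to the Farrell efficiency
bias.farrell   <- mean(boot) - eff
eff.bc.farrell <- eff - bias.farrell      # negative

# Same correction applied to the Shephard distance 1/eff,
# then transformed back to a Farrell efficiency
dist            <- 1 / eff
bias.shephard   <- mean(1 / boot) - dist
dist.bc         <- dist - bias.shephard
eff.bc.shephard <- 1 / dist.bc            # positive, about 0.063

c(farrell = eff.bc.farrell, shephard = eff.bc.shephard)
```

In this sketch the Shephard bias is negative, so the corrected distance is larger than the original one and its inverse stays strictly between 0 and the estimated efficiency.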


The returned values from both functions are as follows:




Bias-corrected efficiencies


An array of bootstrap bias estimates for the K firms

K x 2 matrix with confidence interval for the estimated efficiencies


An array of bootstrap variance estimates for the K firms


The replica bootstrap estimates of the Farrell efficiencies, a K x NREP matrix


The function dea.boot does not depend on the FEAR package and can therefore be used on computers where the package FEAR is not available. This, however, comes with a time penalty: it takes around 4 times longer to run than using FEAR directly.

The bootstrap estimates of efficiencies returned from FEAR::boot.sw98 are sorted for each firm individually. Unfortunately, this means that a column of the replica matrix does not hold the efficiencies from one and the same bootstrap replica; its entries may well come from different replicas. Consequently, boot.fear cannot be used to bootstrap tests of statistical hypotheses where the statistic involves summing the firms' efficiencies.
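The effect of per-firm sorting on a sum statistic can be illustrated with a tiny base R sketch. The matrix below is made up and merely mimics the K x NREP layout of the replicas; it is not output from either function.

```r
# Hypothetical K x NREP matrix: 2 firms, 3 bootstrap replicas
boot <- rbind(firm1 = c(0.9, 0.7, 0.8),
              firm2 = c(0.5, 0.6, 0.4))

# Column sums when each column is one genuine replica
colSums(boot)          # 1.4 1.3 1.2

# Sorting each firm's row separately mixes replicas across columns
boot.sorted <- t(apply(boot, 1, sort))
colSums(boot.sorted)   # 1.1 1.3 1.5
```

The sorted matrix pairs each firm's smallest value with every other firm's smallest value, so the column sums no longer describe the sampling distribution of the sum.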

If a numerical problem occurs, status=5, or if no solution can be found, the best solution is often to scale the input X and output Y yourself or use the option CONTROL to change scaling in the program itself, as described in the notes for dea.
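One way to scale the data yourself is to divide every input and output column by a positive constant. This only changes the units of measurement, so radial efficiencies are unaffected. The sketch below uses column means, which is one possible choice among many; the data and variable names are hypothetical.

```r
# Hypothetical data with variables on very different scales
X <- cbind(labour  = c(12, 25, 31, 48),
           capital = c(1.2e6, 2.5e6, 2.9e6, 5.1e6))
Y <- cbind(output = c(3.0e4, 5.5e4, 8.0e4, 9.0e4))

# Divide each column by its own mean so every variable is of order 1
X.scaled <- sweep(X, 2, colMeans(X), "/")
Y.scaled <- sweep(Y, 2, colMeans(Y), "/")

colMeans(X.scaled)   # both columns now have mean 1
colMeans(Y.scaled)   # mean 1
```

The scaled matrices X.scaled and Y.scaled can then be passed in place of X and Y.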


Peter Bogetoft and Lars Otto


x <- matrix(c(100,200,300,500,100,200,600),ncol=1)
y <- matrix(c( 75,100,300,400, 25, 50,400),ncol=1)

e <- dea(x,y)

#  To bootstrap for real, NREP should be at least 2000. Run the
#  following lines a couple of times with nrep <- 100 and see how the
#  bootstrap frontier changes from one run to the next. Try the same
#  with nrep <- 2000 even though it does take a longer time to run,
#  especially for dea.boot.
nrep <- 5
# nrep <- 2000

# if ( "FEAR" %in% .packages(TRUE) )  {
##  The following only works if the package FEAR is installed; it does
##  not have to be loaded.
#  b <- boot.fear(x,y, NREP=nrep)
# } else {
  b <- dea.boot(x,y, NREP=nrep)
# }

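#  Plot the estimated frontier, then overlay the bias corrected frontier
dea.plot.frontier(x, y, txt=TRUE)
dea.plot.frontier(b$eff.bc*x, y, add=TRUE, lty="dashed")
#  Outer 95% confidence interval frontier for the uncorrected frontier
dea.plot.frontier((b$conf.int[,1]+b$bias)*x, y, add=TRUE, lty="dotted")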

## Test of hypothesis in DEA model
# Null hypothesis is that technology is CRS and the alternative is VRS
# Bogetoft and Otto (2011) pages 183-185.
ec <- dea(x,y, RTS="crs")
Ec <- eff(ec)
ev <- dea(x,y, RTS="vrs")
Ev <- eff(ev)
# The test statistic; equation (6.1)
S <- sum(Ec)/sum(Ev)

# To calculate CRS and VRS efficiencies in the same bootstrap replicas
# we reset the random number generator before each call of the
# function dea.boot.

# To get an initial value for the random number generating process
# we draw a seed and save it (the upper bound 1e9 is arbitrary)
save.seed <- sample.int(1e9, 1)

# Bootstrap and calculate CRS and VRS under the assumption that
# the true technology is CRS (the null hypothesis) and such that the
# results correspond to the case where CRS and VRS are calculated for
# the same reference set of firms; to make this happen we set the
# random number generator to the same state before each call.
set.seed(save.seed)
bc <- dea.boot(x,y, NREP=nrep, RTS="crs")
set.seed(save.seed)
bv <- dea.boot(x,y, NREP=nrep, RTS="vrs", XREF=x,YREF=y, EREF=ec$eff)

# Calculate the statistic for each bootstrap replica
bs <- colSums(bc$boot)/colSums(bv$boot)
# The critical value for the test (default size alpha of test is 5%)
critValue(bs, alpha=.1)
# Accept the hypothesis at 10% level?
critValue(bs, alpha=.1) <= S

# The probability of observing a smaller value of S when the
# hypothesis is true; the p-value.
typeIerror(S, bs)
# Accept the hypothesis at size level 10%?
typeIerror(S, bs) >= .10

