teradialbc: Statistical Inference Regarding the Radial Measure of...
In npsf: Nonparametric and Stochastic Efficiency and Productivity Analysis

Description Usage Arguments Details Value Note Author(s) References See Also Examples

teradialbc(formula, data, subset,
 ref = NULL, data.ref = NULL, subset.ref = NULL,
 rts = c("C", "NI", "V"), base = c("output", "input"),
 homogeneous = TRUE, smoothed = TRUE, kappa = NULL,
 reps = 999, level = 95,
 core.count = 1, cl.type = c("SOCK", "MPI"),
 print.level = 1, dots = TRUE)

`formula`	an object of class “formula” (or one that can be coerced to that class): a symbolic description of the model. The details of model specification are given under ‘Details’.
`data`	an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment (`formula`), typically the environment from which `teradial` is called.
`subset`	an optional vector specifying a subset of observations for which technical efficiency is to be computed.
`rts`	character or numeric. string: first letter of the word “c” for constant, “n” for non-increasing, or “v” for variable returns to scale assumption. numeric: 3 for constant, 2 for non-increasing, or 1 for variable returns to scale assumption.
`base`	character or numeric. string: first letter of the word “o” for computing output-based or “i” for computing input-based technical efficiency measure. string: 2 for computing output-based or 1 for computing input-based technical efficiency measure
`ref`	an object of class “formula” (or one that can be coerced to that class): a symbolic description of inputs and outputs that are used to define the technology reference set. The details of technology reference set specification are given under ‘Details’. If reference is not provided, the technical efficiency measures for data points are computed relative to technology based on data points themselves.
`data.ref`	an optional data frame containing the variables in the technology reference set. If not found in `data.ref`, the variables are taken from environment(ref), typically the environment from which `teradial` is called.
`subset.ref`	an optional vector specifying a subset of observations to define the technology reference set.
`smoothed`	logical. If TRUE, the reference set is bootstrapped with smoothing; if FALSE, the reference set is bootstrapped with subsampling.
`homogeneous`	logical. Relevant if `smoothed=TRUE`. If TRUE, the reference set is bootstrapped with homogeneous smoothing; if FALSE, the reference set is bootstrapped with heterogeneous smoothing.
`kappa`	relevant if `smoothed=TRUE`. 'kappa' sets the size of the subsample as K^kappa, where K is the number of data points in the original reference set. The default value is 0.7. 'kappa' may be between 0.5 and 1.
`reps`	specifies the number of bootstrap replications to be performed. The default is 999. The minimum is 100. Adequate estimates of confidence intervals using bias-corrected methods typically require 1,000 or more replications.
`level`	sets confidence level for confidence intervals; default is `level = 95`.
`core.count`	positive integer. Number of cluster nodes. If `core.count=1`, the process runs sequentially. See `performParallel` in package `snowFT` for more details.
`cl.type`	Character string that specifies cluster type (see `makeClusterFT` in package `snowFT`). Possible values are 'MPI' and 'SOCK' ('PVM' is currently not available). See `performParallel` in package `snowFT` for more details.
`dots`	logical. Relevant if `print.level>=1`. If TRUE, one dot character is displayed for each successful replication; if FALSE, display of the replication dots is suppressed.
`print.level`	numeric. 0 - nothing is printed; 1 - print summary of the model and data. 2 - print summary of technical efficiency measures. 3 - print estimation results observation by observation. Default is 1.

Routine teradialbc performs bias correction of the radial Debrue-Farrell input- or output-based measure of technical efficiency, computes bias and constructs confidence intervals via bootstrapping techniques. See Simar and Wilson (1998) doi: 10.1287/mnsc.44.1.49, Simar and Wilson (2000) doi: 10.1080/02664760050081951, Kneip, Simar, and Wilson (2008) doi: 10.1017/S0266466608080651, and references with links below.

Models for teradialbc are specified symbolically. A typical model has the form outputs ~ inputs, where outputs (inputs) is a series of (numeric) terms which specifies outputs (inputs). The same goes for reference set. Refer to the examples.

If core.count>=1, teradialbc will perform bootstrap on multiple cores. Parallel computing requires package snowFT. By the default cluster type is defined by option cl.type="SOCK". Specifying cl.type="MPI" requires package Rmpi.

On some systems, specifying option cl.type="SOCK" results in much quicker execution than specifying option cl.type="MPI". Option cl.type="SOCK" might be problematic on Mac system.

Parallel computing make a difference for large data sets. Specifying option dots=TRUE will indicate at what speed the bootstrap actually proceeds. Specify reps=100 and compare two runs with option core.count=1 and core.count>1 to see if parallel computing speeds up the bootstrap. For small samples, parallel computing may actually slow down the teradialbc.

Results can be summarized using summary.npsf.

teradialbc returns a list of class npsf containing the following elements:

`K`	numeric: number of data points.
`M`	numeric: number of outputs.
`N`	numeric: number of inputs.
`rts`	string: RTS assumption.
`base`	string: base for efficiency measurement.
`reps`	numeric: number of bootstrap replications.
`level`	numeric: confidence level for confidence intervals.
`te`	numeric: radial measure (Russell) of technical efficiency.
`tebc`	numeric: bias-corrected radial measures of technical efficiency.
`biasboot`	numeric: bootstrap bias estimate for the original radial measures of technical efficiency.
`varboot`	numeric: bootstrap variance estimate for the radial measures of technical efficiency.
`biassqvar`	numeric: one-third of the ratio of bias squared to variance for radial measures of technical efficiency.
`realreps`	numeric: actual number of replications used for statistical inference.
`telow`	numeric: lower bound estimate for radial measures of technical efficiency.
`teupp`	numeric: upper bound estimate for radial measures of technical efficiency.
`teboot`	numeric: `reps x K` matrix containing bootstrapped measures of technical efficiency from each of `reps` bootstrap replications.
`esample`	logical: returns TRUE if the observation in user supplied data is in the estimation subsample and FALSE otherwise.

Before specifying option homogeneous it is advised to preform the test of independence (see nptestind). Routine nptestrts may help deciding regarding option rts.

Results can be summarized using summary.npsf.

Oleg Badunenko <oleg.badunenko@brunel.ac.uk>, Pavlo Mozharovskyi <pavlo.mozharovskyi@telecom-paris.fr>

Badunenko, O. and Mozharovskyi, P. (2016), Nonparametric Frontier Analysis using Stata, Stata Journal, 163, 550–89, doi: 10.1177/1536867X1601600302

Färe, R. and Lovell, C. A. K. (1978), Measuring the technical efficiency of production, Journal of Economic Theory, 19, 150–162, doi: 10.1016/0022-0531(78)90060-1

Färe, R., Grosskopf, S. and Lovell, C. A. K. (1994), Production Frontiers, Cambridge U.K.: Cambridge University Press, doi: 10.1017/CBO9780511551710

Kneip, A., Simar L., and P.W. Wilson (2008), Asymptotics and Consistent Bootstraps for DEA Estimators in Nonparametric Frontier Models, Econometric Theory, 24, 1663–1697, doi: 10.1017/S0266466608080651

Simar, L. and P.W. Wilson (1998), Sensitivity Analysis of Efficiency Scores: How to Bootstrap in Nonparametric Frontier Models, Management Science, 44, 49–61, doi: 10.1287/mnsc.44.1.49

Simar, L. and P.W. Wilson (2000), A General Methodology for Bootstrapping in Nonparametric Frontier Models, Journal of Applied Statistics, 27, 779–802, doi: 10.1080/02664760050081951

teradial, tenonradial, tenonradialbc, nptestrts, nptestind, sf

## Not run: 

require( npsf )

# Prepare data and matrices

data( pwt56 )
head( pwt56 )

# Create some missing values

pwt56 [49, "K"] <- NA # just to create missing

Y1 <- as.matrix ( pwt56[ pwt56$year == 1965, c("Y"), drop = FALSE] )
X1 <- as.matrix ( pwt56[ pwt56$year == 1965, c("K", "L"), drop = FALSE] )

X1 [51, 2] <- NA # just to create missing
X1 [49, 1] <- NA # just to create missing

data( ccr81 )
head( ccr81 )

# Create some missing values

ccr81 [64, "x4"] <- NA # just to create missing
ccr81 [68, "y2"] <- NA # just to create missing

Y2 <- as.matrix( ccr81[ , c("y1", "y2", "y3"), drop = FALSE] )
X2 <- as.matrix( ccr81[ , c("x1", "x2", "x3", "x4", "x5"), drop = FALSE] )

# Compute output-based measures of technical efficiency under 
# the assumption of CRS (the default) and perform bias-correctiion
# using smoothed homogeneous bootstrap (the default) with 999
# replications (the default).

t1 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81)

# or just

t2 <- teradialbc ( Y2 ~ X2)

# Combined formula and matrix

t3 <- teradialbc ( Y ~ K + L, data = pwt56, subset = Nu < 10, 
	ref = Y1[-2,] ~ X1[-1,] )

# Compute input-based measures of technical efficiency under 
# the assumption of VRS and perform bias-correctiion using
# subsampling heterogenous bootstrap with 1999 replications.
# Choose to report 99
# formed by data points where x5 is not equal 10. 
# Suppress printing dots.

t4 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, reps = 1999, 
	smoothed = FALSE, kappa = 0.7, dots = FALSE, 
	base = "i", rts = "v", level = 99)

# Compute input-based measures of technical efficiency under
# the assumption of NRS and perform bias-correctiion using 
# smoothed heterogenous bootstrap with 499 replications for 
# all data points. The reference set formed by data points 
# where x5 is not equal 10.

t5 <- teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, homogeneous = FALSE, 
	reps = 999, smoothed = TRUE, dots = TRUE, base = "i", rts = "n")


# ===========================
# ===  Parallel computing ===
# ===========================

# Perform previous bias-correction but use 8 cores and 
# cluster type SOCK

t51 <-  teradialbc ( y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	data = ccr81, ref = y1 + y2 + y3 ~ x1 + x2 + x3 + x4 + x5, 
	subset.ref = x5 != 10, data.ref = ccr81, homogeneous = FALSE, 
	reps = 999, smoothed = TRUE, dots = TRUE, base = "i", rts = "n", 
	core.count = 8, cl.type = "SOCK")


# Really large data-set

data(usmanuf)
head(usmanuf)

nrow(usmanuf)
table(usmanuf$year)

# This will take some time depending on computer power

data(usmanuf)
head(usmanuf)

t6 <- teradialbc ( Y ~ K + L + M, data = usmanuf, 
	subset = year >= 1999 & year <= 2000, homogeneous = FALSE, 
	base = "o", reps = 100, 
	core.count = 8, cl.type = "SOCK")


## End(Not run)