EAT: Efficiency Analysis Trees

View source: R/EAT.R

EAT R Documentation

Efficiency Analysis Trees

Description

This function estimates a stepped production frontier through regression trees.

Usage

EAT(
  data,
  x,
  y,
  numStop = 5,
  fold = 5,
  max.depth = NULL,
  max.leaves = NULL,
  na.rm = TRUE
)

Arguments

data

data.frame or matrix containing the variables in the model.

x

Column indexes of the input variables in data.

y

Column indexes of the output variables in data.

numStop

Minimum number of observations in a node for a split to be attempted.

fold

Number of folds into which the dataset is divided to apply cross-validation during pruning.

max.depth

Maximum depth of the tree.

max.leaves

Maximum number of leaf nodes.

na.rm

logical. If TRUE, NA rows are omitted.
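
As a rough illustration of how x, y and na.rm relate to data, the base R sketch below selects input and output columns by index and drops incomplete rows. It does not call EAT: the data frame `toy` is hypothetical, and using `complete.cases` to mimic na.rm = TRUE is an assumption for illustration, not necessarily the package's internal preprocessing.

```r
# Hypothetical toy data: two inputs (columns 1-2) and one output (column 3).
toy <- data.frame(
  x1 = c(1, 2, NA, 4),
  x2 = c(3, 1, 2, 5),
  y1 = c(2, 3, 4, NA)
)

x <- c(1, 2)  # column indexes of the inputs
y <- 3        # column index of the output

# Assumed analogue of na.rm = TRUE: keep only rows with no NA in the
# selected input and output columns.
keep  <- complete.cases(toy[, c(x, y), drop = FALSE])
clean <- toy[keep, , drop = FALSE]

# drop = FALSE keeps data frames even when a single column is selected.
inputs  <- clean[, x, drop = FALSE]
outputs <- clean[, y, drop = FALSE]

print(names(inputs))
print(names(outputs))
print(nrow(clean))
```

Indexing by position means that reordering the columns of data changes which variables are treated as inputs and outputs, so it is worth checking the selected names before fitting.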

Details

The EAT function fits a regression tree model based on CART (Breiman et al., 1984), adapted so that the resulting estimate is a stepped production frontier that satisfies free disposability. This frontier shares these properties with the FDH frontier (Deprins et al., 1984) but mitigates some of FDH's drawbacks, such as overfitting and the underestimation of technical inefficiency. See Esteve et al. (2020) for more details.
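
To make the idea of a stepped frontier with free disposability concrete, here is a minimal base R sketch of an FDH-style estimate for one input and one output. It is not the EAT algorithm and does not use the package: the sample values, the helper name `fdh_frontier`, and the convention of returning 0 when no observed unit uses as little input are all assumptions made for illustration.

```r
# Hypothetical single-input, single-output sample (not from the package).
x_obs <- c(1, 2, 3, 4, 5)
y_obs <- c(2, 1, 4, 3, 6)

# FDH-style estimate at input level x0: the largest observed output among
# units that use no more input than x0. By assumed convention, returns 0
# when no observed unit qualifies.
fdh_frontier <- function(x0, x_obs, y_obs) {
  feasible <- x_obs <= x0
  if (!any(feasible)) {
    return(0)
  }
  max(y_obs[feasible])
}

grid <- c(0.5, 1, 2.5, 3, 4.5, 5)
frontier <- vapply(
  grid,
  function(g) fdh_frontier(g, x_obs, y_obs),
  numeric(1)
)
print(frontier)

# Free disposability implies the estimate never decreases as input grows.
stopifnot(all(diff(frontier) >= 0))
```

The estimate is constant between observed input levels and jumps only where a unit with higher output becomes feasible, which is what gives the frontier its stepped shape.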

Value

An EAT object containing:

  • data

    • df: data frame containing the variables in the model.

    • x: input indexes in data.

    • y: output indexes in data.

    • input_names: input variable names.

    • output_names: output variable names.

    • row_names: row names in data.

  • control

    • fold: fold hyperparameter value.

    • numStop: numStop hyperparameter value.

    • max.leaves: max.leaves hyperparameter value.

    • max.depth: max.depth hyperparameter value.

    • na.rm: na.rm hyperparameter value.

  • tree: list structure containing the EAT nodes.

  • nodes_df: data frame containing the following information for each node.

    • id: node index.

    • SL: left child node index.

    • N: number of observations at the node.

    • Proportion: proportion of observations at the node.

    • the output predictions.

    • R: the error at the node.

    • index: observation indexes at the node.

  • model

    • nodes: total number of nodes in the tree.

    • leaf_nodes: number of leaf nodes in the tree.

    • a: lower bound of the nodes.

    • y: output predictions.
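
The nested structure above can be navigated like an ordinary list. The sketch below assumes the components are reachable with `$`, and it works on a hand-built stand-in (`mock`) rather than a real fit. The stand-in, its values, and the helper `describe_fit` are hypothetical, and only a few of the documented fields are included.

```r
# Hypothetical stand-in that mimics part of the documented structure.
# A real object would come from a call to EAT().
mock <- list(
  control  = list(fold = 5, numStop = 5, na.rm = TRUE),
  nodes_df = data.frame(id = 1:3, N = c(50, 30, 20)),
  model    = list(nodes = 3, leaf_nodes = 2)
)

# Hypothetical helper: summarise tree size and the main hyperparameters.
describe_fit <- function(obj) {
  sprintf(
    "%d nodes (%d leaves), numStop = %d, fold = %d",
    as.integer(obj$model$nodes),
    as.integer(obj$model$leaf_nodes),
    as.integer(obj$control$numStop),
    as.integer(obj$control$fold)
  )
}

summary_line <- describe_fit(mock)
print(summary_line)

# Per-node information lives in a regular data frame.
print(mock$nodes_df[mock$nodes_df$N >= 30, ])
```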

References

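Breiman L, Friedman J, Stone CJ, Olshen RA (1984). Classification and Regression Trees.

Deprins D, Simar L, Tulkens H (1984). "Measuring labor efficiency in post offices." In Marchand M, Pestieau P, Tulkens H (eds.), The Performance of Public Enterprises: Concepts and Measurements.

Esteve M, Aparicio J, Rabasa A, Rodriguez-Sirvent JJ (2020). "Efficiency analysis trees: A new methodology for estimating production frontiers through decision trees." Expert Systems with Applications, 162, 113783.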

Examples

library(eat)

# ====================== #
# Single output scenario #
# ====================== #

simulated <- Y1.sim(N = 50, nX = 3)
EAT(data = simulated, x = c(1, 2, 3), y = 4, numStop = 10, fold = 5, max.leaves = 6)

# ====================== #
#  Multi output scenario #
# ====================== #

simulated <- X2Y2.sim(N = 50, border = 0.1)
EAT(data = simulated, x = c(1, 2), y = c(3, 4), numStop = 10, fold = 7, max.depth = 7)


eat documentation built on Jan. 10, 2023, 5:13 p.m.