knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

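Let $T_1$ and $T_2$ be two trees fit to the same data with $n$ observations and $k$ covariates, and let $\hat{y}_{i1}$ and $\hat{y}_{i2}$ be their predictions for observation $i$.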

Chipman, George, and McCulloch (1998)

Fit Metric

The fit metric is defined as

$$d\left(T_1,T_2\right)=\frac{1}{n}\sum_{i=1}^n m\left(\hat{y}_{i1},\hat{y}_{i2}\right)$$

where:

$$m\left(y_1,y_2\right)=\begin{cases} 1 & \mbox{if } y_1 \neq y_2 \\ 0 & \mbox{o.w.} \end{cases}$$
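As a minimal sketch of the fit metric for classification trees, the R function below takes two vectors of predicted classes, one per tree, and returns the proportion of observations on which the predictions disagree, so that identical predictions give a distance of 0. The name `fit_metric` and the example vectors are hypothetical and are not taken from TreeTracer.

```r
# Hypothetical helper: fit metric between two trees, given their predictions.
# pred1, pred2: predicted classes for the same n observations, in the same order.
fit_metric <- function(pred1, pred2) {
  stopifnot(length(pred1) == length(pred2), length(pred1) > 0)
  # m(y1, y2) is 1 when the two predicted classes differ and 0 otherwise;
  # the mean of these indicators is (1 / n) * sum(m).
  mean(as.character(pred1) != as.character(pred2))
}

# Made-up predictions from two trees for five observations
pred_t1 <- c("a", "a", "b", "b", "a")
pred_t2 <- c("a", "b", "b", "b", "b")
fit_metric(pred_t1, pred_t2)
#> [1] 0.4
```

The predictions are compared as character strings so that factors with different level orderings are still matched by label.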

Partition Metric

The partition metric is defined as

$$d\left(T_1, T_2\right)=\frac{\sum_{i>k}\left|I_1(i,k)-I_2(i,k)\right|}{n\choose2}$$

where:

$$I_1(i,k)=\begin{cases} 1 & \mbox{if } T_1 \mbox{ places observations } i \mbox{ and } k \mbox{ in the same terminal node} \\ 0 & \mbox{o.w.} \end{cases}$$

and $I_2(i,k)$ is defined analogously for $T_2$. Note: dividing by ${n \choose 2}$, the number of pairs of observations, scales the metric to the range $[0,1]$.
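The partition metric can be sketched in R from the terminal node assigned to each observation by each tree. The function name `partition_metric` and the example node labels are hypothetical; the sketch only assumes that each tree supplies one terminal node label per observation, in the same observation order.

```r
# Hypothetical helper: partition metric between two trees.
# nodes1, nodes2: terminal node label of each observation under T1 and T2.
partition_metric <- function(nodes1, nodes2) {
  n <- length(nodes1)
  stopifnot(length(nodes2) == n, n >= 2)
  # I_j(i, k): TRUE when tree j places observations i and k in the same node
  same1 <- outer(nodes1, nodes1, "==")
  same2 <- outer(nodes2, nodes2, "==")
  # Keep each unordered pair once (i > k)
  pairs <- lower.tri(same1)
  # |I_1 - I_2| is 1 exactly when the two indicators disagree
  sum(same1[pairs] != same2[pairs]) / choose(n, 2)
}

# Made-up terminal node labels for four observations
nodes_t1 <- c(1, 1, 2, 2)
nodes_t2 <- c(1, 2, 2, 2)
partition_metric(nodes_t1, nodes_t2)
#> [1] 0.5
```

Here three of the six pairs are grouped differently by the two trees, which gives 3 / 6 = 0.5. The `outer()` calls build $n \times n$ matrices, so this form is only practical for moderate $n$.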

Tree Metric

The tree metric, following Shannon and Banks (1998), is defined as

$$d(T_1,T_2)=\sum_{r \ \in\ \mbox{nodes}(T_1,T_2)}\alpha_r \, m\left(\mbox{rule}(T_1,r),\mbox{rule}(T_2,r)\right)$$

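where $\alpha_r$ is the weight given to node $r$ and $\mbox{rule}(T,r)$ is the splitting rule of tree $T$ at node $r$.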

Shannon and Banks (1998) let

$$m=\begin{cases} 1 & \mbox{if the variables at node } r \mbox{ differ between the two trees} \\ 0 & \mbox{o.w.} \end{cases}$$
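A rough R sketch of this variable-based tree metric follows. It rests on several assumptions that the text does not fix: each tree is represented as a named character vector mapping a node identifier to its split variable, $\mbox{nodes}(T_1,T_2)$ is taken as the union of the node identifiers, a node present in only one tree counts as a mismatch, and the weights $\alpha_r$ default to 1. The function name `tree_metric` is hypothetical.

```r
# Hypothetical helper: tree metric comparing split variables node by node.
# vars1, vars2: named character vectors, names = node ids, values = split variables.
# alpha: optional named numeric vector of node weights (assumed to default to 1).
tree_metric <- function(vars1, vars2, alpha = NULL) {
  nodes <- union(names(vars1), names(vars2))
  if (is.null(alpha)) alpha <- setNames(rep(1, length(nodes)), nodes)
  v1 <- vars1[nodes]  # NA where the node is absent from tree 1
  v2 <- vars2[nodes]  # NA where the node is absent from tree 2
  # m = 1 when the split variables differ or the node exists in only one tree
  mismatch <- is.na(v1) | is.na(v2) | v1 != v2
  sum(alpha[nodes] * mismatch)
}

# Made-up trees: node ids "1", "2", "3" with their split variables
vars_t1 <- c("1" = "x1", "2" = "x2", "3" = "x3")
vars_t2 <- c("1" = "x1", "2" = "x4")
tree_metric(vars_t1, vars_t2)
#> [1] 2
```

Node 1 matches, node 2 splits on different variables, and node 3 exists only in the first tree, so two nodes contribute to the sum. Passing a different `alpha`, for example weights that shrink with node depth, changes how much deeper mismatches count.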

Banerjee, Ding, and Noone (2012)

Covariate Metric

$$d_0(T_1, T_2)=\frac{\mbox{number of covariate mismatches between } T_1 \mbox{ and } T_2}{k}$$

(recall that $k$ is the number of covariates in the data)
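The covariate metric can be sketched in R under the assumption that a mismatch is a covariate used for splitting in exactly one of the two trees. The function name `covariate_metric` and the example inputs are hypothetical.

```r
# Hypothetical helper: covariate metric between two trees.
# vars1, vars2: names of the covariates used for splits in T1 and T2.
# k: total number of covariates in the data.
covariate_metric <- function(vars1, vars2, k) {
  stopifnot(k > 0)
  used1 <- unique(vars1)
  used2 <- unique(vars2)
  # Assumed definition of a mismatch: covariate appears in exactly one tree
  mismatches <- union(setdiff(used1, used2), setdiff(used2, used1))
  length(mismatches) / k
}

# Made-up example: five covariates in the data, x1 is shared by both trees
covariate_metric(c("x1", "x2", "x3"), c("x1", "x4"), k = 5)
#> [1] 0.6
```

The covariates `x2`, `x3`, and `x4` each appear in only one tree, which gives 3 / 5 = 0.6.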



goodekat/TreeTracer documentation built on April 19, 2023, 7:44 p.m.