```
library(FFTrees)
options(digits = 3)
knitr::opts_chunk$set(echo = TRUE, fig.width = 7.5, fig.height = 7.5, dpi = 100, out.width = "600px", fig.align='center', message = FALSE)
```

```
knitr::include_graphics("../inst/mushrooms.jpg")
```

The `mushrooms` dataset contains data about mushrooms (see `?mushrooms` for details). The goal of our model is to predict which mushrooms are poisonous based on `r ncol(mushrooms) - 1` cues, such as the mushroom's odor and color.

Here are the first few rows of the data:

```
head(mushrooms)
```

Let's create some trees using `FFTrees()`. We'll use the `train.p = .5` argument to split the original data into a 50% training set and a 50% testing set.

```
# Create FFTs from the mushrooms data
set.seed(100) # For replicability of the training / test data split

mushrooms.fft <- FFTrees(formula = poisonous ~.,
                         data = mushrooms,
                         train.p = .5,      # Split data into 50/50 training / test
                         main = "Mushrooms",
                         decision.labels = c("Safe", "Poison"))
```
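As an aside, a random 50/50 split can be sketched in a few lines of base R. This is only an illustration of the idea, not the code `FFTrees()` runs internally; the data frame `toy` and the names `train.idx`, `toy.train`, and `toy.test` are made up for this example.

```r
# Illustrative only: a random 50/50 row split in base R.
# `toy` is a made-up data frame, not the mushrooms data.
set.seed(100)
toy <- data.frame(x = 1:10, y = rep(c(TRUE, FALSE), 5))

n.train   <- floor(0.5 * nrow(toy))            # number of training rows
train.idx <- sample(nrow(toy), size = n.train) # randomly chosen row indices

toy.train <- toy[train.idx, ]                  # rows used to build trees
toy.test  <- toy[-train.idx, ]                 # held-out rows for evaluation

nrow(toy.train)
nrow(toy.test)
```

Because the split is random, setting a seed beforehand (as with `set.seed(100)` above) makes the same rows land in each half on every run.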

Here's basic information about the best performing FFT:

```
# Print information about the best performing tree
mushrooms.fft
```

Let's look at the individual cue training accuracies with `plot()`:

```
# Show mushrooms cue accuracies
plot(mushrooms.fft, what = "cues")
```

It looks like the cues `odor` and `sporepc` are the best predictors. In fact, the single cue *odor* has a hit rate of 97% and a false alarm rate of 0%! Based on this, we should expect the final trees to use just these cues.
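The hit rate and false alarm rate quoted above are simple ratios of decision counts. Here is a minimal sketch of how they are computed, using made-up labels rather than the `mushrooms` data; the vectors `truth` and `prediction` are hypothetical.

```r
# Hypothetical labels for eight cases (TRUE = poisonous); not from `mushrooms`.
truth      <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
prediction <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)

hits               <- sum(prediction & truth)    # poisonous, predicted poisonous
misses             <- sum(!prediction & truth)   # poisonous, predicted safe
false.alarms       <- sum(prediction & !truth)   # safe, predicted poisonous
correct.rejections <- sum(!prediction & !truth)  # safe, predicted safe

# Hit rate is the same quantity as sensitivity
hit.rate <- hits / (hits + misses)

# False alarm rate equals 1 - specificity
false.alarm.rate <- false.alarms / (false.alarms + correct.rejections)

hit.rate          # 0.75
false.alarm.rate  # 0.25
```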

Now let's plot the best training tree applied to the test dataset:

```
# Plot the best FFT for the mushrooms data
plot(mushrooms.fft, data = "test")
```

Indeed, it looks like the best tree only uses the *odor* and *sporepc* cues. In our test dataset, the tree had a false alarm rate of 0% (1 - specificity), and a hit rate of 85%.

Now, let's say that you talk to a mushroom expert who says that we are using the wrong cues. According to her, the best predictors for poisonous mushrooms are *ringtype* and *ringnum*. Let's build a set of trees with these cues and see how they perform relative to our initial tree:

```
# Create trees using only ringtype and ringnum
mushrooms.ring.fft <- FFTrees(formula = poisonous ~ ringtype + ringnum,
                              data = mushrooms,
                              train.p = .5,
                              main = "Mushrooms (Ring Only)",
                              decision.labels = c("Safe", "Poison"))
```

Here is the best training tree, applied to the test data:

```
plot(mushrooms.ring.fft, data = "test")
```

As we can see, this tree did not perform nearly as well as our earlier one.

```
knitr::include_graphics("../inst/virginica.jpg")
```

The `iris.v` dataset contains data about 150 flowers (see `?iris.v`). Our goal is to predict which flowers are of the class Virginica. In this example, we'll create trees using the entire dataset (without an explicit test dataset):

```
iris.fft <- FFTrees(formula = virginica ~.,
                    data = iris.v,
                    main = "Iris",
                    decision.labels = c("Not-V", "V"))
```

First, let's look at the individual cue training accuracies:

```
plot(iris.fft, what = "cues")
```

It looks like the cues *pet.wid* and *pet.len* are the best predictors. Based on this, we should expect the final trees to use one or both of these cues.

Now let's plot the best tree:

```
plot(iris.fft)
```

Indeed, it looks like the best tree only uses the *pet.wid* and *pet.len* cues. Because we fit the trees to the entire dataset, these are training results: the tree had a sensitivity of 100% and a specificity of 95%.

This tree did quite well, but what if someone wants a tree with the lowest possible false alarm rate? If we look at the ROC plot in the bottom-left corner of the plot above, we can see that tree #2 has a specificity close to 100%. Let's look at that tree:

```
plot(iris.fft, tree = 2) # Show tree #2
```

As you can see, this tree does indeed have a higher specificity. However, it comes at the cost of a lower sensitivity.
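The same trade-off shows up even with a single-cue decision rule. The sketch below is illustrative only: it uses base R's built-in `iris` data (not `iris.v`) and a simple threshold on `Petal.Width` rather than an FFT, and the thresholds 1.5 and 1.8 and the helper `rule.accuracy()` are arbitrary choices made for this example.

```r
# Illustrative only: base R's built-in `iris`, and a one-cue threshold
# rule rather than a fast-and-frugal tree.
is.virginica <- iris$Species == "virginica"

# Sensitivity and specificity of the rule
# "Petal.Width > threshold  =>  predict virginica"
rule.accuracy <- function(threshold) {
  predicted <- iris$Petal.Width > threshold
  c(sensitivity = mean(predicted[is.virginica]),    # virginica correctly flagged
    specificity = mean(!predicted[!is.virginica]))  # others correctly rejected
}

lenient <- rule.accuracy(1.5)  # lower threshold: more flowers called virginica
strict  <- rule.accuracy(1.8)  # higher threshold: fewer flowers called virginica

rbind(lenient, strict)
```

Raising the threshold can only remove flowers from the "virginica" predictions, so specificity cannot go down and sensitivity cannot go up. Choosing between trees involves the same kind of compromise.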
