The goal of treeco is to make it easy for R users to extract ecosystem and economic benefits of trees. Similar tools like i-Tree, Davey Tree Calculator, and OpenTreeMap are also available and I would encourage you to check them out. These tools heavily influenced treeco.
Note that this package is currently labeled as experimental. Users should expect breaking changes. For example, eco_run_all now requires both the common name and botanical name fields, where it once required only the former. I'm doing my best to make these changes only when they make sense and improve treeco significantly. My goal is to eventually change that label to stable. If that hasn't scared you away, please keep reading!
I'm going to use the trees dataset provided in base R:
str(trees)
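We have 3 variables and 31 observations. The first thing to look at is the variables. We have:
- Girth: the tree diameter in inches (despite the name)
- Height: the tree height in feet
- Volume: the volume of timber in cubic feet
We are missing three important and required bits of information:
- The common name
- The botanical name
- The region
Those three fields, along with DBH (diameter at breast height, which is what Girth records), are required to extract eco benefits.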
Given the data we have, we can't extract the benefits because we're missing too many fields. Fortunately, there is some info we can use in the docs: type ?trees in the R console to take a look. We see that these are Black Cherry trees. After some googling, I find that these trees were collected in the Allegheny National Forest in Pennsylvania. I'm going to add the common name as a field called common and a row number field called rn. More on why rn is added later.
library(treeco)
library(dplyr)
library(tibble)

trees <- trees %>%
  mutate(common = "black cherry tree") %>%
  rownames_to_column("rn") %>%
  as_tibble() %>%
  print()
Now all that's left is to identify the botanical name for a Black Cherry tree. This is required because all benefits rely on a master list of 3,000+ species created by i-Tree.
Since R is very strict, the value "black cherry tree" will not match i-Tree's "Black cherry tree" because of the capital "B". Even worse, i-Tree might call it "Black cherry" and omit the word "tree", which makes the link between the two that much more difficult to identify. The best treeco can do is quantify the similarity between the user's data and that master species list and then link the most similar record found in i-Tree. It does this first for the common name field and then for the botanical name, which is why both fields are required: to maximize the number of matches. This is where eco_guess plays a role, for example:
x <- c("common fig", "Commn FIG", "RED MAPLE")
eco_guess(x, "botanical")
And for the trees dataset, I can do something like:
trees <- trees %>%
  mutate(botanical = eco_guess(common, "botanical")) %>%
  print()
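As an aside, here is a rough sketch of why exact matching fails and what "quantifying similarity" can look like. It uses base R's adist (edit distance) against a tiny made-up master list. This is only an illustration under those assumptions; it is not necessarily how eco_guess is implemented, and the three-entry master vector is hypothetical.

```r
# Hypothetical stand-in for the master species list (the real one has 3,000+ entries)
master <- c("Black cherry", "Common fig", "Red maple")
user   <- c("black cherry tree", "Commn FIG", "RED MAPLE")

# Exact matching is case sensitive and unforgiving, so nothing matches
exact <- user %in% master

# Edit distance between every user value and every master value, ignoring case
d <- adist(user, master, ignore.case = TRUE)

# For each user value, pick the master entry with the smallest distance
best <- master[apply(d, 1, which.min)]

data.frame(user, exact, best)
```

Each user value ends up linked to its nearest master entry even though none of them match exactly.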
Finally, we need to identify the region code for Pennsylvania. I don't have a great way of doing this. Adding a function for identifying the region code via zipcode, state, city, etc. is on my list. For now, you can use Davey Tree's tree benefit calculator to figure out the region and then take a look at the money dataset for the code:
tmoney

tmoney %>%
  filter(region_name == "Northeast") %>%
  distinct(region) %>%
  .[[1]]
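Until a region helper exists, a small lookup of your own can stand in for the manual step. The sketch below is purely hypothetical: guess_region and region_lookup are not part of treeco, and the table holds only the one state-to-code pair used in this walkthrough.

```r
# Hypothetical lookup table; only this single pair comes from this walkthrough
region_lookup <- c(Pennsylvania = "NoEastXXX")

# Hypothetical helper, not a treeco function
guess_region <- function(state) {
  code <- unname(region_lookup[state])
  if (anyNA(code)) warning("No region code known for one or more states")
  code
}

guess_region("Pennsylvania")
```

Any state missing from the table comes back as NA with a warning, so gaps are visible rather than silently guessed.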
Before we calculate the benefits, note that most of the steps above won't usually be necessary. They're only there to construct and describe a data frame that eco_run_all needs. In most cases, the typical workflow will be:
my_trees <- eco_run_all(
  data = trees,
  common_col = "common",
  botanical_col = "botanical",
  dbh_col = "Girth",
  region = "NoEastXXX"
) %>%
  as_tibble() %>%
  print()
Notice that the height and volume fields are missing. This is because eco_run_all strips the input data of everything except what it needs: the row number, common name, botanical name, and DBH fields. It does this in an effort to keep the data small. Not too long ago, eco_run_all took two and a half minutes to calculate the benefits for 400,000 trees. It now takes a couple of seconds, depending on how unique the data is. The removal of unneeded data is why I add the rn field at the beginning: it preserves the row number so the original data can be linked to the benefits dataset my_trees:
trees %>%
  select(rn, Height, Volume) %>%
  right_join(my_trees) %>%
  glimpse()
This is especially useful given that most tree data is spatial and includes coordinates. Whether or not the approach (stripping data, then joining at the end) is a good idea is certainly up for debate and is another reminder of why this package is experimental.
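For readers who want to see the strip-then-join pattern in isolation, here is a minimal base R sketch. It uses a placeholder calculation in place of eco_run_all, so the benefit column is made up; the only point is how a row id lets the dropped columns be re-attached.

```r
# Start from the original dataset and add a row id, as done with rn above
input <- data.frame(
  rn = as.character(seq_len(nrow(datasets::trees))),
  datasets::trees
)

# Stand-in for the benefits step: keep only the id and the DBH column
slim <- input[, c("rn", "Girth")]
slim$benefit <- slim$Girth * 2  # placeholder arithmetic, not a real benefit

# Re-attach the columns that were stripped, matching on the row id
full <- merge(input[, c("rn", "Height", "Volume")], slim, by = "rn")

str(full)
```

Because rn is stored as character, merge sorts the result as text ("1", "10", "11", ...), so convert it to integer first if row order matters.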
I have a couple of ideas for the future of treeco:
- Update eco_run_all to tell the user how many records were matched.
- Use the sf package for mapping and other applications like guessing the user's region.
- Expand treeco to include additional benefits.
Any criticism, issues, or enhancements are encouraged and can be filed here.