knitr::opts_chunk$set(collapse = TRUE, comment = "#>") options(tibble.print_min = 4, tibble.print_max = 4)
library(forestexplorR) library(dplyr)
This vignette demonstrates how the neighborhood description and growth calculation functions of forestexplorR can be used to setup a neighborhood model of tree growth or a neighborhood model of tree mortality.
To build a neighborhood model of tree growth, we first need to describe tree
neighborhoods. The forestexplorR functions neighborhoods()
and
neighborhood_summary()
make this very simple.
# Construct neighborhoods for all trees in all stands nbhds <- neighborhoods(mapping, radius = 10) # Describe neighborhoods using angular species-specific densities nbhd_summ <- neighborhood_summary(nbhds, id_column = "tree_id", radius = 10, densities = "angular") # Combine neighborhoods with their summaries nbhds <- nbhds %>% left_join(nbhd_summ %>% select(-c(shannon, simpson, evenness)), by = "tree_id")
We have now summarized the neighborhoods of all trees, but our summaries are inaccurate for trees whose neighborhood overlaps a stand boundary i.e. we did not sample their entire neighborhood. Rather than estimating the missing portions of these trees' neighborhoods, we will exclude them from the model.
# Define neighborhood radius nb_rad <- 10 # Remove trees whose neighborhood overlaps a stand boundary nbhds <- nbhds %>% filter(x_coord >= nb_rad & x_coord <= 100 - nb_rad & y_coord >= nb_rad & y_coord <= 100 - nb_rad)
Next we calculate annual growth for our trees. We will also remove any trees that were only measured once (because we cannot calculate a growth rate for them) or had a nonsensical negative growth rate.
# Calculate annual growth for all trees growth <- growth_summary(tree) # Remove trees that were only measured once and/or had negative growth growth <- growth %>% filter(growth$first_record != growth$last_record & annual_growth >= 0)
Now we can add the growth rates to the neighborhoods data. For our model, we
will use size_corr_growth
because this is the closest to being normally
distributed. When combining we use inner_join()
because not all trees in
nbhds
have growth data e.g. trees measured only one time.
nbhds <- nbhds %>% inner_join(growth %>% select(tree_id, size_corr_growth), by = "tree_id")
We can also include additional covariates in our model such as abiotic data. In
fact we recommend that for models including data from multiple mapped stands,
additional explanatory variables that account for the major axes of variation
among the plots are included to account for potential spatial autocorrelation.
The code below adds abiotic data to nbhds
from the built-in dataset
stand_abiotic
.
nbhds <- nbhds %>% left_join(stand_abiotic, by = "stand_id")
Finally, we can drop some of the columns from nbhds
that we don't need for
our model.
nbhds <- nbhds %>% select(-c(stand_id, dbh, abh, x_coord, y_coord, id_comp, abh_comp))
Now we have a dataset containing all the variables we will use for this modeling demonstration.
growth_model()
: Create a neighborhood tree growth modelWhen modeling tree growth, it is common practice to create a separate model for each focal tree species because the drivers of growth are likely to differ among species. The code below isolates a single tree species and then removes the species variable because it is no longer useful.
one_species <- nbhds %>% filter(species == "ABAM") %>% select(-species)
The growth_model()
function also offers test set validation, so next we will
divide our dataset into a training and test set.
# Create vector of tree ids all_tree_ids <- unique(one_species$tree_id) # Assign 80% of trees to training training_trees <- sample(all_tree_ids, length(all_tree_ids) * 0.8) # Define training and test training <- one_species %>% filter(tree_id %in% training_trees) test <- setdiff(one_species, training)
Now we can call the growth_model()
function, which fits a linear
regularized regression model of tree growth. In the most basic use case, this
function requires only a training set and the name of the dependent variable,
which should be a column in the training set. We set the seed because the model
fitting process is stochastic.
set.seed(1000) basic_growth_mod <- growth_model(training, outcome_var = "size_corr_growth")
The output of the above code is a list containing 4 elements. The first element
mod
is the model object. The second element obs_pred
is a data frame of
observed and model predicted growth for each tree. The third element is the
R^2 value of the model. The fourth element is the estimated coefficients.
# R^2 value basic_growth_mod$R_squared # Observations and model predictions basic_growth_mod$obs_pred
The evaluate the robustness of the stochastic model fitting process there is an
optional argument (iterations
) that specifies how many times the model should
be fit. When this argument is specified, the model object, r-squared value and
table of observations and predictions refer to the model that had the lowest
mean square error. However, the mod_coef
element will contain the fitted
coefficients for all fitted models.
set.seed(1000) iter_growth_mod <- growth_model(training, outcome_var = "size_corr_growth", iterations = 10) iter_growth_mod$R_squared
The growth_model()
function can also handle rare competitor tree species.
The optional argument rare_comps
specifies the minimum number of interactions
a competitor species must be a part of to be considered a common competitor. Any
species with fewer interactions will be grouped together under the new species
identity of "RARE"
. If species-specific densities are included as covariates
in the training data, a "RARE_density"
variable can also be created by
indicating the suffix attached to density variables under the argument
density_suffix
.
set.seed(1000) rare_growth_mod <- growth_model(training, outcome_var = "size_corr_growth", rare_comps = 100, density_suffix = "_angle_sum") # Check for RARE in coefficients names(rare_growth_mod$mod_coef)
Finally, the growth_model()
function can evaluate the ability of the
fitted model to predict the growth of trees in a test set i.e. trees not used in
the model fitting process. To do this, a test set must be provided under the
optional argument test
. Predictive ability is quantified with an R^2 value
that constitutes a new element in the output list.
set.seed(1000) test_growth_mod <- growth_model(training, outcome_var = "size_corr_growth", test = test) # Ability to predict test data test_growth_mod$test_R_squared
mortality_model()
: Create a neighborhood tree mortality modelThe data formatting for a neighborhood mortality model is very similar to that for a neighborhood growth model. First we can calculate our neighborhood summaries and remove trees whose neighborhood overlaps the stand boundary:
# Construct neighborhoods for all trees in all stands nbhds <- neighborhoods(mapping, radius = 10) # Describe neighborhoods using angular species-specific densities nbhd_summ <- neighborhood_summary(nbhds, id_column = "tree_id", radius = 10, densities = "angular") # Combine neighborhoods with their summaries nbhds <- nbhds %>% left_join(nbhd_summ %>% select(-c(shannon, simpson, evenness)), by = "tree_id") # Define neighborhood radius nb_rad <- 10 # Remove trees whose neighborhood overlaps a stand boundary nbhds <- nbhds %>% filter(x_coord >= nb_rad & x_coord <= 100 - nb_rad & y_coord >= nb_rad & y_coord <= 100 - nb_rad)
Next we extract the mortality data for each tree from our tree measurement data frame and join it to our neighborhoods data:
mortality <- tree %>% group_by(tree_id) %>% summarize(mort = max(mort)) nbhds <- nbhds %>% inner_join(mortality, by = "tree_id")
Then we add the stand abiotic data and remove the columns we don't want to include as model covariates:
nbhds <- nbhds %>% left_join(stand_abiotic, by = "stand_id") nbhds <- nbhds %>% select(-c(stand_id, dbh, abh, x_coord, y_coord, id_comp, abh_comp))
Now we can subset to a single focal species:
one_species <- nbhds %>% filter(species == "ABAM") %>% select(-species)
For this example we will not use test set validation, so we can now fit the mortality model:
set.seed(1000) basic_mort_mod <- mortality_model(one_species, outcome_var = "mort")
Here is the model output:
# R^2 value basic_mort_mod$R_squared # Fitted coefficients t(basic_mort_mod$mod_coef)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.