Plot design optimization"

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

The function metrics.variables used for the calculation of stand-level variables and metrics (see vignette "Stand-level") requires arguments specifying the plot designs and sizes. If the optimal plot design and size for the calculation of stand-level variables is not known, the optimal plots design for the corresponding TLS data can be determined by two different approaches implemented in FORTLS. The approaches depend on whether field data for the sample plots is available or not.

dir.data <- system.file("data", package="FORTLS")
setwd(dir.data)
load("Rioja.data.RData")
load("Rioja.simulations.RData")
tree.tls <- Rioja.data$tree.tls
tree.field <- Rioja.data$tree.field
dir.data <- system.file("exdata", package="FORTLS")
library(FORTLS)

Estimating optimal plot size without field data

If no field data is available, the function estimation.plot.size can be applied to determine the optimal plot design and size. This function uses the data frame containing the list of detected trees (introduced in tree.tls) and estimates stand-level density ($N$, trees/ha) and basal area ($G$, m$^2$/ha) for many simulated differently-sized plots and the three plot designs (circular fixed area, k-tree and angle-count) by increasing continuously their sizes.

Thus, circular fixed area plots with increasing radius (increment of 0.1 m) to the maximum radius defined by radius.max in plot.parameters (by default set to 25, if radius is larger than furthest tree, the horizontal distance to this furthest tree is considered as maximum radius) will be simulated and for each plot, the variables (N and G) are estimated. Similarly, k-tree plots with tree numbers ($k$) ranging from 1 to k.max (specified in plot.parameters, default value set to 50 or total number of trees in the plot) and angle-count plots with increasing basal area factor (BAF, increments of 0.1 m$^2$/ha) to the maximum value specified by BAF.max in plot.parameters (set to 4 by default) are simulated and the respective stand-level variables are calculated. Optionally, the minimum diameter at breast height (dbh.min, in cm) to include the trees in the estimations can be defined. By default the minimum $dbh$ is set to 4 cm.

The function generates size-estimation charts i.e., plots showing the estimated stand-level density ($N$) and basal area ($G$) on the $y$ axes respective to the different plot sizes ($x$ axes). The estimations will be performed for simulated plots corresponding to all sample plots. By default the output graphs will contain one line for each sample plot. When average is set to TRUE, the average of all estimations (for all plots) as a continuous line and the standard deviation as grey shaded area will be drawn instead of multiple lines for each sample plot. One chart for each plot design is drawn by default. If all.plot.designs is set to TRUE, the line charts of all three plot design will be drawn in one graph with different colours for each plot design.

estimation.plot.size(tree.tls = tree.tls,
                     plot.parameters = data.frame(radius.max = 25, k.max = 50, BAF.max = 4),
                     dbh.min = 4,
                     average = TRUE, all.plot.designs = FALSE)

The continuous line represents the average over all sample plots (i.e. 16 plots in the example shown here) of the estimated density ($N$) on the left and the basal area ($G$) on the right. The dotted line indicates the number of sample plots. This figure helps to find suitable plot designs for the calculation of stand-level metrics and variables. The optimal plot design and size should be chosen within a range where the estimated values for $N$ and $G$ reach a stable level. A too small plot leads to high errors of estimation, since only few trees enter the plot and therefore the sample is too small. In the example above, the basal area estimated for fixed area plots with radius smaller than 5 m is much higher (around 40-50 m$^2$/ha) than the true value (around 20 m$^2$/ha). On the other hand, too large plots come along with systematic errors due to occlusion of trees. Therefore, the basal area in the same example of fixed area plots with radius bigger than 20 m is estimated lower than the true value. In order to avoid both types of errors, the figure helps to find a plot size range with stable values.

Validation with field data and optimizing plot design

When data from field measurements are available for the same sample plots, the TLS-based estimates can be validated and the optimal plot designs can be found applying functions implemented in FORTLS. In the first step of the optimization process, the function simulations simulates plots with incremental size and computes the corresponding stand-level metrics and variables (similar to the function metrics.variables, see "Stand-level" vignette). Based on the simulated data, two different processes can be performed. First, the bias between TLS data and field data for each individual estimated variable can be assessed with the function relative.bias. Second, correlations between all estimated variables and metrics based on TLS-data (output data of the simulations function) and the variables estimated from field data can be calculated with the correlations function. This function calculates both the Pearson and Spearman correlation coefficients. To visualize the correlation coefficients, heat maps can be drawn with the optimize.plot.design function.

Plot simultaion and estimation of metrics and variables

The simulations function is applied as follows:

simulations <- simulations(tree.tls = tree.tls, tree.ds = tree.ds, tree.field = tree.field,
            plot.design = c("fixed.area", "k.tree", "angle.count"),
            plot.parameters = data.frame(radius.max = 25, k.max = 50, BAF.max = 4),
            scan.approach = "single", var.metr = list(tls = NULL, field = NULL),
            dbh.min = 4, h.min = 1.3, max.dist = Inf,
            dir.data = dir.data, save.result = FALSE, dir.result = NULL)

The input data frames

Both TLS and field data from the same sample plots are required to compute the function. The TLS data introduced in tree.tls should have the same format as the data frame returned from the tree.detection.single.scan and tree.detection.multi.scan functions. Thus, each row must correspond to a detected tree and it must contain at least the following columns: id, file, tree, x, y, phi.left, phi.right, horizontal.distance, dbh, num.points, num.points.hom, num.points.est, num.points.hom.est and partial.occlusion. The data frame containing the field data must be inserted in the argument tree.field. Similar to the TLS data table, each row must correspond to a tree (specified in the columns id and tree) and the values for horizontal.distance, dbh, total.height and an integer value indicating whether the tree is dead (1) or alive (NA, specified in dead) must be included in the data frame.

When the distance sampling method for correction of occlusion effects was applied (function distance.sample, see "Stand-level" vignette), a list with the results in the output data frames from the aforesaid function can be introduced in tree.ds. The list must contain at least the data frame tree with the detection probabilities (P.hn, P.hn.cov, P.hr and P.hr.cov) for each tree. By default tree.ds is set to NULL and as a result, the calculations of the variables based on occlusion correction will not be performed.

Specifying designs of simulated plots

A vector containing the names of the plot designs can specify the plot designs ("fixed.area", "k.tree", "angle.count") that are to be considered for the simulations in plot.design. By default, this argument is set to NULL and all three plots designs will be considered.

Furthermore the argument plot.parameters allows for manually specifying the design of the simulated plots. Many differently-sized plots of the plot designs specified in plot.design are simulated. The list introduced in plot.parameters can include the following elements to customize the generated plots. The elements radius.max, k.tree.max and BAF.max define the maximum radius (in m), the maximum number of trees and the maximum BAF (in m$^2$/ha) respectively to which the sizes of circular fixed area, k-tree and angle-count plots respectively should increase. By default the values are set to radius.max = 25, k.tree.max = 50 and BAF.max = 4. The increment by which the sizes of circular fixed area and angle-count plots sequentially increase can also be customized by specifying the elements radius.increment and BAF.increment respectively. The default settings are radius.increment = 0.1 (in m) and BAF.increment = 0.1 (in m$^2$/ha). An additional element of the list can be num.trees defining the number of dominant trees per hectare (trees/ha). This value is needed for the calculation of dominant diamters and heights and is set to 100 trees/ha by default.

Further adjustable arguments

Similar to the other functions, the scan approach can be specified in scan.approach and is by default set to "multi". Metrics and variables of interest can be defined as a vector in var.metr. Thus, only those metrics and variables named in the vector are calculated. If not specified, var.metr is set to NULL and all possible metrics and variables are computed. The arguments dbh.min, h.min and max.dist can optionally define the minimum $dbh$, height and maximum distance of a tree to be included in the calculations. The default values are dbh.min = 4 (in cm), h.min = 1.3 (in m) and max.dist = NULL (no maximal distance is considered).

The argument dir.data should specify the working directory of the .txt files with the normalized reduced point clouds (output of normalize function). If not specified, it is set to NULL and the current working directory is assigned to it. The output files will be saved by default, since the argument save.result is set to TRUE, to the directory path indicated in dir.result. For each plot design (circular fixed area, k-tree and angle-count plots), a .csv file is created using the write.csv function from the utils package.

Output of the simulations function

The simulations function generates a list with one element for each plot design. The elements are data frames containing the simulated plot. Each row represents a simulated plot defined by their respective plot identification number id and their size determined by either radius, k or BAF depending on the plot design. The columns N, G, V, V.user, W.user, d, dg, dgeom, dharm, h, hg, hgeom, hharm, d.0, dg.0, dgeom.0, dharm.0, h.0, hg.0, hgeom.0 and hharm.0 display the stand-level variables based on the field data. The remaining columns contain the stand-level variables and metrics estimated for each plot based on the TLS data. As an example the data frame for circular fixed area plots is shown below.

head(simulations$fixed.area)
kableExtra::scroll_box(kable_input = kableExtra::kable(head(Rioja.simulations$fixed.area), 
                                                       format = "html"), width = "100%")

As described above, plots with increasing sizes are estimated and the corresponding variables an metrics are calculated. The plots are ordered from the smallest to the biggest. Plots with small radius can not be simulated for all sample plots (see table above), since trees not always enter, because the plots are too small. But plots with e.g. a radius of 20 m, can be simulated for all sample plots (see end of table below).

tail(simulations$fixed.area)
kableExtra::scroll_box(kable_input = kableExtra::kable(tail(Rioja.simulations$fixed.area), 
                                                       format = "html"), width = "100%")

Calculation of relative bias

In order to estimate the goodness of the TLS-based estimation of the variables, the function relative.bias computes the relative bias between the variables estimated from field data and their respective TLS-based estimates. The relative bias is calculated for each sample plot and each simulated plot (i.e. different plot sizes and designs). Therefore, the input data for this function (introduced in simulations) must be a list of data frames containing the estimated variables (based on field and TLS data) for all the simulated plots. Thus, a similar list to the output of the simulations function (see above) is required, that has the same description and format.

Optionally, the variables for which the relative bias will be computed can be specified in a vector in variables. Only, the names of the field data based estimates can be introduced. If not otherwise specified, the argument will be set to c("N", "G", "V", "d", "dg", "d.0", "h", "h.0") by default. Other possible variables are dgeom, dharm, dg.0, dgeom.0, dharm.0, hg, hgeom, hharm, hg.0, hgeom.0 or hharm.0.

The arguments save.result and dir.result define whether and to which directory the output files should be saved. Two different output files are generated. First, the data frames for each plot design (as shown below for circular fixed areas) are saved as .csv files using the write.csv function from the utils package. Second, interactive line charts representing the relative biases are saved as .html files by means of the saveWidget function in the htmlwidgets package. An example of these interactive line charts is provided below.

bias <- relative.bias(simulations = Rioja.simulations,
              variables = c("N", "G", "d", "dg", "d.0", "h", "h.0"),
              save.result = FALSE, dir.result = NULL)

The function calculates the relative bias between the field data estimates (specified in variables) and the counterpart variables that are estimated based on TLS data. The TLS counterparts for the density (N) are the variables N.tls, N.hn, N.hr, N.hn.cov, N.hr.cov and N.sh for circular fixed area and k-tree plots, and N.tls and N.pam for angle-count plots. The same pattern applies to the basal area (G) and the volume (V) where the corresponding TLS-based estimates are G.tls, G.hn, G.hr, G.hn.cov, G.hr.cov, G.sh and G.pam, and V.tls, V.hn, V.hr, V.hn.cov, V.hr.cov, V.sh and V.pam respectively. In case of mean and dominant diameters (d, dg, dgeom, dharm, d.0, dg.0, dgeom.0, and dharm.0) and heights (h, hg, hgeom, hharm, h.0, hg.0, hgeom.0 and hharm.0), for all three plot designs their respective counterpart variables are d.tls, dg.tls, dgeom.tls, dharm.tls, d.0.tls, dg.0.tls, dgeom.0.tls and dharm.0.tls (for the diameter), and h.tls, hg.tls, hgeom.tls, hharm.tls, h.0.tls, hg.0.tls, hgeom.0.tls, hharm.0.tls and in addition P99 (for the height). The relative bias are calculated as follows

$$ \frac{\frac{1}{n} \sum_{i=1}^{n}y_i - \frac{1}{n}\sum_{i=1}^{n}x_i}{\sum_{i=1}^{n}x_i} $$

where $x_i$ is the value of the field estimate and $y_i$ the value of its TLS counterpart corresponding to plot $i$ of $n$ sample plots. For each plot size defined by the radius, k or BAF, the biases are calculated and stored as a data frame (shown below). Each row represents the a simulated plot of a certain size (here defined by radius) and the columns contain the calculate bias between the variables indicated in the column names. The two compared variables are joint with . as separation, e.g. N.N.tls means that the bias between N and N.tls was calculated.

head(bias$fixed.area)
kableExtra::scroll_box(kable_input = kableExtra::kable(head(bias$fixed.area), 
                                                       format = "html"), width = "100%")

For better visualization, line charts are created that show the relative bias of a given variable and plot design. As an example, the interactive graphic showing the relative bias of basal area estimations (G) for fixed are plots can be seen when following the link (RB.G.fixed.area.html).

Functions facilitating model-based or model-assisted sampling approaches

To facilitate the application of model-based sampling, two additional functions are included in the FORTLS package. The function correlations computes the correlations between variables estimated from field data and those estimated from TLS data and calculates the respective Pearson and Spearman correlation coefficients. The results are saved as .csv files and represented as line charts and heat maps (when applying the function optimize.plot.design).

Computing correlations

The correlations function computes the correlations for all the plot designs that are introduces as elements of a list in simulations. The format and description must be the same as the output list of the simulations function. Also similar to the function relative.bias, the variables for which the correlations are to be calculated can be specified in variables. By default, this argument is set to variables = c("N", "G", "V", "d", "dg", "d.0", "h", "h.0"). If only one of the two above-mentioned correlation measures should be calculated, it can be specified in method. This argument is set to method = c("pearson", "spearman") by default and both correlation coefficients are computed.

fixed.area.simulations <- list(fixed.area = Rioja.simulations$fixed.area[Rioja.simulations$fixed.area$radius < 7.5, ])
cor <- correlations(simulations = fixed.area.simulations,
             variables = c("N", "G", "d", "dg", "d.0", "h", "h.0"),
             method = c("pearson", "spearman"), 
             save.result = FALSE, dir.result = NULL)

In addition to the calculation of the correlation measures, the function also performs tests of association and returns the p-values. The output of this function is a list containing the following three elements correlations, correlations.pval and opt.correlations. Each of them is a list including, if not otherwise specified (in method), two elements pearson and spearman. These two elements are lists again that include separate data frames for each plot design (circular fixed area, k-tree and angle-count plots). The data frames contain the corresponding correlation coefficients (in correlations), the calculated p-values (in correlations.pval) and the optimal correlations for a given plot size and field data estimate (in opt.correlations).

cor$correlations$pearson$fixed.area[20:26,1:15]
kableExtra::scroll_box(kable_input = kableExtra::kable(
  cor$correlations$pearson$fixed.area[20:26,1:15], format = "html"), width = "100%")

All mentioned data frames are divided into rows each of which represent a given plot size defined by radius (for circular fixed area plots), k (for k-tree plots) or BAF (for angle-count plots). The columns of the data frames in correlations (shown above) and correlations.pval (shown below) contain the calculated coefficients and p-values respectively for the corresponding correlation. The column names are composed of the two variables (e.g. N and N.tls) separated by . (giving N.N.tls) that were correlated as described for the relative.bias function.

cor$correlations.pval$pearson$fixed.area[20:26,1:15]
kableExtra::scroll_box(kable_input = kableExtra::kable(
  cor$correlations.pval$pearson$fixed.area[20:26,1:15], format = "html"), width = "100%")

The data frames of the opt.correlations list (example shown below) are also divided into rows that represent the different plot sizes. For a given plot size and variable (specified in the argument variables), the best correlating TLS-based estimate and the corresponding correlation coefficient is displayed in this table. The columns named <variable>.metric (with <variable> being here N, G, d, dg, d.0, h and h.0) contain the TLS-based variable or metric that yielded the best correlation with the respective field data-based variable of the column name for a certain plot radius. The columns <variable>.cor display the measures of the respective correlations. That means, in the example shown here, the TLS-based estimate that yielded the best correlation with the field data based variable density (N) for circular fixed area plots with a radius of 4.4 m is ID.rho. And the correlation coefficient is 0.7286.

cor$opt.correlations$pearson$fixed.area[20:26,]
kableExtra::scroll_box(kable_input = kableExtra::kable(
  cor$opt.correlations$pearson$fixed.area[20:26,], format = "html"), width = "100%")

The correlations functions creates different files and saves them (if save.result = TRUE, default setting) to the directory indicated in dir.result. These files are, on the one hand, .csv files of the data frames in the lists correlations and opt.correlations created by means of the write.csv function from the utils package. These .csv files will be named as correlations.<plot design>.<method>.csv and opt.correlations.<plot design>.plot.<method>.csv with <plot design> being fixed.area.plot, k.tree.plot or angle.count.plot and <method> being pearson or spearman. On the other hand, interactive line charts representing the correlation coefficients will be created for each variable (selected in variables) as .html file using the saveWidget function in the htmlwidgets package. As an example, the interactive line chart for the variable height (h) and fixed area plots (pearson measure) is shown (correlations.h.fixed.area.pearson.html).

Visualizing correlations

In order to visualize the optimal correlations, the function optimize.plot.design creates heat maps and is applied as follows:

optimize.plot.design(correlations = cor$opt.correlations,
                     variables = c("N", "G", "d", "dg", "d.0", "h", "h.0"),
                     dir.result = NULL)

The function creates the heat maps based on the optimal correlation list (opt.correlations from the output of the correlations function) introduced in correlations. The introduced list must have the same format and description as the opt.correlation list. Similar to the other functions described above, the variables of interest can be selected in variables (default setting: variables = c("N", "G", "V", "d", "dg", "d.0", "h", "h.0")). This function generates interactive heat maps with the saveWidget function of the htmlwidgets package and saves these graphics to the directory indicated in dir.result (or by default to the working directory). For each plot design and correlation measure a plot is generated and named as opt.correlations.<plot design>.<method>.html where <plot design> can be fixed.area.plot, k.tree.plot or angle.count.plot and <method> either pearson or spearman according to the plot design and correlation measure.

As an example, the heat map of the pearson correlation coefficient for fixed area plots and pearson measure is provided and can be seen when opening the link (opt.correlations.fixed.area.pearson.html).



Try the FORTLS package in your browser

Any scripts or data that you put into this service are public.

FORTLS documentation built on Sept. 11, 2023, 5:09 p.m.