knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This vignette presents the sampling and ABA modeling approaches that were used for the enhanced forest inventory (EFI) pilot project at Romeo Malette Forest (RMF) using Leica SPL100 lidar (SPL) data collected wall-to-wall during the summer of 2018. The package (and this vignette) are written for the RMNF project but the intent is to make the overall approach applicable to other forest management units.
First, we will perform a structurally-guided stratified sampling to select locations for new plots establishment. Then, we will detail how plot-level forest attribute data can be calculated or modeled from published allometric equations. Finally, we will use the field inventory data and the SPL point cloud to build random forest predictive models and generate wall-to-wall maps of forest attributes.
We start by loading the packages that we are going to use
library(RMFinventory) library(raster) library(sf) library(dplyr) library(RStoolbox) library(ggplot2)
For this section we assume that wall-to-wall lidar metrics are available:
input <- "G:/ROMEO/SPL_processing/2_Metrics/2_mosaic" flist <- list.files(input, pattern = "tif$", full.names = TRUE) test <- stack(flist) wall_metrics <- brick(system.file("extdata","wall_metrics_small.tif", package = "RMFinventory")) names(wall_metrics) <- c("avg", "cov", "std","p10", "p20","p50","p70","p95", "p99","d0","d2","d4","dns") plot(wall_metrics$avg) #Coordinate system of wall-to-wal raster wall_crs <- raster::crs(wall_metrics)
We will only select cells that are suitable for plot establishment using the followqign process:
poly <- st_read(system.file("extdata/inventory_polygons","inventory_polygons.shp", package = "RMFinventory")) #Match CRS of wall-to-wall raster layer poly <- st_transform(poly, wall_crs@projargs) plot(poly["POLYTYPE"], axes = TRUE) plot(poly["OWNER"], axes = TRUE) poly_subset <- poly[poly$POLYTYPE == "FOR" & poly$OWNER == 1, ] plot(st_geometry(poly_subset), axes = TRUE, col = "red") poly_subset <- st_union(poly_subset) plot(st_geometry(poly_subset), axes = TRUE, col = "red") #Transform to SpatialPolygonDataFrame object for compatibility with the raster package poly_subset <- sf::as_Spatial(poly_subset) #Mask wall-to-wall layer wall_poly <- raster::mask(wall_metrics, mask = poly_subset) plot(wall_poly$avg)
# Open roads layer, select suitable road types and create a buffer roads <- st_read(system.file("extdata/roads","roads.shp", package = "RMFinventory")) roads <- st_transform(roads, wall_crs@projargs) plot(roads["RDTYPE"]) # Select only suitable road types # Highway (H), Municipal (M), Primary (P), Branch (B), Clay/mineral surface haul (C) and roads accessed when dry or frozen (d) , graveled (g), industrial grade road (i), highway or municipal road(r), yearly accessible (y) roads_subset <- roads[roads$RDTYPE %in% c("H", "M", "P", "B", "C") & roads$RDMOD %in% c("d", "g", "i", "r", "y"), ] # Dissolve roads layer to work with only 1 feature roads_subset <- st_union(roads_subset) plot(st_geometry(roads_subset)) #Create buffers of 30 m and 200 m roads_buffer_30m <- st_buffer(roads_subset, dist = 30) roads_buffer_200m <- st_buffer(roads_subset, dist = 200) #Take the symetrical difference between buffers to keep only roads >- 30 m AND <= 200 m roads_buffer <- st_sym_difference(roads_buffer_200m, roads_buffer_30m) plot(st_geometry(roads_buffer)) #Transform to SpatialPolygonDataFrame object for compatibility with the raster package roads_buffer <- sf::as_Spatial(roads_buffer) #Mask wall-to-wall layer wall_poly_roads <- raster::mask(wall_poly, mask = roads_buffer) plot(wall_poly_roads$avg)
In this example we only mask by inventory polygons and distance to road but other layers can be considered (such as past or planned depletions). The same approach as the one described above can be used.
When generating forest inventory attributes using an area-based-approach (ABA), the ground plots data used to calibrate ABA regression models need to be representative as much as possible of the full range of forest structure variability within the study area. If this is not the case, regression models might perform poorly in underrepresented forest types (White et al., 2013). Lidar metrics such as height percentiles, cover or height variability can be used to design a sampling network driven by forest structure.
Principal Component Analysis (PCA) is a method used to summarize the variability of a large number of highly correlated LiDAR structural metrics into a smaller number of uncorrelated variables. The feature space created by the generated principal components can then be stratified into classes that will represent specific types of forest structural conditions. Random sampling can then be performed within each of these classes to ensure a representative characterization of all forest structures occurring across the study area.
We assume that a set of lidar metrics have been calculated across the entire study area and that candidate cells for new plots establishment have been determined using criteria such as distance and access to roads, forested land cover etc.
Note that the resolution of the lidar metrics maps is determined based on the size of the plots that will be established. It is important the the area of each cell (pixel) corresponds to the area of the plots. For fixed-radius circular plots of 11.28 m radius, corresponding to 400 m2, the resolution should be 20 m x 20 m.
In the example of RMF, we selected only cells located between 30 m and 200 m from roads and forest inventory polygons determined from photo interpretation, were used to select only cells considered to be forested.
# Plot wall-to-wall plot(wall_metrics$p95, zlim = c(0, 30), main = "Wall to wall") candidate_metrics <- wall_poly_roads #From previous processing names(candidate_metrics) <- c("avg", "cov", "std","p10", "p20","p50","p70","p95", "p99","d0","d2","d4","dns") # Plot candidate metrics plot(candidate_metrics$p95, zlim = c(0, 30), main = "Candidates")
We calculate PCA values from the lidar metrics located in forested cells using the rasterPCA
function from the RStoolbox
package.
poly <- st_read("E:/ROMEO/RMF_Pilot_Data/Scripts/RMFinventory/inst/extdata/inventory_polygons/inventory_polygons.shp") #Match CRS of wall-to-wall raster layer poly <- st_transform(poly, wall_crs@projargs) #Select FOR polygons and transform to sp poly_FOR <- poly[poly$POLYTYPE == "FOR", ] wall_metrics_FOR <- mask(wall_metrics, poly_FOR) PCA <- RStoolbox::rasterPCA(wall_metrics_FOR, nComp = 2, spca = TRUE, maskCheck = FALSE) # nComp = 2, we return the two first principal components # spca = TRUE, since metrics have different ranges of values the fucntion will center and scale the metrics # maskCheck = FALSE, we don't check if some pixels have NA value in one or more layers. If not sure, set to TRUE # The output of rasterPCA is a list with an element model which contains the PCA model and an element map which contains the map of PCA values PCA_wall <- PCA$map plot(PCA_wall) PCA_model <- PCA$model
We can check the porportion of variance contained in each principal component:
summary(PCA_model)
A network of 182 plots was already established in RMF. The objective of the structural guided sampling was to check if the existing plot network was covering the entire range of structural variability and if not, selecting new plots in underrepresented forest types.
We load the existing set of plots. All the ALS metrics that were calculated to make the PCA model have also been calculated at the plot-level.
Note: Make sure that the name of metrics used to get the PCA model correspond to the name of the plot-level metrics
We can use the predict
function to get PCA values of the existing set of plots and off the candidate cells
# Plot-level plots_metrics <- st_read(system.file("extdata", "plots_metrics.shp", package = "RMFinventory")) plots_metrics_coords <- st_coordinates(plots_metrics) plots_metrics_df <- plots_metrics %>% st_drop_geometry() plots_PCA <- as.data.frame(predict(PCA_model, plots_metrics_df))[,c(1,2)] colnames(plots_PCA) <- c("PC1", "PC2")
#Candidate cells # This is the sricpt that would be ran. Since candidates are just a subset of the wall to wall cells, we could also just mask the PCA_wall layer cells tha are not part of the candidate cells. PCA_candidates <- raster::predict(candidate_metrics, model = PCA_model, index = c(1,2), filename = "path/to/file.tif")
PCA_candidates <- raster::predict(object = candidate_metrics, model = PCA_model,index = c(1,2)) names(PCA_candidates) <- c("PC1", "PC2") #Also possible to mask wall to wall PCA using a candidate metrics layer #PCA_candidates <- raster::mask(PCA_wall, candidate_metrics$avg)
Once PCA values of all forested cells, candidate cells and existing plots have been calculated, it is useful to plot them to visualize their distribution.
# In the following steps we get raster cells values in a data.frame # With large raster object it can creates very large objects in memory df_PCA_wall <- as.data.frame(PCA_wall, na.rm = TRUE, xy = FALSE) df_PCA_candidates <- as.data.frame(PCA_candidates, na.rm = TRUE, xy = FALSE) ## Function to calculate point density in scatterplot get_density <- function(x, y, ...) { dens <- MASS::kde2d(x, y, ...) ix <- findInterval(x, dens$x) iy <- findInterval(y, dens$y) ii <- cbind(ix, iy) return(dens$z[ii]) } #Calculate point density for scatterplot visualization df_PCA_candidates <- df_PCA_candidates %>% mutate(dens = get_density(PC1, PC2, n = 300)) # Get convex hulls hulls_wall_idx <- chull(df_PCA_wall$PC1, df_PCA_wall$PC2) hulls_wall <- dplyr::slice(df_PCA_wall[,c("PC1","PC2")],hulls_wall_idx) hulls_cand_idx <- chull(df_PCA_candidates$PC1, df_PCA_candidates$PC2) hulls_cand <- dplyr::slice(df_PCA_candidates[,c("PC1","PC2")],hulls_cand_idx) ggplot(mapping = aes(x = PC1, y = PC2)) + geom_polygon(data = hulls_wall, colour = "#440154FF", fill ="#440154FF", alpha = 0.2) + geom_polygon(data = hulls_cand, colour = "orange", fill = NA) + geom_point(data = df_PCA_candidates, aes(color = dens)) + scale_color_viridis_c() + geom_point(data = plots_PCA) + theme_bw() + coord_equal() + theme(panel.grid = element_blank(), legend.position = "right") #Function to calculate point density in scatterplot get_density <- function(x, y, ...) { dens <- MASS::kde2d(x, y, ...) ix <- findInterval(x, dens$x) iy <- findInterval(y, dens$y) ii <- cbind(ix, iy) return(dens$z[ii]) }
Once PCA values have been calculated we can stratify the PC1 / PC2 feature space. The approach chosen for the RMF was to stratify using equal intervals.
From the previous plot, we can decide to stratify the feature space into 7 equal interval on the PC1 axis and 5 intervals on the PC2 axis (7 x 5 grid). We can use the function getPCAstrata
to automatically create the stratification:
# Determine the number of breaks for the first and second features (PC1 and PC2) strata <- RMFinventory::getPCAstrata(PCA_layer = PCA_candidates, nbreaks = c(8, 6), #Since we want a 7 x 5 grid we need 8 x 6 breaks summary = TRUE) #getPCAstrata returns a list of three objects strata_candidates <- strata$strata_layer strata_candidates plot(strata_candidates) breaks <- strata$breaks breaks strata$summary strata$matrix
We can use the breaks returned by the getPCAstrata
function to get the strata of the existing set of plots:
strata <- getPCAstrata(PCA_layer = plots_PCA, breaks = breaks, # From the precedent call to getPCAstrata summary = TRUE) strata_plots <- strata$strata_layer strata_plots strata$summary
Make a new plots with break lines between strata
ggplot(mapping = aes(x = PC1, y = PC2)) + geom_polygon(data = hulls_wall, colour = "#440154FF", fill ="#440154FF", alpha = 0.2) + geom_polygon(data = hulls_cand, colour = "orange", fill = NA) + geom_point(data = df_PCA_candidates, aes(color = dens)) + scale_color_viridis_c() + geom_point(data = plots_PCA, color = "red") + geom_vline(xintercept = breaks$PC1, linetype = "dashed") + geom_hline(yintercept = breaks$PC2, linetype = "dashed") + theme_bw() + coord_equal() + theme(panel.grid = element_blank(), legend.position = "right")
The function sampleCells
that performs the sampling requires the number of cells to sample for each strata. This can be determined based on the previous feature space plots, total number of plots than can be selected, number of existing plots, how many of them should be re-measured etc.
The following figure illustrates the sampling process:
knitr::include_graphics(system.file("extdata","figures","Flow_RMF_sampling.png", package = "RMFinventory"))
toSample <- read.csv(system.file("extdata","cells_to_sample.csv", package = "RMFinventory"), stringsAsFactors = F) toSample existing_plots <- data.frame(plotID = plots_metrics$plotID, x = plots_metrics_coords[,1], y = plots_metrics_coords[,2], strata = strata_plots) existing_plots new_plots <- RMFinventory::sampleCells(strata_layer = strata_candidates, matrix_strata = strata$matrix, existing_sample = existing_plots, # You can provide a set of existing plots or output of previous call to sampleCells and these cells won't be sampled again. toSample = toSample, mindist = 150, message = T) # There might be a warning like this: #no non-missing arguments to min; returning Infno non-missing arguments to max; returning -Inf # If think safe to ignore but need to look more into that new_plots #Assign plotID to sampled plots new_plots <- new_plots %>% mutate(plotID = ifelse(type == "New", paste0("S_", seq_len(n())), plotID)) #Convert to sf object new_plots_sf <- st_as_sf(new_plots, coords = c("x", "y")) plot(strata_candidates) plot(st_geometry(new_plots_sf), add = T)
#Get metrics and PCA of new plots new_plots_metrics <- extract(wall_metrics, new_plots_sf) #could also get from point cloud directly new_plots_PCA <- as.data.frame(predict(PCA_model, new_plots_metrics))[,c(1,2)] colnames(new_plots_PCA) <- c("PC1", "PC2") ggplot(mapping = aes(x = PC1, y = PC2)) + geom_polygon(data = hulls_wall, colour = "#440154FF", fill ="#440154FF", alpha = 0.2) + geom_polygon(data = hulls_cand, colour = "orange", fill = NA) + geom_point(data = df_PCA_candidates, aes(color = dens)) + scale_color_viridis_c() + geom_point(data = new_plots_PCA, color = "red") + geom_vline(xintercept = breaks$PC1, linetype = "dashed") + geom_hline(yintercept = breaks$PC2, linetype = "dashed") + theme_bw() + coord_equal() + theme(panel.grid = element_blank(), legend.position = "right")
To be completed
Plot-level ALS metrics need to be calculated in order to build predictive models of forest attributes. These task can be performed in various softwares/packages. Here, we present two of them: LAStools and the lidR R package.
The lidR package offers more flexibility to calculate user-defined metrics. At the plot-level, computing time difference between lidR and LAStools is not significant. However, ALS metrics also need to be calculated wall-to-wall in order to generate forest attribute maps once predictive models have been developed on plot data. Over larger areas, LAStools will generally be more efficient than lidR.
In the end, the choice between lidR and LAStools is a trade-off between flexibility in ALS metrics and computational efficiency when calculating wall-to-wall maps.
The functionalities of the lidR package are well described in the following wiki: https://github.com/Jean-Romain/lidR/wiki.
The Area-based approach from A to Z is a great resource to have a look at for ABA.
lastiles <- lidR::readLAScatalog("path/to/ALS tiles") plots <- st_read("path/to/plots_coordinates.shp") opt_output_files(lastiles) <- "path/to/output/plot_{PlotID}" # Here plotID is a field of plots_coordinates.shp plot_radius = 11.28 plots_las <- lasclip(lastiles, plots, radius = plot_radius)
To be completed
To be completed
To be completed
It is possible to execute operative system commands through R using the system
function by passing it a character containing the command.
files <- "path/to/lasfiles" odir <- "path/to/output/file.csv" lastools_cmd <- paste0("lascanopy -i ", files, "*.las -files_are_plots -height_cutoff 1.3 -cover_cutoff 2 -p 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 99 -avg -cov -dns -std -kur -ske -d 2 5 10 15 20 25 -names -o ", odir) system(lastools_cmd)
The makeModels
function implemented in the package builds random forest predictive models for each forest attribute and assess their accuracy using a k-fold cross-validation.
# Open plot-level ALS metrics als_metrics <- read.csv(system.file("extdata","plot_metrics.csv", package = "RMFinventory")) head(als_metrics) toKeep <- colnames(als_metrics)[colnames(als_metrics) %in% names(wall_metrics)] als_metrics <- als_metrics[,c("plotID", toKeep)] pred_names <- colnames(als_metrics)[!colnames(als_metrics) %in% "plotID"] pred_names # Open plot-level forest attributes plot_data <- read.csv(system.file("extdata","tblPlotLevelSummary.csv", package = "RMFinventory")) head(plot_data) # Combine field data and ALS metrics combine_dat <- merge(als_metrics, plot_data, by = "plotID") # Attributes to model attNames <- c("top_height", "ba_ha", "V_ha") # Title to add to plots (optional) titles <- list(ba_ha = expression("Basal area (m"^"2"*"/ha)"), top_height = "Top height (m)", V_ha = expression("Whole stem volume (m"^"3"*"/ha)")) # Run modeling: k-fold cross validation, random forest regression models and accuracy assessment rfList <- RMFinventory::makeModels(dat = combine_dat, attNames = attNames, preds = pred_names, k = 5, titles = titles, saveModel = F, saveFigure = F) rfList$accuracy$`all folds` rfList$accuracy$summary # The accuracy measures reported on the plot are currently not extracted from rfList$accuracy$summary as they should be. They are calculated based on all plotted points. # The mean (and std) of accuracy measures of the cross-validation are better indicators of the "true" accuracy assessment. rfList$top_height$plot rfList$ba_ha$plot rfList$V_ha$plot
# Wall to wall metrics (already opened at the beginnign of the vignette) wall_metrics <- brick(system.file("extdata", "wall_metrics_small.tif", package = "RMFinventory")) names(wall_metrics) <- c("avg", "cov", "std","p10", "p20","p50","p70","p95", "p99","d0","d2","d4","dns")
top_height_wall <- raster::predict(wall_metrics, rfList$top_height$finalModel) ba_ha_wall <- raster::predict(wall_metrics, rfList$ba_ha$finalModel) V_ha_wall <- raster::predict(wall_metrics, rfList$V_ha$finalModel) plot(top_height_wall, main = titles$top_height) plot(ba_ha_wall, main = titles$ba_ha, zlim = c(0, 70)) plot(V_ha_wall, main = titles$V_ha, zlim = c(0, 500))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.