FIESTA - Model-Assisted Estimators

knitr::opts_chunk$set(warning = FALSE, message = FALSE)
# Sets up output folding
hooks = knitr::knit_hooks$get()
hook_foldable = function(type) {
  force(type)
  function(x, options) {
    res = hooks[[type]](x, options)

    if (isFALSE(options[[paste0("fold.", type)]])) return(res)

    paste0(
      "<details><summary>", type, "</summary>\n\n",
      res,
      "\n\n</details>"
    )
  }
}
knitr::knit_hooks$set(
  output = hook_foldable("output"),
  plot = hook_foldable("plot")
)
data.table::setDTthreads(2)

Model-Assisted (MA) module overview

FIESTA's Model-Assisted (MA) module calculates population estimates and their sampling errors by taking advantage of available model-assisted survey estimators from the mase R package (McConville, et al. 2018). These estimators can use a variety of auxiliary data to build models and predict over a response variable of interest, while using a bias-correction term so that the bias of the model does not depend on model mis-specification.

Functions in FIESTA used for fitting model-assisted estimators include the modMAarea function for area estimates and modMAtree for tree estimates. The modMApop function is used to get population data needed for model-assisted estimation. Below is a description and table of contents for the sections related to these functions:

FUNCTION | DESCRIPTION -------------- | --------------------------------------------------------------- modMApop | Creates population data for model-assisted estimation. modMAarea | Produces area level estimates through model-assisted estimation. modMAtree | Produces tree level estimates through model-assisted estimation.

Objective of tutorial

The main objective of this tutorial is to demonstrate how to use FIESTA for generating estimates using estimators from mase. The model-assisted estimators can be used with FIA's standard state-level population data (i.e, Evaluation) from the FIA database (FIADB) and also population data from a custom boundary.

The following examples are for generating estimates and estimated variances using standard FIA Evaluation data from FIA's National database, with custom Estimation unit and Stratification information. The examples use data from three inventory years of field measurements in the state of Wyoming, from FIADB_1.7.2.00, last updated June 20, 2018, downloaded on June 25, 2018 and stored as internal data objects in FIESTA.

Example data - Wyoming (WY), Inventory Years 2011-2012

View MA Example Data

Data Frame | Description -----------| -------------------------------------------------------------------------------- WYplt | WY plot-level data WYcond | WY condition-level data WYtree | WY tree-level data

External data | Description -------------------------| ------------------------------------------------------------------ WYbighorn_adminbnd.shp | Polygon shapefile of WY Bighorn National Forest Administrative boundary WYbighorn_districtbnd.shp| Polygon shapefile of WY Bighorn National Forest District boundaries WYbighorn_forest_nonforest_250m.tif| GeoTIFF raster of predicted forest/nonforest (1/0) for stratification WYbighorn_dem_250m.img | Erdas Imagine raster of elevation change, in meters

*USDA Forest Service, Automated Lands Program (ALP). 2018. S_USA.AdministrativeForest (http://data.fs.usda.gov/geodata/edw). Description: An area encompassing all the National Forest System lands administered by an administrative unit. The area encompasses private lands, other governmental agency lands, and may contain National Forest System lands within the proclaimed boundaries of another administrative unit. All National Forest System lands fall within one and only one Administrative Forest Area.

**USDA Forest Service, Automated Lands Program (ALP). 2018. S_USA.RangerDistrict (http://data.fs.usda.gov/geodata/edw). Description: A depiction of the boundary that encompasses a Ranger District.

***Based on MODIS-based classified map resampled from 250m to 500m resolution and reclassified from 3 to 2 classes: 1:forest; 2:nonforest. Projected in Albers Conical Equal Area, Datum NAD27 (Ruefenacht et al. 2008). Clipped to extent of WYbighorn_adminbnd.shp.

****USGS National Elevation Dataset (NED), resampled from 30m resolution to 250m. Projected in Albers Conical Equal Area, Datum NAD27 (U.S. Geological Survey 2017). Clipped to boundary of WYbighorn_adminbnd.shp.

Set up

First, you'll need to load the FIESTA library:

library(FIESTA)

Next, you'll need to set up an "outfolder". This is just a file path to a folder where you'd like FIESTA to send your data output. For our purposes in this vignette, we have saved our outfolder file path as the outfolder object in a temporary directory. We also set a few default options preferred for this vignette.

outfolder <- tempdir()

Get data for examples

View Getting Data

Now that we've loaded FIESTA and setup our outfolder, we can retrieve the data needed to run the examples. First, we point to some external data and predictor layers stored in FIESTA and derive new predictor layers using the terra package.

# File names for external spatial data
WYbhfn <- system.file("extdata", "sp_data/WYbighorn_adminbnd.shp",
                      package = "FIESTA")
WYbhdistfn <- system.file("extdata", "sp_data/WYbighorn_districtbnd.shp",
                          package = "FIESTA")

## predictor variables
fornffn <- system.file("extdata", "sp_data/WYbighorn_forest_nonforest_250m.tif",
                       package = "FIESTA")
demfn <- system.file("extdata", "sp_data/WYbighorn_dem_250m.img",
                     package = "FIESTA")

# Derive new predictor layers from dem
library(terra)
dem <- rast(demfn)
slpfn <- paste0(outfolder, "/WYbh_slp.img")
slp <- terra::terrain(dem,
                      v = "slope",
                      unit = "degrees",
                      filename = slpfn, 
                      overwrite = TRUE,
                      NAflag = -99999.0)
aspfn <- paste0(outfolder, "/WYbh_asp.img")
asp <- terra::terrain(dem,
                      v = "aspect",
                      unit = "degrees", 
                      filename = aspfn,
                      overwrite = TRUE,
                      NAflag = -99999.0)

Next, we can get our FIA plot data and set up our auxiliary data. We can get our FIA plot data with the spMakeSpatialPoints function from FIESTA. For more information on how to use this function, please see the sp vignette included with FIESTA (link).

WYspplt <- spMakeSpatialPoints(
  xyplt = WYplt,
  xy.uniqueid = "CN",
  xvar = "LON_PUBLIC",
  yvar = "LAT_PUBLIC",
  xy.crs = 4269
)

rastlst.cont <- c(demfn, slpfn, aspfn)
rastlst.cont.name <- c("dem", "slp", "asp")
rastlst.cat <- fornffn
rastlst.cat.name <- "fornf"

Next, we must generate dataset for model-assisted estimation. We can do this with the spGetAuxiliary function from FIESTA. Again, see the sp vignette for further information on this function.

modeldat <- spGetAuxiliary(
  xyplt = WYspplt,
  uniqueid = "CN",
  unit_layer = WYbhfn,
  unitvar = NULL,
  rastlst.cont = rastlst.cont,
  rastlst.cont.name = rastlst.cont.name,
  rastlst.cat = rastlst.cat,
  rastlst.cat.name = rastlst.cat.name,
  rastlst.cont.stat = "mean",
  asptransform = TRUE,
  rast.asp = aspfn,
  keepNA = FALSE,
  showext = FALSE,
  savedata = FALSE)
str(modeldat, max.level = 1)

Examples

modMApop

Example 1: Creating our population dataset with modMApop

View Example

We can create our population data for model-assisted estimation. To do so, we use the modMApop function in FIESTA. We must assign our population tables with the popTabs argument (and unique identifiers for these tables with the popTabIDs argument if they are not the default), the plot assignment with the pltassgn argument, and in auxiliary dataset we just created with the auxdat argument. The spGetAuxiliary function has done much of the hard work for us so far, so we can just run a simple call to modMApop:

MApopdat <- modMApop(popTabs = list(tree = WYtree, cond = WYcond),
                     auxdat = modeldat)

Note that the modMApop function returns a list with lots of information and data for us to use. For a quick look at what this list includes we can use the str function:

str(MApopdat, max.level = 1)

Now that we've created our population dataset, we can move on to estimation.

modMAarea

Example 2: Area of forest land, Wyoming, 2011-2013

View Example

In this example, we look at estimating the area of forest land in Wyoming from 2011 to 2013 summed to the population unit (sumunit = TRUE) with the generalized regression estimator (MAmethod = "greg"). FIESTA returns raw data for area of forest land, Wyoming, 2011-2013 (sum estimation units).

area1 <- modMAarea(
  MApopdat = MApopdat, # pop - population calculations for WY, post-stratification
  MAmethod = "greg", # est - model-assisted method
  landarea = "FOREST" # est - forest land filter
  )

We can look at the structure of this output with str and the estimates below. Note that again FIESTA outputs a list.

str(area1, max.level = 2)

area1$est

Example 3: Area of forest land, Wyoming, 2011-2013, using the Elastic Net for variable selection

View Example

Here, we fit the same model as the above example, but rather than using "greg" are our model-assisted method, we can use "gregEN" where the EN stands for "elastic net". The elastic net performs variable selection for us, grabbing predictors it finds to be most useful in the model.

area2 <- modMAarea(
  MApopdat = MApopdat, # pop - population calculations for WY, post-stratification
  MAmethod = "gregEN", # est - model-assisted method
  landarea = "FOREST", # est - forest land filter
  )

We can again see that the structure of the list is very similar to that in the above example:

str(area2, max.level = 2)

However now the raw list has an item call predselectlst. We can look at this item now:

area2$raw$predselectlst$totest

Notably, we can see that dem, slp, asp_cos, and asp_sin were removed from the model.

Example 4: Area by forest type on forest land, Wyoming, 2011-2013

View Example

In this example, we look at adding rows to the output and include returntitle=TRUE to return title information.

area3 <- modMAarea(
    MApopdat = MApopdat,         # pop - population calculations for WY, post-stratification
    MAmethod = "greg",           # est - model-assisted method
    landarea = "FOREST",         # est - forest land filter
    rowvar = "FORTYPCD",         # est - row domain
    returntitle = TRUE           # out - return title information
    )

Again, we can look at the contents of the output list. The output now includes titlelst, a list of associated titles.

str(area3, max.level = 1)

And the estimates:

## Estimate and percent sampling error of estimate
area3$est

Along with raw data and titles:

## Raw data (list object) for estimate
raw3 <- area3$raw      # extract raw data list object from output
names(raw3)
head(raw3$unit_totest) # estimates by estimation unit (i.e., ESTN_UNIT)
raw3$totest            # estimates for population (i.e., WY)
head(raw3$unit_rowest) # estimates by row, by estimation unit (i.e., ESTN_UNIT)
head(raw3$rowest)      # estimates by row for population (i.e., WY)


## Titles (list object) for estimate
titlelst3 <- area3$titlelst
names(titlelst3)
titlelst3

Example 5: Area by forest type and stand-size class on forest land, Wyoming, 2011-2013

View Example

In this example, we look at adding rows and columns to output, including FIA names. We also output estimates and percent standard error in the same cell with the allin1 argument in table_options and save data to an outfolder with the outfolder argument in savedata_options.

## Area of forest land by forest type and stand-size class, Wyoming, 2011-2013
area4 <- modMAarea(
    MApopdat = MApopdat,         # pop - population calculations for WY, post-stratification
    MAmethod = "greg",           # est - model-assisted method
    landarea = "FOREST",         # est - forest land filter
    rowvar = "FORTYPCD",         # est - row domain
    colvar = "STDSZCD",          # est - column domain
    savedata = TRUE,             # out - save data to outfolder
    returntitle = TRUE,          # out - return title information
    table_opts = list(
      row.FIAname = TRUE,          # table - row domain names
      col.FIAname = TRUE,          # table - column domain names
      allin1 = TRUE                # table - return output with est(pse)
      ),
    savedata_opts = list(
      outfolder = outfolder,       # save - outfolder for saving data
      outfn.pre = "WY"             # save - prefix for output files
      )
    )

area4$est

We can again look at the output list, estimates, raw data, and titles:

## Look at output list
names(area4)

## Estimate and percent sampling error of estimate
head(area4$est)


## Raw data (list object) for estimate
raw4 <- area4$raw      # extract raw data list object from output
names(raw4)
head(raw4$unit_totest) # estimates by estimation unit (i.e., ESTN_UNIT)
head(raw4$totest)      # estimates for population (i.e., WY)
head(raw4$unit_rowest) # estimates by row, by estimation unit (i.e., ESTN_UNIT)
head(raw4$rowest)      # estimates by row for population (i.e., WY)
head(raw4$unit_colest) # estimates by column, by estimation unit (i.e., ESTN_UNIT)
head(raw4$colest)      # estimates by column for population (i.e., WY)
head(raw4$unit_grpest) # estimates by row and column, by estimation unit (i.e., ESTN_UNIT)
head(raw4$grpest)      # estimates by row and column for population (i.e., WY)


## Titles (list object) for estimate
titlelst4 <- area4$titlelst
names(titlelst4)
titlelst4


## List output files in outfolder
list.files(outfolder, pattern = "WY_area")
list.files(paste0(outfolder, "/rawdata"), pattern = "WY_area")

modMAtree

We will set our estimate variable and filter now. We set estvar to "VOLCFNET" for net cubic foot volume, and filter with estvar.filter set to "STATUSCD == 1" so we only consider live trees in our estimation.

estvar <- "VOLCFNET"
live_trees <- "STATUSCD == 1"

Example 6: Net cubic-foot volume of live trees, Wyoming, 2011-2013

View Example

We now will generate estimates by estimation unit (i.e., ESTN_UNIT) and sum to population (i.e., WY) with modMAtree.

## Return raw data and titles
## Total net cubic-foot volume of live trees (at least 5 inches diameter), Wyoming, 2011-2013 
tree1 <- modMAtree(
    MApopdat = MApopdat,         # pop - population calculations
    MAmethod = "greg",           # est - model-assisted method
    landarea = "FOREST",         # est - forest land filter
    estvar = estvar,             # est - net cubic-foot volume
    estvar.filter = live_trees,  # est - live trees only
    returntitle = TRUE           # out - return title information
    )

names(tree1)
tree1$raw$unit_totest

Example 7: Net cubic-foot volume of live trees, Wyoming, 2011-2013, using the Elastic Net for variable selection

View Example

Here, we fit the same model as the above example, but rather than using "greg" are our model-assisted method, we can use "gregEN" where the EN stands for "elastic net". The elastic net performs variable selection for us, grabbing predictors it finds to be most useful in the model.

## Return raw data and titles
## Total net cubic-foot volume of live trees (at least 5 inches diameter), Wyoming, 2011-2013 
tree2 <- modMAtree(
    MApopdat = MApopdat,         # pop - population calculations
    MAmethod = "gregEN",         # est - model-assisted method
    landarea = "FOREST",         # est - forest land filter
    estvar = estvar,             # est - net cubic-foot volume
    estvar.filter = live_trees,  # est - live trees only
    returntitle = TRUE           # out - return title information
    )

We can again see that the structure of the list is very similar to that in the above example:

str(tree2, max.level = 2)

However now the raw list has an item call predselectlst. We can look at this item now:

tree2$raw$predselectlst

Notably, we can see that [INSERT CORRECT PREDS] dem, slp, asp_cos, and asp_sin were removed from the model.

Example 8: Net cubic-foot volume of live trees by forest type, Wyoming, 2011-2013

View Example

This example adds rows to the output for net cubic-foot volume of live trees (at least 5 inches diameter) by forest type, Wyoming, 2011-2013. We also choose to return titles with returntitle = TRUE.

tree3 <- modMAtree(
    MApopdat = MApopdat,         # pop - population calculations
    MAmethod = "greg",           # est - model-assisted method
    landarea = "FOREST",         # est - forest land filter
    estvar = "VOLCFNET",               # est - net cubic-foot volume
    estvar.filter = "STATUSCD == 1",   # est - live trees only
    rowvar = "FORTYPCD",         # est - row domain 
    returntitle = TRUE           # out - return title information
    )

Again, we investigate the output of the returned list:

## Look at output list
names(tree3)

## Estimate and percent sampling error of estimate
tree3$est

## Raw data (list object) for estimate
raw3 <- tree3$raw      # extract raw data list object from output
names(raw3)
head(raw3$unit_totest)   # estimates by estimation unit (i.e., ESTN_UNIT)
head(raw3$totest)        # estimates for population (i.e., WY)
head(raw3$unit_rowest)   # estimates by row, by estimation unit (i.e., ESTN_UNIT)
head(raw3$rowest)        # estimates by row for population (i.e., WY)


## Titles (list object) for estimate
titlelst3 <- tree3$titlelst
names(titlelst3)
titlelst3

We can also create a simple barplot from the output:

## Create barplot
datBarplot(
      raw3$unit_rowest, 
      xvar = titlelst3$title.rowvar, 
      yvar = "est"
      )

And a fancier barplot:

## Create fancier barplot
datBarplot(
      raw3$unit_rowest, 
      xvar = titlelst3$title.rowvar, 
      yvar = "est",
      errbars = TRUE, 
      sevar = "est.se", 
      main = FIESTAutils::wraptitle(titlelst3$title.row, 75),
      ylabel = titlelst3$title.yvar, 
      divideby = "million"
      )

Example 9: Net cubic-foot volume of live trees by forest type and stand-size class, Wyoming, 2011-2013

View Example

This examples adds rows and columns to the output, including FIA names, for net cubic-foot volume of live trees (at least 5 inches diameter) by forest type and stand-size class, Wyoming, 2011-2013. We also use the *_options functions to return output with estimates (est) and percent standard error (pse) in same cell - est(pse) with allin1 = TRUE and save data to an outfolder with savedata = TRUE and outfolder = outfolder.

tree4 <- modMAtree(
    MApopdat = MApopdat,         # pop - population calculations
    MAmethod = "greg",           # est - model-assisted method
    landarea = "FOREST",         # est - forest land filter
    estvar = "VOLCFNET",               # est - net cubic-foot volume
    estvar.filter = "STATUSCD  == 1",   # est - live trees only
    rowvar = "FORTYPCD",         # est - row domain
    colvar = "STDSZCD",          # est - column domain
    returntitle = TRUE,          # out - return title information
    savedata = TRUE,             # out - save data to outfolder
    table_opts = table_options(
      row.FIAname = TRUE,          # est - row domain names
      col.FIAname = TRUE,          # est - column domain names
      allin1 = TRUE                # out - return output with est(pse)
    ),
    savedata_opts = savedata_options(
      outfolder = outfolder,       # out - outfolder for saving data
      outfn.pre = "WY"             # out - prefix for output files
      )
    )

Again, we investigate the output of the returned list:

## Look at output list from modGBarea()
names(tree4)

## Estimate and percent sampling error of estimate
tree4$est


## Raw data (list object) for estimate
raw4 <- tree4$raw      # extract raw data list object from output
names(raw4)
head(raw4$unit_totest)   # estimates by estimation unit (i.e., ESTN_UNIT)
head(raw4$totest)        # estimates for population (i.e., WY)
head(raw4$unit_rowest)   # estimates by row, by estimation unit (i.e., ESTN_UNIT)
head(raw4$rowest)        # estimates by row for population (i.e., WY)
head(raw4$unit_colest)   # estimates by column, by estimation unit (i.e., ESTN_UNIT)
head(raw4$colest)        # estimates by column for population (i.e., WY)
head(raw4$unit_grpest)   # estimates by row and column, by estimation unit (i.e., ESTN_UNIT)
head(raw4$grpest)        # estimates by row and column for population (i.e., WY)


## Titles (list object) for estimate
titlelst4 <- tree4$titlelst
names(titlelst4)
titlelst4


## List output files in outfolder
list.files(outfolder, pattern = "WY_tree")
list.files(paste0(outfolder, "/rawdata"), pattern = "WY_tree")

Example 10: Number of live trees by species, Wyoming, 2011-2013

View Example

We can use tree domain in estimation output rows:

## Number of live trees (at least 1 inch diameter) by species
tree5 <- modMAtree(
    MApopdat = MApopdat,         # pop - population calculations
    MAmethod = "greg",           # est - model-assisted method
    landarea = "FOREST",         # est - forest land filter
    estvar = "TPA_UNADJ",               # est - number of trees per acre 
    estvar.filter = "STATUSCD == 1",    # est - live trees only
    rowvar = "SPCD",             # est - row domain
    returntitle = TRUE,          # out - return title information
    table_opts = table_options(    
      row.FIAname = TRUE,          # est - row domain names
      allin1 = FALSE               # out - return output with est and pse
      )
    )

We can also look at the output list and estimates again:

## Look at output list
names(tree5)

## Estimate and percent sampling error of estimate
tree5$est

Example 11: Number of live trees (plus seedlings) by species, Wyoming, 2011-2013

View Example

We can also add seedlings.

Note: seedling data are only available for number of trees (estvar = TPA_UNADJ).

Note: must include seedling data in population data calculations.

MApopdat_seed <- modMApop(popTabs = list(tree = WYtree,
                                         cond = WYcond,
                                         seed = WYseed),
                     pltassgn = WYpltassgn,
                     auxdat = modeldat)
## Number of live trees by species, including seedlings
tree6 <- modMAtree(
    MApopdat = MApopdat_seed,         # pop - population calculations
    MAmethod = "greg",           # est - model-assisted method
    estseed = "add",             # est - add seedling data
    landarea = "FOREST",         # est - forest land filter
    estvar = "TPA_UNADJ",               # est - number of trees per acre 
    estvar.filter = "STATUSCD == 1",    # est - live trees only
    rowvar = "SPCD",             # est - row domain
    returntitle = TRUE,          # out - return title information
    table_opts = table_options(
      row.FIAname = TRUE,          # est - row domain names
      allin1 = FALSE)              # out - return output with est and pse
    )

And again we can look at our outputs and compare estimates:

## Look at output list
names(tree6)

## Estimate and percent sampling error of estimate
tree6$est


## Compare estimates with and without seedlings
head(tree5$est)
head(tree6$est)



Try the FIESTA package in your browser

Any scripts or data that you put into this service are public.

FIESTA documentation built on Nov. 22, 2023, 1:07 a.m.