options(digits = 3)

knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE,
  cache = FALSE,
  out.width = "70%",
  fig.align = "center",
  fig.height = 9,
  fig.width = 6,
  fig.asp = 0.618,  # 1 / phi
  fig.show = "hold"
)

This article shows how to determine habitat-species associations with the function tt_test(), developed by Sabrina Russo, Daniel Zuleta, Matteo Detto, and Kyle Harms.

Setup

First, install and load ("open") the relevant packages.

# install.packages("remotes")
remotes::install_github("forestgeo/fgeo")
library(dplyr)
library(ggplot2)
library(fgeo.x)
library(fgeo.tool)
library(fgeo.analyze)
# For reproducible results
set.seed(1014)

Load census and habitat data

We will use example datasets that come with fgeo.analyze.

census <- fgeo.x::download_data("luquillo_tree6_random")

str(census)
# Creating habitat data from elevation data
elevation <- fgeo.x::elevation
habitat <- fgeo_habitat(elevation, gridsize = 20, n = 4)
str(habitat)
fgeo.plot::autoplot(habitat)

To load your own data, you may run something like this:

load("PATH/CENSUS_DATA.rdata")
census <- CENSUS_DATA

load("PATH/HABITAT-DATA.rdata")
habitat <- HABITAT-DATA

Pick data

We will pick alive trees, of 10 mm or more, and of sufficiently abundant species.

pick <- filter(
  census,
  # Keep only alive
  status == "A", 
  # Keep dbh of 10 mm or more (drops missing dbh)
  dbh >= 10
)
# Count number of rows per species
pick <- add_count(pick, sp)
# Keep sufficiently abundant trees
pick <- filter(pick, n > 50)

# Summary
unique(select(pick, sp, n))

Overview

Before testing, we can overview the relationship between species and habitats with a plot.

# Tweaks
offset <- 20 / 2
habitat2 <- mutate(
  habitat, 
  # Center species and habitat data
  x = gx + offset, 
  y = gy + offset,
  # From continuous to categorical
  habitats = as.factor(habitats)
)
ggplot(pick, aes(x = gx, y = gy)) +
  geom_raster(data = habitat2, aes(x, y, fill = habitats)) + 
  geom_point() +
  coord_fixed() +
  facet_wrap(~sp) +
  labs(fill = "Habitat")

tt_test() and any number of species

tt_test_result <- tt_test(pick, habitat)
tt_test_result

To help you interpret the results, you can use summary().

summary(tt_test_result)

You may want to combine the output into a single matrix by row-binding each element of the results-list.

Reduce(rbind, tt_test_result)

You also can gather all results into a single dataframe -- this lets you use a wide range of tools for data manipulation and visualization.

as_tibble(tt_test_result)

You can benefit from storing your results in a dataframe. Compared to a matrix, a dataframe fits better in common workflows for data manipulation and visualization. The dataframe is the most important data structure used in dplyr, ggplot2, and many other packages. Here are some examples of what you can do with our dataframe output. (The next few code chunks use the pipe operator (%>%) to avoid saving intermediary results and to make our code more expressive -- where each line is an imperative statement that communicates our intention.)

as_tibble(tt_test_result) %>% 
  filter(sp == "CASARB")
as_tibble(tt_test_result) %>% 
  group_by(sp, habitat) %>% 
  summarize(total_stem_count = sum(N.Hab))


forestgeo/fgeo.analyze documentation built on Dec. 10, 2020, 3:24 p.m.