Background

This vignette aims to demonstrate the workflows used to perform contact analysis using the wildlifeDI package in R. Specifically, two datasets are used to show how the different functions for contact analysis can be used. The main contact analysis functions in the wildilfeDI package have all been given a 'con' prefix (e.g., conProcess) so that they clearly stand apart from the dynamic interaction indices and other functions available in the package.

Set Up

### Cacheing data for speed
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(cache = TRUE)

Libraries and Packages

library(wildlifeDI)
library(move2)
library(adehabitatLT)
library(ggplot2)
library(sf)
library(igraph)
library(dplyr)

Contact Analysis - Doe Deer Data

First let's take a look at the doe deer data.

data(does)
does

Next, we can make a quick plot of the deer does data to see it on a map. We will color each individual a seperate color.

ggplot(does) + 
  geom_sf(aes(color=mt_track_id(does)))

Processing contacts

We use the conProcess function to identify contacts first. We use a temporal threshold of $t_c$ = 15 minutes (based on the 30 minute tracking data) to define simultaneous fixes. A distance threshold of $d_c$ = 50 m (based on biologically relevant signals between deer and previous literature) was used to define contacts. The parameter dc must be specified in the correct units (i.e., those associated with the tracking dataset). The parameter tc needs to be specified in seconds. We can look at the distribution of all paired fixes (based on tc) to further explore whether our choice for dc makes sense.

dcPlot(does,tc=15*60,dmax=1000)

The red lines in the dcPlot are automatically generated using a natural breaks algorithm to find local minima. They are more for reference, and not to be used for empirical assessment. That being said, it appears that a choice of $d_c$=50 corresponds to the first local minima.

doecons <- conProcess(does,dc=50,tc=15*60)
table(doecons$contact)

We can see that a total of 493 contacts have been identified in the dataset.

Next we can arrange contacts between does into phases of continuous interaction using the function conPhase. A parameter $p_c$ is used to group contacts as belonging to the same phases separated by $p_c$ units in time. The parameter $p_c$ must be specified in seconds. The function conSummary can be used to summarize contacts and phases within the entire dataset to get some basic statistics. It computes how many contacts are observed, and in how many unique segments these occur in, as well as some other values regarding contact phase duration. Here $p_c$ = 60 minutes.

doephas <- conPhase(doecons, pc=60*60)

consum <- doephas |>
    st_drop_geometry() |>
    filter(!is.na(contact_pha)) |>
    dplyr::group_by(contact_pha) |>
    dplyr::summarise(
      nfix = n(),
      t1 = min(date),
      t2 = max(date),
      duration = max(date)-min(date),
      avg_d = mean(contact_d,na.rm=T),
      min_d = min(contact_d,na.rm=T),
      max_d = max(contact_d,na.rm=T)
    ) 
consum

From the phase analysis we can see that there are 98 contact phases (or periods) in the dataset, some contacts are fleeting lasting only a single fix, some are lengthy bouts of interactive behaviour, the longest lasting for r round(as.numeric(max(consum$duration)/3600),digits=1) hours. These summary stats can be customized and queried to answer specific questions related to contact phases.

For example, we can extract more detailed information about the timing and duration of phases within the dataset. We plot the frequency histogram of the initiation of contact phases (by hour) throughout the day and compare that to the frequency histogram of all contacts.

#contact phase initiaition tod
consum$tod <- as.numeric(as.POSIXlt(consum$t1)$hour + as.POSIXlt(consum$t1)$min / 60)  #convert POSIX to hours

conall <- doecons |>
  subset(contact == 1)
conall$tod <- as.numeric(as.POSIXlt(conall$date)$hour + as.POSIXlt(conall$date)$min / 60)  #convert POSIX to hours


h1 <- ggplot(consum,aes(tod)) + 
  geom_histogram(binwidth=1)
h2 <- ggplot(conall,aes(tod)) + 
  geom_histogram(binwidth=1)
h1
h2

We can see the clear diurnal pattern in when contacts occur, which corresponds to known activity peaks in deer.

Mapping Contacts

Using the built in functionality of 'move2' objects we can easily make maps of contacts. First plot the distribution of all contact points on top of the distribution of all GPS fixes.

ggplot() + 
  geom_sf(data=does,aes(color=mt_track_id(does))) +
  geom_sf(data=conall)

We can see that the contacts are clustered around certain locations when compared to all GPS telemetry fixes.

Next, lets map only the initiation of phases (i.e., the first fix in every phase).

pha_fir <- doephas |>
  filter(!is.na(contact_pha)) |>
  group_by(contact_pha) |>
  filter(row_number()==1)


ggplot() + 
  geom_sf(data=does,aes(color=mt_track_id(does))) +
  geom_sf(data=pha_fir)

Here we can see a small difference in the spatial pattern of the initiation of contact phases the original distribution of GPS fixes, and the locations of all contact points.

Finally, lets map the contact phases as lines.

pha_lin <- doephas |>
  filter(!is.na(contact_pha)) |>
  group_by(contact_pha) |>
  summarise(n = dplyr::n(),do_union=FALSE) |>
  filter(n > 1) |>
  st_cast("LINESTRING")

ggplot() + 
  geom_sf(data=does,aes(color=mt_track_id(does))) +
  geom_sf(data=pha_lin)

The map of the phases as lines can be used to provide different insight into the spatial structure of contact phases throughout the study area.

Derive the Contact Network

From a contact list it is relatively straightforward to derive the contact network. We are usually interested in one of two parameters:

The contact matrix can be asymmetric for counts in the case of irregular fixes and/or depending on how the $t_c$ parameter is chosen, but in many (or most) cases it should be symmetric. The contact matrix for rates is typically assymetric because individuals typically have different numbers of overall fixes in a given tracking dataset.

The input into this analysis is simply the output from the conProcess function which can produce a table of all contacts by setting return = 'contact'.

cons <- conProcess(does,dc=50,tc=15*60,return='contacts')


tab_cnt <- cons |>
  count(id1,id2)
gr <- graph_from_data_frame(tab_cnt,directed=FALSE)
E(gr)$weight <- tab_cnt$n

plot(gr)

Comparing contacts to non-contact fixes

We can study if the does behaved differently during contacts compared to other times. To do this we can compare contact fixes to non-contact fixes. Here we show two variables: percent forest cover (related to habitat) which was already present in the data and movement step-length (related to behaviour). We will look at fixes immediately before and after contacts and compare to all other fixes. The function conTimelag is a useful tool for calculating the time from any given fix to the nearest contact fix, which can then be used in subsequent analysis as shown below.

doephas$stepLength <- as.numeric(mt_distance(doephas))

# Calculate time to any contact fix
doephas <- conTimelag(doephas,def='all')

#categorize time to contact as immediately before, contact, after, or non contact (NA) 
#Should be tailored to individual dataset
doephas$dt_lev <-  cut(doephas$contact_timelag, breaks = c(-Inf,-45*60,-15*60,15*60,45*60,Inf), labels = c("Other","Before","Contact","After","Other"))
table(doephas$dt_lev)


ggplot(doephas, aes(x=dt_lev, y=pForest)) + 
  geom_boxplot() +
  labs(x='',y='Forest Cover (%)') 

The boxplots show a visible difference for percent forest cover between contacts and non-contact (other) fixes. Note there is an additional NA boxplot. In this case these refer to an individual that had no contacts with any other individual, and therefore, the function conTimelag returns NA values associated with all fixes for the time to nearest contact. These could be reclassified as 'other' but here it is interesting to at least consider them differently, as this individual seems to be more associated with high forest cover than the other individuals which have higher contact levels.

ggplot(doephas, aes(x=dt_lev, y=stepLength)) + 
  geom_boxplot() +
  labs(x='',y='Step-Length (m)') + 
  scale_y_continuous(trans='log10')

With step-length it is a little different. Visually it appears that the step lengths before contacts are a little bit higher than at other times (during contacts, after contacts, and during non-contacts).

Summary

The wildlifeDI package can be used to tackle a wide range of problems when performing contact analysis using wildlife tracking data. Specifically, it provides tools to process, manage, and analyze contacts spatially and temporally. It provides output data structures that are useful for integration in R's well established statistical modelling packages facilitating further statistical analyses.

Session Information

sessionInfo()


jedalong/wildlifeDI documentation built on April 13, 2024, 2:20 p.m.