trackdf can handle multiple types of tracking data (in particular those generated by GPS units and video-tracking software) and multiple data frame classes (base::data.frame, tibble::tibble, and data.table::data.table). This is a design choice meant to accommodate the data processing pipelines of as many users as possible. It lets you use your favorite data manipulation paradigm (base R, dplyr/tidyverse, or data.table) while still standardizing the data format across studies and applications.

A consequence of that versatility, however, is that building a "track table" (the name we give to the structure that will hold your tracking data) requires a little bit of extra work from you (but just a little bit). This vignette shows how to build a track table from raw data generated by, for instance, automated video-tracking software and GPS collars.

2.1 - Anatomy of a track table

At its core, a track table is just a wrapper around a data frame structure, as defined by one of the three main data frame classes in R: base::data.frame, tibble::tibble, and data.table::data.table. Which data frame class sits underneath a track table is entirely up to you and depends on which framework you prefer. trackdf will remember that choice and do its best to maintain it throughout your data analysis pipeline.

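A track table is a specialized version of a data frame structure designed specifically for storing tracking data, that is, the positions over time of one or more individuals. To do that, trackdf imposes a few constraints on the construction of a track table compared to a traditional data frame. First, a track table must have at least the following 4 named columns, which mirror the arguments of the track function used later in this vignette: x and y for the coordinates, t for the time of each observation, and id for the identity of each tracked individual.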

You can then add as many other columns as you want to store other data relevant to your work, but these 4 columns (plus the optional z column) are required in a track table object.
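As a rough illustration of the layout described above, the following base R sketch builds an ordinary data frame (not an actual track table) with columns named id, t, x, and y. The column names are taken from the arguments of the track function; the values and the individual labels are made up for the example.

```r
# Made-up observations for two hypothetical individuals, "a" and "b".
# This is a plain data frame that only mimics the column layout of a
# track table; it is not created by trackdf.
obs <- data.frame(
  id = c("a", "a", "b", "b"),
  t = as.POSIXct(c("2019-03-24 12:55:23", "2019-03-24 12:55:24",
                   "2019-03-24 12:55:23", "2019-03-24 12:55:24"),
                 tz = "UTC"),
  x = c(10.5, 11.0, 42.0, 41.2),
  y = c(3.2, 3.9, 17.5, 18.1)
)

# Inspect the column names and types.
str(obs)
```

To get a real track table, with the extra bookkeeping that trackdf needs, use the track function as shown in the following sections.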

In addition to these columns, a track table carries two attributes that are necessary for certain functions of the package:

Sounds complicated? Don't worry, trackdf provides a function to build track tables with just a little bit of input from you. See the rest of the vignette below.

2.2 - Building a track table from video-tracking data

Most video-tracking software generates output with information about the identity of each tracked individual, their position in some form of Euclidean space (using pixel coordinates or coordinates relative to the dimensions of the experimental setup), and the time of each observation (e.g., the frame number in a video). The output can also contain other information relevant to your work, and we will see here how to import it into a track table as well.

First, let's load some data that was generated using the trackR video-tracking software:

raw <- read.csv(system.file("extdata/video/01.csv", package = "trackdf"))
print(raw, max = 10 * ncol(raw))

This data frame contains 8 columns. The positions are stored in the x and y columns as pixel coordinates. Time is stored in the frame column as the frame number in the video the data was collected from. The identity of each tracked individual is stored in track_fixed (the track column contains the identities before manual inspection and correction; id can be ignored for the purpose of this tutorial).

From this raw data, you can create a track table using the track function as follows:

library(trackdf)

tt <- track(x = raw$x, y = raw$y, t = raw$frame, id = raw$track_fixed)
print(tt, max = 10 * ncol(tt))

track outputs a few warnings, all related to the time component that we provided. We gave it frame numbers, which track does not know how to convert to date-time POSIXct objects; it therefore defaulted to using now as the start of the experiment, UTC as the time zone, and 1 second as the time between two consecutive observations. We can, however, help track by providing the missing information through the origin (start of the experiment), tz (the time zone), and period (time between two successive observations) parameters of the function:

tt <- track(x = raw$x, y = raw$y, t = raw$frame, id = raw$track_fixed, 
            origin = "2019-03-24 12:55:23", 
            period = "0.04S", # 1/25 of a second
            tz = "America/New_York")
print(tt, max = 10 * ncol(tt))
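To build some intuition for how origin, period, and tz combine with frame numbers, here is a minimal base R sketch of the arithmetic. It is not trackdf's internal code: in particular, whether the first frame maps to the origin itself or to one period after it is an assumption made here for illustration, and the frame numbers are made up.

```r
# Start of the experiment, interpreted in the chosen time zone.
origin <- as.POSIXct("2019-03-24 12:55:23", tz = "America/New_York")

# Time between two successive frames, in seconds (25 frames per second).
period <- 0.04

# Hypothetical frame numbers.
frame <- c(1, 2, 3, 26)

# Assumption: each frame is offset from the origin by frame * period seconds.
stamps <- origin + frame * period

# Elapsed time since the origin, in seconds, for each frame.
as.numeric(difftime(stamps, origin, units = "secs"))
```

With a period of 0.04 seconds, 25 frames span one second, which is why the comment in the track call reads "1/25 of a second".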

If you would like to include in the track table some of the additional data contained in your raw data, you can pass them to track as extra named arguments, just as you would add extra columns when creating a data frame. For instance, let's include the ignore data from the raw data set:

tt <- track(x = raw$x, y = raw$y, t = raw$frame, id = raw$track_fixed, 
            ignore = raw$ignore,
            origin = "2019-03-24 12:55:23", 
            period = "0.04S", # 1/25 of a second
            tz = "America/New_York")
print(tt, max = 10 * ncol(tt))

Finally, track defaults to using base::data.frame as its data frame class for storing the data. If you prefer to work with tibble::tibble or data.table::data.table, you can specify this with the table parameter of the track function as follows.

For tibble::tibble:

tt <- track(x = raw$x, y = raw$y, t = raw$frame, id = raw$track_fixed, 
            ignore = raw$ignore,
            origin = "2019-03-24 12:55:23", 
            period = "0.04S", # 1/25 of a second
            tz = "America/New_York",
            table = "tbl")
print(tt)

For data.table::data.table:

tt <- track(x = raw$x, y = raw$y, t = raw$frame, id = raw$track_fixed, 
            ignore = raw$ignore,
            origin = "2019-03-24 12:55:23", 
            period = "0.04S", # 1/25 of a second
            tz = "America/New_York",
            table = "dt")
print(tt)

2.3 - Building a track table from GPS data

Building a track table from geographic data follows similar principles, except that track also expects to receive information about the coordinate reference system the data is using. You can pass that information to track using the proj parameter of the function. But first, let's load some data that was generated by a GPS collar worn by a goat in Namibia:

raw <- read.csv(system.file("extdata/gps/02.csv", package = "trackdf"))
print(raw, max = 10 * ncol(raw))

track uses sf::st_crs to interpret information about coordinate reference systems. Therefore, any format that sf::st_crs accepts for specifying a coordinate reference system can be used with track. For data generated using GPS units, the character string "+proj=longlat" is often all that's needed.
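As a brief illustration of what "any format accepted by sf::st_crs" can look like, the sketch below passes three common specifications to sf::st_crs directly: a PROJ string, a numeric EPSG code, and an "EPSG:" string. The choice of EPSG 4326 (WGS 84 longitude/latitude) is an assumption made for the example; use whatever reference system your GPS data is actually recorded in.

```r
# Three ways of describing a longitude/latitude reference system.
# EPSG 4326 is used here only as an example.
crs_specs <- list("+proj=longlat", 4326, "EPSG:4326")

# Only run the conversion if the sf package is installed.
if (requireNamespace("sf", quietly = TRUE)) {
  # st_crs() turns each specification into a "crs" object.
  crs_list <- lapply(crs_specs, sf::st_crs)

  # Print the object built from the numeric EPSG code.
  print(crs_list[[2]])
}
```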

We can then create our GPS-based track table as follows:

tt <- track(x = raw$lon, y = raw$lat, t = paste(raw$date, raw$time), id = 1,
            proj = "+proj=longlat", tz = "Africa/Windhoek")
print(tt, max = 10 * ncol(tt))

Note that because our raw data already contains the dates and times of the observations, we can simply combine them with paste and pass the result to track, which will interpret it automatically.
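The idea of combining separate date and time columns can be tried out in base R alone. The sketch below uses made-up values standing in for the date and time columns, and parses the pasted strings with as.POSIXct; this only approximates the kind of conversion involved and is not necessarily how track does it internally. The "%Y-%m-%d %H:%M:%S" format is an assumption about how the strings are written.

```r
# Hypothetical values standing in for the date and time columns of the raw data.
date <- c("2015-09-10", "2015-09-10")
time <- c("07:00:00", "07:00:01")

# Combine each date with its time into a single character string.
stamp <- paste(date, time)

# Parse the strings as date-times in the time zone where the data was recorded.
parsed <- as.POSIXct(stamp, format = "%Y-%m-%d %H:%M:%S", tz = "Africa/Windhoek")

# Time elapsed between the two observations.
difftime(parsed[2], parsed[1], units = "secs")
```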

Everything else works similarly to what was shown in the previous section about video-tracking data. The tutorial about manipulating data stored in a track table is provided in a separate vignette.



swarm-lab/trackdf documentation built on March 27, 2023, 2:13 a.m.