library(move2) library(assertthat) knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
move2
objectit is based on the sf
objects and compatible with a lot of
dplyr
/tidyverse
based functionality
information of non location data (other sensors as e.g. acceleration, magnetometer,etc) are associated to an empty locations.
track attributes and event attributes are distinguished. event
attributes are attributes associated to each recorded event (location or non
location), these will at least have a time and track id associated to them.
track attributes are attributes associated to each track (e.g. individual,
species, sex, etc), these will at least contain the track id, and can be
retrieved with the function mt_track_data()
To be able to expand and use the object in move2
it is important to understand
how the objects is structured. Here we explain some of the choices and explain
the requirements.
A move object in move2
uses the S3
class system, this is less rigors then
the S4
system that was used in the original move
package. The objects are
based on the sf
objects from the sf
package. This change is inspired by
several factors, first by basing on sf
we are able to profit from the speed
and improvements that went into that package, second it makes it directly
compatible with a lot of dplyr
/tidyverse
based functionality. To ensure
information specific to movement is retrained we use attributes. This is in a
fairly similar style to sf
.
To facilitate working with the associated sensor data we store other records
with an empty point. This means, for example, acceleration and activity
measurements can be part of the same tbl
/data.frame
.
The sf
package and sf
in general allow to store coordinates as three
dimensional records. As the altitude of tracking devices is typically much less
accurate, few functions actually support this functionality we do not use it at
this time.
In the move
package we implemented separate objects for one single individuals
(Move
) and multiple individuals (MoveStack
). Here we choose to not do this.
This reduces complexity. If functions require single individuals to work it is
easy enough to split these of.
Tracking data generally consists of a time series of observations from a range of "sensors". Each of these observation or events at least have a time and a sensor associated with them. Some have a location recorded by, for example, a gps sensor other have non locations data like acceleration or gyroscope measurements. All events are combined in one large dataset, this facilitates combined analysis between them (e.g. interpolation to the position of an acceleration measurement). However for some analysis specific sensors or data types will be needed therefore filtering functions are available that subset the data to, for example, all location data.
To facilitate working with the trajectories we distinguish between track attributes and event attributes. Track level data could be individual and species names, sex and age. This can furthermore greatly facilitate object sizes as that is not duplicated. Keeping track attributes separate also contributes to data integrity as ensures track level attributes are consistent within a track.
In this section we go through the attributes that move2
uses.
time_column
This attributes should contain a string with a length of 1
. This string
indicates in which column the timestamp information of the locations in it. The
string should thus be an existing column. The time column in most cases will
contain timestamps in the POSIXct
format. In some cases timestamps will not be
referring to an exact time point. For example when simulating movement data or
analysis from a video. In these cases times can also be stored as integer
or
numeric
values.
track_id_column
This attribute should contain a string of length 1
. A column with this name
should be contained both in the track_data
attribute and in the main dataset.
This column also functions as the link between the track_data
and the main
data, linking the individual attributes to the individual data.
track_data
This dataset contains the track level data. Properties of the individual follows (e.g. sex, age and name) can be stored here. Additionally other deployment level information can be contained. As the move2 package does not separate individuals, tags and deployments. All information from these 3 entities in movebank are combined here.
time_column
Using the time_column
attribute this column can be identified, for quick
retrieval there is the mt_time
function. Values should be either timestamps
(e.g. POSIXct
, Date
) or numeric
. Numeric values are facilitated as it can
be useful for simulation, videos and laboratory experiments were absolute time
reference is not available or relevant.
track_id_column
This column is identified by the track_id_column
attributes, values can either
be a character
, factor
or integer
like values. For retrieval there is the
mt_track_id
function.
In move
relatively stringent quality checking was done on the object. This
enforced certain attributes for a trajectory that are sensible but in practice
are not always adhered to. Some of these properties are:
Every record had a valid location (except for unUsedRecords
but those were
rarely used)
Records were time ordered within individual
All individuals were ordered
Timestamps could not be duplicated.
Even though these are some useful properties for subsequent work when reading
not all data adheres to these standards. To solve this there were options to
remove duplicated records but these simply took the first record. Here we take a
more permissive approach where less stringent checking is done on the input
side. This means functions working with move2
need to ensure input data
adheres to their expectations. To facilitate that several assertion functions
are provided that can quickly check data. Taking this approach gives the users
more flexibility in resolving inconsistencies within R. We provide several
functions to make this work quick. For specific use cases more informed
functions can be developed.
If you are writing functions based on the move2
package and your function
assumes a specific data structure this can best be checked with assert_that
in
combination with one of the assertion functions. This construct results in
informative error messages:
data <- mt_sim_brownian_motion(1:3)[c(1, 3, 2, 6, 4, 5), ] assert_that(mt_is_time_ordered(data))
To facilitate finding functions and assist in recognizably we use a prefix. For
functions relating to movement trajectories we use mt_
, similar to how the
sf
package uses st_
for spatial type. This prefix has the advantage of being
short compared to move_
. Functions for accessing data from
movebank use the prefix movebank_
. Furthermore do
all assertions functions start with either mt_is_
or mt_has_
.
When analyzing trajectories frequently metrics are calculated that are
properties of the time period in between two observations. Prime examples are
the distance and speed between locations. This means that for each track with a
length of $n$ locations there are $n-1$ measurements. To facilitate storing and
processing this data we pad each track with a NA
value at the end. This
ensured that return vectors from functions like mt_distance
, mt_speed
and
mt_azimuth
return vectors with the same length of as the number of rows in the
move2
object. If the return values from these kind of functions are assigned
to the move2
object the properties stored in the first row reflect the value
for the interval between the first and second row.
Some metrics are calculated as a function of the segment before and after a
segment (e.g. turn angles). In these cases the return vectors still have the
same length however they are padded by a NA
value at the beginning and end of
each track so that the metric is stored with the location it is representative
for.
Data sets have been growing considerably over the past decade since move
was
written. The ambition with move2
is to facilitate this trend. It should work
smoothly with trajectories of more then a million records. We have successfully
loaded up to 30 million events into R, however at some stage memory limitations
of the host computer start being a concern. This can to some extent be
alleviated by omitting unnecessary columns from the data set, either at download
or when reading the data. An alternative approach would be to facilitate working
with trajectories on disk or within a database (alike dbplyr
). However since
many functions and packages we rely on do not support this, we opt not to do
this. Therefore, if reducing the data loaded does not solve the problem, it can
be advisable to use a computer with more memory or when possible split up
analysis per track.
Here we first a quick overview of the most important function.
move2
objectsf::st_coordinates()
: returns the coordinates from the the events in the
track(s)
sf::st_crs()
: returns the projection of the tracks(s)
sf::st_bbox()
: returns the bounding box of the track(s)
mt_time()
: returns the timestamps for each event in the track
mt_track_data()
: returns the table containing the information associated to
the tracks
mt_track_id()
: returns a vector of the track id associated to each event
unique(mt_track_id())
: returns the names of the tracks
mt_n_tracks()
: returns the number of the tracks
nrow()
: returns the total number of events
table(mt_track_id())
: returns the number of events per track
mt_time_column()
: returns the name of the column containing the timestamps
used by the move2
object
mt_track_id_column()
: returns the name of the column containing the track
ids used by the move2
object
move2
objectmt_as_move2()
: creates a move2
object from objects of class sf
,
data.frame
, telemetry
/telemetry list
from ctmm, track_xyt
from amt
or Move
/MoveStack
from move. move2
object into other classesto_move()
: converts to a object of class Move
/MoveStack
x2 <- x; class(x2) <- class(x) %>% setdiff("move2")
: to remove move2
class from the object, it will
be recognized as an object of class sf
to transform into a flat table without loosing information:
x <- mt_as_event_attribute(x, names(mt_track_data(x)))
x <- dplyr::mutate(x, coords_x=sf::st_coordinates(x)[,1], coords_y=sf::st_coordinates(x)[,2])
x <- sf::st_drop_geometry(x)
mt_read()
: read in data downloaded from movebank, by just stating the path to the file
mt_read(mt_example())
: example dataset
dplyr::filter(x, !sf::st_is_empty(x))
: exclude all empty locations
filter_track_data(x, .track_id = c("nameTrack1", "nameTrack3")
: subset to one or more tracks
split(x, mt_track_id(x))
: split a move2
object into a list of single
objects per track. Alternatively see dplyr::mutate()
, dplyr::group_by()
,
group_by_track_data()
to apply calculations to tracks separately
mt_stack()
: combine multiple move2
objects into one
mt_as_track_attribute()
/mt_as_event_attribute()
: move columns between
track and event attributes (and vice versa)
mt_set_track_id()
: replace track ids with new values, set new column to
define tracks or rename track id column
mutate_track_data()
: add or modify attributes in the track data
sf::st_transform()
: to reproject the move2
into a different projection
mt_aeqd_crs()
: create a AEQD coordinate reference system
mt_track_lines()
: convert a trajectory into lines for plotting with e.g.
ggplot
use the argument max.plot = 1
to display a single plot of the track.
The attribute that should be used to color the tracks can be specified, e.g.
plot(x["individual_local_identifier"], max.plot = 1)
.
Here is more information
on how to do simple plots.
All functions of the move2
package are described
here.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.