flag_duplicates: Flag Low Quality Duplicates

View source: R/cleaning.R

flag_duplicates R Documentation


Flags low-quality locations among those with duplicate timestamps, judged by DOP and distance.


flag_duplicates(x, gamma, time_unit = "mins", DOP = "dop", ...)

## S3 method for class 'track_xyt'
flag_duplicates(x, gamma, time_unit = "mins", DOP = "dop", ...)



x: [track_xyt] A track_xyt object.


gamma: [numeric or Period] The temporal tolerance defining duplicates. See details below. If numeric, its units are defined by time_unit. If Period, time_unit is ignored.


time_unit: [character] Character string giving the time unit for gamma. Should be "secs", "mins", or "hours". Ignored if class(gamma) == "Period".


DOP: [character] A character string giving the name of the column containing the dilution of precision (DOP) data. See details below.


...: Additional arguments. None currently implemented.


Locations are considered duplicates if their timestamps are within gamma of each other. However, the function runs sequentially through the track object, so that only timestamps after the focal point are flagged as duplicates (and thus removed from further consideration). E.g., if gamma = minutes(5), then all locations with timestamp within 5 minutes after the focal location will be considered duplicates.
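As a rough illustration of the time window described above, the following base R sketch finds the candidate duplicates of a single focal location. The timestamps are made up, and treating the boundary as inclusive (<= gamma) is an assumption; this is not the package's internal code.

```r
# Hypothetical, already sorted timestamps (seconds after an arbitrary origin)
t <- as.POSIXct(c(0, 120, 290, 600, 660), origin = "2024-01-01", tz = "UTC")
gamma_secs <- 5 * 60  # a 5-minute tolerance, expressed in seconds

focal <- 1  # index of the focal location

# Seconds elapsed between the focal location and every location
elapsed <- as.numeric(difftime(t, t[focal], units = "secs"))

# Candidate duplicates: later rows no more than gamma after the focal row
# (inclusive boundary is an assumption of this sketch)
in_window <- seq_along(t) > focal & elapsed <= gamma_secs
which(in_window)
```

Only rows after the focal row are tested, which mirrors the forward-looking behavior described above.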

When duplicates are found, (1) the location with the lowest dilution of precision (given by the DOP column) is kept. If there are multiple duplicates with equally low DOP, then (2) the one closest in space to the previous location is kept. In the event of exact ties in both DOP and distance, (3) the first location is kept. Such exact ties are unlikely unless there are exact coordinate duplicates.

In the case that the first location in a trajectory has a duplicate, there is no previous location with which to calculate a distance. In that case, the algorithm skips to (3) and keeps the first location.
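The three-step rule can be sketched in base R as a single ordering over DOP, distance, and row position. The helper name choose_keeper and its arguments are hypothetical, and the sketch assumes that DOP is still compared when there is no previous location (only the distance step is skipped); it is an illustration rather than the package's implementation.

```r
# Pick which candidate to keep among a focal location and its duplicates.
# dop:     DOP values of the candidates (lower is better)
# xy:      two-column matrix of candidate coordinates
# prev_xy: coordinates of the previous location, or NULL if there is none
choose_keeper <- function(dop, xy, prev_xy = NULL) {
  if (is.null(prev_xy)) {
    # No previous location: distance cannot break ties (assumption: DOP still applies)
    dist <- rep(0, length(dop))
  } else {
    # Euclidean distance from each candidate to the previous location
    dist <- sqrt((xy[, 1] - prev_xy[1])^2 + (xy[, 2] - prev_xy[2])^2)
  }
  # Sort by (1) DOP, then (2) distance, then (3) original position
  order(dop, dist, seq_along(dop))[1]
}

# Hypothetical candidates: rows 2 and 3 tie on DOP, row 3 is closer to prev_xy
dop <- c(2, 1, 1)
xy <- rbind(c(0, 0), c(10, 0), c(3, 4))
keeper_with_prev <- choose_keeper(dop, xy, prev_xy = c(0, 0))
keeper_no_prev <- choose_keeper(dop, xy)
```

Here keeper_with_prev is row 3 (tied DOP, shorter distance), while keeper_no_prev falls back to row 2, the first of the tied rows.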

In the event your data.frame does not have a DOP column, you can insert a dummy with constant values such that all duplicates will tie, and distance will be the only criterion (e.g., x$dop <- 1). In the event you do have an alternate measure of precision where larger numbers are more precise (e.g., number of satellites), simply multiply that metric by -1 and pass it as if it were DOP.
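The two workarounds might look like this on a plain data.frame. The column names nsat and neg_nsat and the values are hypothetical; the resulting column name would then be supplied through the DOP argument.

```r
# Hypothetical fixes; `nsat` (number of satellites) is an assumed column name
dat <- data.frame(
  x = c(0, 1, 2),
  y = c(0, 1, 2),
  nsat = c(4, 7, 5)
)

# Case 1: no precision information at all, so use a constant dummy DOP.
# Every duplicate then ties on DOP and distance decides.
dat$dop <- 1

# Case 2: larger values mean better precision, so flip the sign.
# The most precise row now has the lowest value, as with DOP.
dat$neg_nsat <- -1 * dat$nsat

# The row with the most satellites is now the row with the minimum metric
best_row <- which.min(dat$neg_nsat)
```

With the second approach, the call would name the negated column, for example DOP = "neg_nsat".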

Internally, the function drops duplicates as it works sequentially through the data.frame. E.g., if location 5 was considered a duplicate of location 4 – and location 4 was higher quality – then location 5 would be dropped. The function would then move on to location 6 (since 5 was already dropped). However, the object returned to the user has all the original rows of x – i.e., locations are flagged rather than removed.
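The distinction between dropping internally and flagging in the output can be sketched as follows. The function name flag_dups_sketch is hypothetical, timestamps are assumed to be sorted numeric seconds, and the keeper rule is simplified to lowest DOP with ties going to the first row (the distance step is omitted), so this is a simplified illustration and not the package's algorithm.

```r
# Simplified sequential pass: returns a logical flag per original row.
flag_dups_sketch <- function(t_secs, dop, gamma_secs) {
  n <- length(t_secs)
  dropped <- rep(FALSE, n)  # internal bookkeeping of dropped rows
  for (i in seq_len(n)) {
    if (dropped[i]) next  # already dropped: never used as a focal location
    # Focal row plus later rows within gamma that are still in play
    in_window <- seq_len(n) >= i & (t_secs - t_secs[i]) <= gamma_secs
    grp <- which(in_window & !dropped)
    if (length(grp) > 1) {
      keep <- grp[which.min(dop[grp])]  # lowest DOP; first row wins ties
      dropped[setdiff(grp, keep)] <- TRUE
    }
  }
  dropped
}

# Hypothetical track: row 5 falls 60 s after row 4 and has worse DOP
dat <- data.frame(
  t_secs = c(0, 600, 1200, 1800, 1860, 2400),
  dop = c(2, 2, 2, 1, 3, 2)
)
dat$duplicate_ <- flag_dups_sketch(dat$t_secs, dat$dop, gamma_secs = 300)
```

All six rows remain in dat; only row 5 carries a TRUE flag, and row 6 is evaluated as the next focal location after row 4.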


Returns x (a track_xyt) with a flagging column added (x$duplicate_).


Brian J. Smith, based on code by Johannes Signer and Tal Avgar

See Also

flag_fast_steps(), flag_roundtrips(), flag_defunct_clusters()

amt documentation built on June 25, 2024, 1:14 a.m.