Missing observations and recording errors are fairly common in tracking data.
They can be caused by hardware failures, object occultation, faulty data writing,
etc. trackdf
provides a few functions to help detect this missing or erroneous
data so that you can fix them or omit them altogether from your analysis.
But first, let's load some "flawed" data provided with trackdf
:
library(trackdf) raw <- read.csv(system.file("extdata/gps/01.csv", package = "trackdf")) tt <- track(x = raw$lon, y = raw$lat, t = paste(raw$date, raw$time), id = 1, proj = "+proj=longlat", tz = "Africa/Windhoek") print(tt, max = 10 * ncol(tt))
These are observations that have not been recorded at all. If the data is
recorded at regular intervals, then these missing observations can be easily
detected using the missing_data
function as follows:
missing <- missing_data(tt) missing
The output is a track table with each row corresponding to a time stamp at which at least one coordinate is missing.
Note that you can specify the beginning (begin
) and end (end
) of the
observation window in which you want to detect missing data, as well as the time
difference (step
) between successive observations.
These are observations that are repeated multiple times throughout the data set
(e.g., two observations with identical time stamps for a given individual).
These duplicated observations can be detected using the duplicated_data
function as follows:
dups <- duplicated_data(tt, type = "t") dups
The output is a track table with each row corresponding to an observation that
was partially or completely duplicated, depending on the type
argument. This
argument is a character string or a vector of character strings indicating the
type of duplications to look for. The strings can be any combination of "t" (for
time duplications) and "x", "y", "z" (for coordinate duplications). For instance,
the string "txy" will return data with duplicated time stamps and duplicated x
and y coordinates.
These are observations whose coordinates are too different from the surrounding
(timewise) observations, for instance, because of sporadic errors in GPS
recordings. These inconsistent observations can be detected using the
duplicated_data
function as follows:
inconsistent <- inconsistent_data(tt, s = 15) inconsistent
The output is a track table with each row corresponding to an inconsistent observation.
Note that the detection of inconsistencies requires specifying a threshold (s
)
for distinguishing between consistent and inconsistent observations. Higher
threshold values will result in a lower number of detected inconsistencies, and
reciprocally for lower threshold values.
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.