The trackeR package provides infrastructure for handling running and cycling data from GPS-enabled tracking devices. A short demonstration of its functionality is provided below, based on data from running activities. A more comprehensive introduction to the package can be found in the vignette "Infrastructure for Running and Cycling Data", which can be accessed by typing
vignette("trackeR", package = "trackeR")
knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 7 )
trackeR can currently import files in the Training Centre XML (TCX) format and .db3 files (SQLite databases, used, for example, by devices from GPSports) through the corresponding functions readTCX() and readDB3(). It also offers support for JSON files from Golden Cheetah via readJSON().
library("trackeR")
filepath <- system.file("extdata/tcx/", "2013-06-01-183220.TCX.gz", package = "trackeR")
runDF <- readTCX(file = filepath, timezone = "GMT")
These read functions return a data.frame of the following structure
str(runDF)
That data.frame can be used as an input to the constructor function for trackeR's trackeRdata class, to produce a session-based and unit-aware object that can be used for further analyses.
runTr0 <- trackeRdata(runDF)
The read_container() function combines the two steps of importing the data and constructing the trackeRdata object.
runTr1 <- read_container(filepath, type = "tcx", timezone = "GMT")
identical(runTr0, runTr1)
The read_directory() function can be used to read all supported files in a directory and produce the corresponding trackeRdata objects.
The package includes an example data set which can be accessed through
data("runs", package = "trackeR")
The default behaviour of the plot method for trackeRdata objects is to show how heart rate and pace evolve over the session.
plot(runs, session = 1:7)
The elevation profile of a training session is also accessible, here along with the pace.
plot(runs, session = 8, what = c("altitude", "pace"))
The route taken during a training session can also be plotted on maps from various sources, e.g., Google or OpenStreetMap. This can be done either on a static map
tryCatch(plot_route(runs, session = 1, source = "stamen"),
         error = function(x) "Failed to download map data")
or on an interactive map.
tryCatch(leaflet_route(runs, session = c(1, 6, 12)),
         error = function(x) "Failed to download map data")
The summary of sessions includes basic statistics like duration, time spent moving, average speed, pace, and heart rate. The speed threshold used to distinguish moving from resting can be set by the argument moving_threshold.
summary(runs, session = 1, moving_threshold = c(cycling = 2, running = 1, swimming = 0.5))
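The effect of a moving threshold can be illustrated with a minimal base R sketch. The speed values, the one-second sampling interval, and the strict comparison `speed > threshold` are assumptions made for illustration; they are not taken from the package data or from trackeR's internals.

```r
# Synthetic speed samples in m/s, assumed to be recorded once per second.
speed <- c(0.0, 0.2, 2.5, 3.0, 3.1, 0.4, 0.0, 2.8, 3.2, 3.0)
dt <- 1          # assumed sampling interval in seconds
threshold <- 1   # threshold for running in m/s, as in the call above

# Flag the samples at which the athlete is considered to be moving.
moving <- speed > threshold

# Durations in seconds, overall and while moving.
duration_total  <- length(speed) * dt
duration_moving <- sum(moving) * dt

# Average speed over all samples and over the moving samples only.
avg_speed_total  <- mean(speed)
avg_speed_moving <- mean(speed[moving])

c(total = duration_total, moving = duration_moving)
c(total = avg_speed_total, moving = avg_speed_moving)
```

Raising the threshold shortens the time classed as moving and, because the slowest samples are dropped, raises the average speed while moving.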
It is usually desirable to visualise summaries from multiple sessions. This can be done using the plot method for summary objects. Below, we produce such a plot for average heart rate, average speed, distance, and duration.
runs_summary <- summary(runs)
plot(runs_summary, group = c("total", "moving"),
     what = c("avgSpeed", "distance", "duration", "avgHeartRate"))
The timeline plot is useful for visualising the date and time at which the sessions took place, and it provides information on their relative duration.
timeline(runs_summary)
The time spent training in certain zones, e.g., speed zones, can also be calculated and visualised.
run_zones <- zones(runs[1:4], what = "speed", breaks = c(0, 2:6, 12.5))
plot(run_zones)
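The idea behind time in zones can be sketched in base R with the same breaks as above. The speed samples and the one-second sampling interval are made up for illustration, and the choice of left-closed intervals is an assumption that may differ from what zones() does internally.

```r
# Synthetic speed samples in m/s, assumed to be recorded once per second.
speed <- c(0.5, 1.5, 2.5, 3.5, 3.6, 4.2, 4.8, 5.5, 3.1, 2.2)
dt <- 1                        # assumed sampling interval in seconds
breaks <- c(0, 2:6, 12.5)      # zone boundaries in m/s

# Assign each sample to a zone; intervals are closed on the left.
zone <- cut(speed, breaks = breaks, include.lowest = TRUE, right = FALSE)

# Time per zone in seconds and as a percentage of the session.
time_in_zone <- table(zone) * dt
percent_in_zone <- 100 * time_in_zone / sum(time_in_zone)

time_in_zone
round(percent_in_zone, 1)
```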
trackeR can also be used to calculate and visualise the work capacity W' (pronounced as W prime). The comprehensive vignette "Infrastructure for Running and Cycling Data" provides the definition of work capacity and details on the version and quantity arguments.
wexp <- Wprime(runs, session = 11, quantity = "expended", cp = 4, version = "2012")
plot(wexp, scaled = TRUE)
Kosmidis and Passfield (2015) introduce the concept of distribution and concentration profiles, for which trackeR provides an implementation. These profiles are motivated by the need to compare sessions and to use information on variables such as heart rate or speed during a session for further modelling.
The distribution profile for a variable such as speed or heart rate describes the time spent exercising above a (speed or heart rate) threshold.
Here, the distribution profiles for the first 4 sessions are calculated for speed with thresholds ranging from 0 to 12.5 m/s in increments of 0.05 m/s.
d_profile <- distribution_profile(runs, session = 1:4, what = "speed",
                                  grid = list(speed = seq(0, 12.5, by = 0.05)))
plot(d_profile, multiple = TRUE)
Sessions 1 and 4 are longer than sessions 2 and 3, as is visible from the greater amount of time spent exercising above 0 m/s. Sessions 3 and 4 show a larger amount of time spent exercising above 4 m/s than the other sessions. This is easier to spot in the concentration profiles, which are the negative derivative of the distribution profiles. The concentration profile for session 3 has a mode at around 3.5 m/s and another one above 4 m/s, showing that this session involved training at a combination of low and high speeds.
c_profile <- concentration_profile(d_profile, what = "speed")
plot(c_profile, multiple = TRUE, smooth = TRUE)
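The relationship between the two kinds of profile can be sketched in base R. This is a simplified illustration under stated assumptions (synthetic speeds, one-second sampling, a coarse grid, a strict inequality, and a finite-difference derivative), not a reproduction of how trackeR computes or smooths its profiles.

```r
# Synthetic speed samples in m/s, assumed to be recorded once per second.
speed <- c(2.0, 2.5, 3.0, 3.5, 3.5, 3.5, 4.0, 4.5, 3.0, 2.5)
dt <- 1                          # assumed sampling interval in seconds
grid <- seq(0, 5, by = 0.5)      # speed thresholds in m/s

# Distribution profile: time (in seconds) spent above each threshold.
d_prof <- sapply(grid, function(th) sum(speed > th) * dt)

# Concentration profile: negative finite-difference derivative of the
# distribution profile with respect to the threshold.
c_prof <- -diff(d_prof) / diff(grid)

# The distribution profile never increases with the threshold.
all(diff(d_prof) <= 0)

# Grid interval in which the concentration profile peaks.
grid[which.max(c_prof) + 0:1]
```

In this sketch the concentration profile peaks where the distribution profile drops most steeply, that is, in the speed range where most of the session time was spent.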
More details on distribution and concentration profiles can be found in the comprehensive vignette "Infrastructure for Running and Cycling Data".
The distribution and concentration profiles can be used for further analysis, such as a functional principal components analysis (PCA) to describe the differences between the profiles.
The concentration profiles for all sessions
runsT <- threshold(runs)
dp_runs <- distribution_profile(runsT, what = "speed")
dp_runs_S <- smoother(dp_runs)
cp_runs <- concentration_profile(dp_runs_S)
plot(cp_runs, multiple = TRUE, smooth = FALSE)
vary in their shape (unimodal or multimodal), height, and location (revealing concentrations at higher or lower speeds). The function funPCA() can be used to fit a functional PCA, here with 4 principal components.
cpPCA <- funPCA(cp_runs, what = "speed", nharm = 4)
For the speed concentration profiles here, the first two components cover `r round(sum(cpPCA$varprop[1:2]*100))`% of the variability in the profiles.
round(cpPCA$varprop, 2)
plot(cpPCA, harm = 1:2)
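How a proportion of explained variability arises can be illustrated with an ordinary PCA on discretised curves. This is a stand-in for the functional PCA fitted by funPCA(), not the same method. The curves are synthetic bell shapes whose height and location vary from session to session, and every name in the block is hypothetical.

```r
set.seed(1)

# Common speed grid (m/s) on which all synthetic profiles are evaluated.
grid <- seq(0, 6, by = 0.1)
n_sessions <- 20

# Each synthetic profile differs in overall height and in location.
height   <- runif(n_sessions, 0.5, 1.5)
location <- runif(n_sessions, 3, 4)
curves <- t(sapply(seq_len(n_sessions), function(i) {
  height[i] * dnorm(grid, mean = location[i], sd = 0.4)
}))

# Ordinary PCA with sessions as rows and grid points as columns.
pca <- prcomp(curves, center = TRUE, scale. = FALSE)

# Proportion of variance explained by each component.
varprop <- pca$sdev^2 / sum(pca$sdev^2)

# Percentage of variability covered by the first two components.
round(100 * sum(varprop[1:2]))
```

Because the synthetic curves vary only in height and location, a small number of components is expected to account for most of their variability.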
A plot of the first two principal components reveals that the profiles here differ mostly in the general height of the curves and the location. These two aspects of the profiles can be captured by two univariate measures of the sessions, the time spent moving and the average speed moving.
## plot scores vs summary statistics
scoresSP <- data.frame(cpPCA$scores)
names(scoresSP) <- paste0("speed_pc", 1:4)
d <- cbind(runs_summary, scoresSP)
library("ggplot2")
## pc1 ~ session duration (moving)
ggplot(d) + geom_point(aes(x = as.numeric(durationMoving), y = speed_pc1)) + theme_bw()
## pc2 ~ avg speed (moving)
ggplot(d) + geom_point(aes(x = avgSpeedMoving, y = speed_pc2)) + theme_bw()