knitr::opts_chunk$set( collapse = TRUE, comment = "#>", out.width = "100%", echo = TRUE, cache = FALSE, message = FALSE )
The arctools
package allows to generate summaries of the minute-level physical activity (PA) data. The default parameters are chosen for the Actigraph activity counts collected with a wrist-worn device; however, the package can be used for other minute-level PA data with the corresponding timepstamps vector.
Below, we demonstrate the use of arctools
with the attached, exemplary minute-level Actigraph PA counts data.
arctools
package to compute physical activity summariesThe arctools
functions process one file with accelerometry data at a time.
Four CSV data sets with minute-level activity counts data are attached to the arctools
package. The data file names are stored in extdata_fnames
object that becomes available once the arctools
package is loaded.
Below, we defined fpath
to be a path to one of the minute-level activity counts data files. fread()
reads minute-level activity counts data file while conveniently skipping first few rows with meta data, and then as.data.frame()
converts the read data into a data frame object. The read-in data is assigned to dat
variable. head()
and tail()
get first few and last few rows of dat
, respectively.
library(arctools) library(data.table) library(dplyr) library(ggplot2) library(lubridate) ## Read one of the data sets fpath <- system.file("extdata", extdata_fnames[1], package = "arctools") dat <- as.data.frame(fread(fpath)) rbind(head(dat, 3), tail(dat, 3))
The data columns are:
Axis1
- sensor's X axis minute-level counts data,Axis2
- sensor's Y axis minute-level counts data,Axis3
- sensor's Z axis minute-level counts data,vectormagnitude
- minute-level counts data defined as sqrt(Axis1^2 + Axis2^2 + Axis3^2)
,timestamp
- time-stamps corresponding to minute-level measures. ## Plot activity counts ## Format timestamp data column from character to POSIXct object ggplot(dat, aes(x = ymd_hms(timestamp), y = vectormagnitude)) + geom_line(size = 0.3, alpha = 0.8) + labs(x = "Time", y = "Activity counts") + theme_gray(base_size = 10) + scale_x_datetime(date_breaks = "1 day", date_labels = "%b %d")
activity_stats
methodacc <- dat$vectormagnitude acc_ts <- ymd_hms(dat$timestamp) activity_stats(acc, acc_ts)
To explain activity_stats
method output, we first define the terms activity count, active/non-active minute, active/non-active bout, and valid day.
?activity_stats
). Meta information:
n_days
- number of days (unique day dates) of data collection.n_valid_days
- number of days (unique day dates) of data collection determined as valid days. wear_time_on_valid_days
- average number of wear-time minutes across valid days.Summaries of PA volumes metrics:
tac
- TAC, Total activity counts per day - sum of AC measured on valid days divided by the number of valid days.tlac
- TLAC, Total-log activity counts per day - sum of log(1+AC) measured on valid days divided by the number of valid days. Here 'log' denotes the natural logarithm.ltac
- LTAC, Log-total activity counts - natural logarithm of TAC.time_spent_active
- Average number of active minutes per valid day.time_spent_nonactive
- Average number of sedentary minutes per valid day.Summaries of PA fragmentation metrics:
astp
- ASTP, active to sedentary transition probability on valid days. satp
- SATP, sedentary to active transition probability on valid days. no_of_active_bouts
- Average number of active minutes per valid day.no_of_nonactive_bouts
- Average number of sedentary minutes per valid day.mean_active_bout
- Average duration (in minutes) of an active bout on valid days.mean_nonactive_bout
- Average duration (in minutes) of a sedentary bout on valid days.activity_stats
method optionsThe subset_minutes
argument allows to specify a subset of a day's minutes where activity summaries should be computed. There are 1440 minutes in a 24-hour day where 1
denotes 1st minute of the day (from 00:00 to 00:01), and 1440
denotes the last minute (from 23:59 to 00:00).
Here, we summarize PA observed between 12:00 AM and 6:00 AM.
subset_12am_6am <- 1 : (6 * 1440/24) activity_stats(acc, acc_ts, subset_minutes = subset_12am_6am)
By default, column names have a suffix added to denote that a subset of minutes was used (here, _0to6only
). This can be disabled by setting adjust_out_colnames
to FALSE
.
subset_12am_6am = 1 : (6/24 * 1440) subset_6am_12pm = (6/24 * 1440 + 1) : (12/24 * 1440) subset_12pm_6pm = (12/24 * 1440 + 1) : (18/24 * 1440) subset_6pm_12am = (18/24 * 1440 + 1) : (24/24 * 1440) out <- rbind( activity_stats(acc, acc_ts, subset_minutes = subset_12am_6am, adjust_out_colnames = FALSE), activity_stats(acc, acc_ts, subset_minutes = subset_6am_12pm, adjust_out_colnames = FALSE), activity_stats(acc, acc_ts, subset_minutes = subset_12pm_6pm, adjust_out_colnames = FALSE), activity_stats(acc, acc_ts, subset_minutes = subset_6pm_12am, adjust_out_colnames = FALSE)) rownames(out) <- c("12am-6am", "6am-12pm", "12pm-6pm", "6pm-12am") out
The subset_weekdays
argument allows to specify days of a week within which activity summaries are to be computed; it takes values between 1 (Sunday) to 7 (Saturday). Default is NULL
(all days of a week are used).
Here, we summarize PA within weekday days only. Note that in the method output, the n_days
and n_valid_days
columns only count the days from the selected week days subset; for example, below, n_days
number of unique day dates in data is 6 despite the range of data collection without subsetting ranges 8 days.
# day of a week indices 2,3,4,5,6 correspond to Mon,Tue,Wed,Thu,Fri subset_weekdays <- c(2:6) activity_stats(acc, acc_ts, subset_weekdays = subset_weekdays)
Note the subset_weekdays
argument can be combined with other arguments, i.e. subset_minutes
to subset of a day's minutes where activity summaries should be computed.
# day of a week indices 7,1 correspond to Sat,Sun subset_weekdays <- c(7,1) activity_stats(acc, acc_ts, subset_weekdays = subset_weekdays, subset_minutes = subset_6am_12pm)
The exclude_minutes
argument allows specifying a subset of a day's minutes excluded for computing activity summaries.
Here, we summarize PA while excluding observations between 11:00 PM and 5:00 AM.
subset_11pm_5am <- c( (23 * 1440/24 + 1) : 1440, ## 11:00 PM - midnight 1 : (5 * 1440/24) ## midnight - 5:00 AM ) activity_stats(acc, acc_ts, exclude_minutes = subset_11pm_5am)
The in_bed_time
and out_bed_time
arguments allow to provide day-specific in-bed periods to be excluded from analysis.
Here, we summarize PA excluding in-bed time estimated by ActiLife software.
The ActiLife-estimated in-bed data file is attached to the arctools
package. The sleep data columns include:
Subject Name
- subject IDs corresponding to AC data, stored in extdata_fnames
,In Bed Time
- ActiLife-estimated start of in-bed interval for each day of the measurement, Out Bed Time
- ActiLife-estimated end of in-bed interval. ## Read sleep details data file SleepDetails_fname <- "BatchSleepExportDetails_2020-05-01_14-00-46.csv" SleepDetails_fpath <- system.file("extdata", SleepDetails_fname, package = "arctools") SleepDetails <- as.data.frame(fread(SleepDetails_fpath)) ## Filter sleep details data to keep ID1 file SleepDetails_sub <- SleepDetails %>% filter(`Subject Name` == "ID_1") %>% select(`Subject Name`, `In Bed Time`, `Out Bed Time`) str(SleepDetails_sub)
We transform dates stored as character into POSIXct
object, and then use in/out-bed dates vectors in activity_stats
method.
in_bed_time <- mdy_hms(SleepDetails_sub[, "In Bed Time"]) out_bed_time <- mdy_hms(SleepDetails_sub[, "Out Bed Time"]) activity_stats(acc, acc_ts, in_bed_time = in_bed_time, out_bed_time = out_bed_time)
activity_stats
methodThe primary method activity_stats
is composed of several steps implemented in their respective functions. Below, we demonstrate how to produce activity_stats
results step by step with these functions.
We reuse the objects:
acc
- a numeric vector; minute-level activity counts data,acc_ts
- a POSIXct
vector; minute-level time of acc
data collection. df <- data.frame(acc = acc, acc_ts = acc_ts) rbind(head(df, 3), tail(df, 3))
midnight_to_midnight
00:00-00:01
on the first day of data collection, and the last observation corresponds to the minute of 23:50-00:00
on the last day of data collection. NA
.Here, collected data cover total of 7*24*1440 = 10080
minutes (from 2018-07-13 10:00:00
to 2018-07-20 09:59:00
), but spans 8*24*1440 = 11520
minutes of full midnight-to-midnight days (from 2018-07-13 00:00:00
to 2018-07-20 23:59:00
).
acc <- midnight_to_midnight(acc = acc, acc_ts = acc_ts) ## Vector length on non NA-obs, vector length after acc c(length(acc[!is.na(acc)]), length(acc))
get_wear_flag
Function get_wear_flag
computes wear/non-wear flag (1/0
) for each minute of activity counts data. Method implements wear/non-wear detection algorithm closely following that of Choi et al. (2011). See ?get_wear_flag
for more details and function arguments.
1
for wear and 0
for non-wear flagged minute.NA
entry in a data input vector, then the returned vector will have a corresponding entry set to NA
too.wear_flag <- get_wear_flag(acc) ## Proportion of wear time across the days wear_flag_mat <- matrix(wear_flag, ncol = 1440, byrow = TRUE) round(apply(wear_flag_mat, 1, sum, na.rm = TRUE) / 1440, 3)
get_valid_day_flag
Function get_valid_day_flag
computes valid/non-valid day flag (1/0
) for each minute of activity counts data. See ?get_valid_day_flag
for more details and function arguments.
Here, 4 out of 8 days have more than 10% (144 minutes) of missing data.
valid_day_flag <- get_valid_day_flag(wear_flag) ## Compute number of valid days valid_day_flag_mat <- matrix(valid_day_flag, ncol = 1440, byrow = TRUE) apply(valid_day_flag_mat, 1, mean, na.rm = TRUE)
impute_missing_data
Function impute_missing_data
imputes missing data in valid days based on the "average day profile", a minute-wise average of wear-time AC across valid days. See ?get_valid_day_flag
for more details and function arguments.
## Copies of original objects for the purpose of demonstration acc_cpy <- acc wear_flag_cpy <- wear_flag ## Artificially replace 1h (4%) of a valid day with non-wear repl_idx <- seq(from = 1441, by = 1, length.out = 60) acc_cpy[repl_idx] <- 0 wear_flag_cpy[repl_idx] <- 0 ## Impute data for minutes identified as non-wear in days identified as valid acc_cpy_imputed <- impute_missing_data(acc_cpy, wear_flag_cpy, valid_day_flag) ## Compare mean activity count on valid days before and after imputation c(mean(acc_cpy[which(valid_day_flag == 1)]), mean(acc_cpy_imputed[which(valid_day_flag == 1)]))
summarize_PA
Finally, method summarize_PA
computes PA summaries. Similarly as activity_stats
, it accepts arguments to subset/exclude minutes. See ?activity_stats
for more details and function arguments.
summarize_PA(acc, acc_ts, wear_flag, valid_day_flag)
It returns the same results as the activity_stats
function:
activity_stats(dat$vectormagnitude, ymd_hms(dat$timestamp))
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.