knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%", echo = TRUE, cache = FALSE, message = FALSE )
The arcstats package allows to generate summaries of the minute-level physical activity (PA) data. The default parameters are chosen for the Actigraph activity counts collected with a wrist-worn device; however, the package can be used for all minute-level PA data with the corresponding timestamps vector.
Below, we demonstrate the use of arcstats with the attached, exemplary minute-level Actigraph PA counts data.
You can install the released version of arcstats from GitHub. Note you may need to install devtools package if not yet installed (the line commented below).
# install.packages("devtools") devtools::install_github("martakarass/arcstats")
A PDF with detailed documentation of all methods can be accessed here.
arcstats package to compute physical activity summariesFour CSV data sets with minute-level activity counts data are attached to the arcstats package. The data file names are stored in extdata_fnames.
library(arcstats) library(data.table) library(dplyr) library(lubridate) library(ggplot2) ## Read one of the data sets fpath <- system.file("extdata", extdata_fnames[1], package = "arcstats") dat <- as.data.frame(fread(fpath)) rbind(head(dat, 3), tail(dat, 3))
The data columns are:
Axis1 - sensor's X axis minute-level counts data,Axis2 - sensor's Y axis minute-level counts data,Axis3 - sensor's Z axis minute-level counts data,vectormagnitude - minute-level counts data defined as sqrt(Axis1^2 + Axis2^2 + Axis3^2),timestamp - time-stamps corresponding to minute-level measures. ## Plot activity counts ggplot(dat, aes(x = ymd_hms(timestamp), y = vectormagnitude)) + geom_line(size = 0.3, alpha = 0.8) + labs(x = "Time", y = "Activity counts") + theme_gray(base_size = 10) + scale_x_datetime(date_breaks = "1 day", date_labels = "%b %d")
activity_stats methodacc <- dat$vectormagnitude acc_ts <- ymd_hms(dat$timestamp) activity_stats(acc, acc_ts)
library(knitr) library(kableExtra) tbl_styl_f <- function(x) kable_styling(x, font_size = 14)
out <- activity_stats(acc, acc_ts) # tbl_1 <- kable(out[, 1:6], table.attr = 'class="myTable"') %>% tbl_styl_f(.) tbl_1 <- kable(out[, 1:6]) %>% tbl_styl_f(.) tbl_2 <- kable(out[, 7:10]) %>% tbl_styl_f(.) tbl_3 <- kable(out[, 11:14]) %>% tbl_styl_f(.) tbl_1; tbl_2; tbl_3
To explain activity_stats method output, we define activity count, active/non-active minute, and active/non-active bout.
?get_wear_flag for wear/non-wear time detection details). Meta information:
n_days - number of days (unique day dates) of data collection.n_valid_days - number of days (unique day dates) of data collection determined as valid days. wear_time_on_valid_days - average number of wear-time minutes across valid days.Summaries of PA volumes metrics:
tac - TAC, Total activity counts per day - sum of AC measured on valid days divided by the number of valid days.tlac - TLAC, Total-log activity counts per day - sum of log(1+AC) measured on valid days divided by the number of valid days. Here 'log' denotes the natural logarithm.ltac - LTAC, Log-total activity counts - natural logarithm of TAC.time_spent_active - Average number of active minutes per valid day.time_spent_nonactive - Average number of sedentary minutes per valid day.Summaries of PA fragmentation metrics:
astp - ASTP, active to sedentary transition probability on valid days. satp - SATP, sedentary to active transition probability on valid days. no_of_active_bouts - Average number of active minutes per valid day.no_of_nonactive_bouts - Average number of sedentary minutes per valid day.mean_active_bout - Average duration (in minutes) of an active bout on valid days.mean_nonactive_bout - Average duration (in minutes) of a sedentary bout on valid days.activity_stats method optionsThe subset_minutes argument allows to specify subset of a day's minutes where activity summaries should be computed. There are 1440 minutes in a 24-hour day where 1 denotes 1st minute of the day (from 00:00 to 00:01), and 1440 denotes the last minute (from 23:59 to 00:00).
Here, we summarize PA observed between 12:00 AM and 6:00 AM.
subset_12am_6am <- 1 : (6 * 1440/24) activity_stats(acc, acc_ts, subset_minutes = subset_12am_6am)
out <- activity_stats(acc, acc_ts, subset_minutes = subset_12am_6am) tbl_1 <- kable(out[, 1:6]) %>% tbl_styl_f(.) tbl_2 <- kable(out[, 7:10]) %>% tbl_styl_f(.) tbl_3 <- kable(out[, 11:14]) %>% tbl_styl_f(.) tbl_1; tbl_2; tbl_3
By default, column names have a suffix added to denote that a subset of minutes was used (here, _0to6only). This can be disabled by setting adjust_out_colnames to FALSE.
subset_12am_6am = 1 : (6/24 * 1440) subset_6am_12pm = (6/24 * 1440 + 1) : (12/24 * 1440) subset_12pm_6pm = (12/24 * 1440 + 1) : (18/24 * 1440) subset_6pm_12am = (18/24 * 1440 + 1) : (24/24 * 1440) out <- rbind( activity_stats(acc, acc_ts, subset_minutes = subset_12am_6am, adjust_out_colnames = FALSE), activity_stats(acc, acc_ts, subset_minutes = subset_6am_12pm, adjust_out_colnames = FALSE), activity_stats(acc, acc_ts, subset_minutes = subset_12pm_6pm, adjust_out_colnames = FALSE), activity_stats(acc, acc_ts, subset_minutes = subset_6pm_12am, adjust_out_colnames = FALSE)) rownames(out) <- c("12am-6am", "6am-12pm", "12pm-6pm", "6pm-12am") out
tbl_1 <- kable(out[, 1:6]) %>% tbl_styl_f(.) tbl_2 <- kable(out[, 7:10]) %>% tbl_styl_f(.) tbl_3 <- kable(out[, 11:14]) %>% tbl_styl_f(.) tbl_1; tbl_2; tbl_3
The exclude_minutes argument allows to specify subset of a day's minutes excluded for computing activity summaries.
Here, we summarize PA while excluding observations between 11:00 PM and 5:00 AM.
subset_11pm_5am <- c( (23 * 1440/24 + 1) : 1440, ## 11:00 PM - midnight 1 : (5 * 1440/24) ## midnight - 5:00 AM ) activity_stats(acc, acc_ts, exclude_minutes = subset_11pm_5am)
out <- activity_stats(acc, acc_ts, exclude_minutes = subset_11pm_5am) tbl_1 <- kable(out[, 1:6]) %>% tbl_styl_f(.) tbl_2 <- kable(out[, 7:10]) %>% tbl_styl_f(.) tbl_3 <- kable(out[, 11:14]) %>% tbl_styl_f(.) tbl_1; tbl_2; tbl_3
The in_bed_time and out_bed_time arguments allow to provide day-specific in-bed periods to be excluded from analysis.
Here, we summarize PA excluding in-bed time estimated by ActiLife software.
The ActiLife-estimated in-bed data file is attached to the arcstats package. The sleep data columns relevant are,
Subject Name - subject IDs coressponding to AC data, stored in extdata_fnames,In Bed Time - ActiLife-estimated start of in-bed interval for each day of the measurement, Out Bed Time - ActiLife-estimated end of in-bed interval. ## Read sleep details data file SleepDetails_fname <- "BatchSleepExportDetails_2020-05-01_14-00-46.csv" SleepDetails_fpath <- system.file("extdata", SleepDetails_fname, package = "arcstats") SleepDetails <- as.data.frame(fread(SleepDetails_fpath)) ## Filter sleep details data to keep data correcponding to current counts data file SleepDetails_sub <- SleepDetails %>% filter(`Subject Name` == "ID_1") %>% mutate(`In Bed Time` = mdy_hms(`In Bed Time`), `Out Bed Time` = mdy_hms(`Out Bed Time`)) %>% select(`Subject Name`, `In Bed Time`, `Out Bed Time`) SleepDetails_sub
Finally, we use in/out-bed time POSIXct vectors in activity_stats method.
in_bed_time <- SleepDetails_sub[, "In Bed Time"] out_bed_time <- SleepDetails_sub[, "Out Bed Time"] activity_stats(acc, acc_ts, in_bed_time = in_bed_time, out_bed_time = out_bed_time)
out <- activity_stats(acc, acc_ts, in_bed_time = in_bed_time, out_bed_time = out_bed_time) tbl_1 <- kable(out[, 1:6]) %>% tbl_styl_f(.) tbl_2 <- kable(out[, 7:10]) %>% tbl_styl_f(.) tbl_3 <- kable(out[, 11:14]) %>% tbl_styl_f(.) tbl_1; tbl_2; tbl_3
activity_stats methodThe primary method activity_stats is composed of several steps implemented in their respective functions. Below, we demonstrate how to produce activity_stats results step by step with these functions.
We reuse the objects:
acc - numeric vector; minute-level activity counts data,acc_ts - POSIXct vector; minute-level time of acc data collection. df <- data.frame(acc = acc, acc_ts = acc_ts) rbind(head(df, 3), tail(df, 3))
midnight_to_midnight00:00-00:01 on the first day of data collection, and last observation corresponds to minute of 23:50-00:00 on the last day of data collection. NA.Here, collected data cover total of 7*24*1440 = 10080 minutes (from 2018-07-13 10:00:00 to 2018-07-20 09:59:00), but spans 8*24*1440 = 11520 minutes of full midnight-to-midnight days (from 2018-07-13 00:00:00 to 2018-07-20 23:59:00).
acc <- midnight_to_midnight(acc = acc, acc_ts = acc_ts) ## Vector length on non NA-obs, vector length after acc c(length(acc[!is.na(acc)]), length(acc))
get_wear_flagFunction get_wear_flag computes wear/non-wear flag (1/0) for each minute of activity counts data. Method implements wear/non-wear detection algorithm proposed by Choi et al. (2011).
1 for wear and 0 for non-wear flagged minute.NA entry in a data input vector, then the returned vector will have a corresponding entry set to NA too.wear_flag <- get_wear_flag(acc) ## Proportion of wear time across the days wear_flag_mat <- matrix(wear_flag, ncol = 1440, byrow = TRUE) round(apply(wear_flag_mat, 1, sum, na.rm = TRUE) / 1440, 3)
get_valid_day_flagFunction get_valid_day_flag computes valid/non-valid day flag (1/0) for each minute of activity counts data.
Here, 4 out of 8 days have more that 10% (144 minutes) of missing data.
valid_day_flag <- get_valid_day_flag(wear_flag) ## Compute number of valid days valid_day_flag_mat <- matrix(valid_day_flag, ncol = 1440, byrow = TRUE) apply(valid_day_flag_mat, 1, mean, na.rm = TRUE)
impute_missing_dataHere, all four valid days have 100% of wear time. We hence demonstrate the impute_missing_data method by artificially replacing 1h (4%) of a valid day with non-wear and running impute_missing_data function then.
Function impute_missing_data imputes missing data from the "average day profile". An "average day profile" is computed as a minute-wise average of AC across valid days.
## Copies of original objects for the purpose of demonstration acc_cpy <- acc wear_flag_cpy <- wear_flag ## Artificially replace 1h (4%) of a valid day with non-wear repl_idx <- seq(from = 1441, by = 1, length.out = 60) acc_cpy[repl_idx] <- 0 wear_flag_cpy[repl_idx] <- 0 ## Impute data for minutes identified as non-wear in days identified as valid acc_cpy_imputed <- impute_missing_data(acc_cpy, wear_flag_cpy, valid_day_flag) ## Compare mean activity count on valid days before and after imputation c(mean(acc_cpy[which(valid_day_flag == 1)]), mean(acc_cpy_imputed[which(valid_day_flag == 1)]))
summarize_PAFinally, method summarize_PA computes physical all PA characteristics.
activity_stats, it returns a data frame with physical activity summaries. activity_stats, it accepts arguments to compute physical activity summaries subset_minutes),exclude_minutes),in_bed_time, out_bed_time).
See ?summarize_PA for more details. Here, we summarize PA with the default parameters, across all valid days.
summarize_PA(acc, acc_ts, wear_flag, valid_day_flag)
out <- summarize_PA(acc, acc_ts, wear_flag, valid_day_flag) tbl_1 <- kable(out[, 1:6]) %>% tbl_styl_f(.) tbl_2 <- kable(out[, 7:10]) %>% tbl_styl_f(.) tbl_3 <- kable(out[, 11:14]) %>% tbl_styl_f(.) tbl_1; tbl_2; tbl_3
It returns the same results as the activity_stats function:
activity_stats(dat$vectormagnitude, ymd_hms(dat$timestamp))
out <- activity_stats(dat$vectormagnitude, ymd_hms(dat$timestamp)) tbl_1 <- kable(out[, 1:6]) %>% tbl_styl_f(.) tbl_2 <- kable(out[, 7:10]) %>% tbl_styl_f(.) tbl_3 <- kable(out[, 11:14]) %>% tbl_styl_f(.) tbl_1; tbl_2; tbl_3
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.