arcstats
package to compute physical activity
summaries
activity_stats
method
options
activity_stats
methodmidnight_to_midnight
get_wear_flag
get_valid_day_flag
impute_missing_data
summarize_PA
The arcstats
package allows to generate summaries of the minute-level
physical activity (PA) data. The default parameters are chosen for the
Actigraph activity counts collected with a wrist-worn device; however,
the package can be used for all minute-level PA data with the
corresponding timestamps vector.
Below, we demonstrate the use of arcstats
with the attached, exemplary
minute-level Actigraph PA counts data.
You can install the released version of arcstats from
GitHub. Note you may need to install devtools
package if not yet installed (the line commented below).
# install.packages("devtools")
devtools::install_github("martakarass/arcstats")
A PDF with detailed documentation of all methods can be accessed here.
arcstats
package to compute physical activity summariesFour CSV data sets with minute-level activity counts data are attached
to the arcstats
package. The data file names are stored in
extdata_fnames
.
library(arcstats)
library(data.table)
library(dplyr)
library(lubridate)
library(ggplot2)
## Read one of the data sets
fpath <- system.file("extdata", extdata_fnames[1], package = "arcstats")
dat <- as.data.frame(fread(fpath))
rbind(head(dat, 3), tail(dat, 3))
#> Axis1 Axis2 Axis3 vectormagnitude timestamp
#> 1 1021 1353 2170 2754 2018-07-13 10:00:00
#> 2 1656 1190 2212 3009 2018-07-13 10:01:00
#> 3 2540 1461 1957 3524 2018-07-13 10:02:00
#> 10078 0 0 0 0 2018-07-20 09:57:00
#> 10079 0 0 0 0 2018-07-20 09:58:00
#> 10080 0 0 0 0 2018-07-20 09:59:00
The data columns are:
Axis1
- sensor’s X axis minute-level counts data,Axis2
- sensor’s Y axis minute-level counts data,Axis3
- sensor’s Z axis minute-level counts data,vectormagnitude
- minute-level counts data defined as
sqrt(Axis1^2 + Axis2^2 + Axis3^2)
,timestamp
- time-stamps corresponding to minute-level measures.## Plot activity counts
ggplot(dat, aes(x = ymd_hms(timestamp), y = vectormagnitude)) +
geom_line(size = 0.3, alpha = 0.8) +
labs(x = "Time", y = "Activity counts") +
theme_gray(base_size = 10) +
scale_x_datetime(date_breaks = "1 day", date_labels = "%b %d")
activity_stats
methodacc <- dat$vectormagnitude
acc_ts <- ymd_hms(dat$timestamp)
activity_stats(acc, acc_ts)
n\_days
n\_valid\_days
wear\_time\_on\_valid\_days
tac
tlac
ltac
8
4
1440
2826648
6429.838
14.8546
astp
satp
time\_spent\_active
time\_spent\_nonactive
0.1781782
0.0951621
499.5
940.5
no\_of\_active\_bouts
no\_of\_nonactive\_bouts
mean\_active\_bout
mean\_nonactive\_bout
89
89.5
5.61236
10.50838
To explain activity_stats
method output, we define activity count,
active/non-active minute, and active/non-active bout.
?get_wear_flag
for wear/non-wear time detection details).Meta information:
n_days
- number of days (unique day dates) of data collection.n_valid_days
- number of days (unique day dates) of data
collection determined as valid days.wear_time_on_valid_days
- average number of wear-time minutes
across valid days.Summaries of PA volumes metrics:
tac
- TAC, Total activity counts per day - sum of AC measured on
valid days divided by the number of valid days.tlac
- TLAC, Total-log activity counts per day - sum of log(1+AC)
measured on valid days divided by the number of valid days. Here
‘log’ denotes the natural logarithm.ltac
- LTAC, Log-total activity counts - natural logarithm of TAC.time_spent_active
- Average number of active minutes per valid
day.time_spent_nonactive
- Average number of sedentary minutes per
valid day.Summaries of PA fragmentation metrics:
astp
- ASTP, active to sedentary transition probability on valid
days.satp
- SATP, sedentary to active transition probability on valid
days.no_of_active_bouts
- Average number of active minutes per valid
day.no_of_nonactive_bouts
- Average number of sedentary minutes per
valid day.mean_active_bout
- Average duration (in minutes) of an active bout
on valid days.mean_nonactive_bout
- Average duration (in minutes) of a sedentary
bout on valid days.activity_stats
method optionsThe subset_minutes
argument allows to specify subset of a day’s
minutes where activity summaries should be computed. There are 1440
minutes in a 24-hour day where 1
denotes 1st minute of the day (from
00:00 to 00:01), and 1440
denotes the last minute (from 23:59 to
00:00).
Here, we summarize PA observed between 12:00 AM and 6:00 AM.
subset_12am_6am <- 1 : (6 * 1440/24)
activity_stats(acc, acc_ts, subset_minutes = subset_12am_6am)
n\_days
n\_valid\_days
wear\_time\_on\_valid\_days
tac\_0to6only
tlac\_0to6only
ltac\_0to6only
8
4
1440
65477.5
322.1523
11.08946
astp\_0to6only
satp\_0to6only
time\_spent\_active\_0to6only
time\_spent\_nonactive\_0to6only
0.5581395
0.0200429
10.75
349.25
no\_of\_active\_bouts\_0to6only
no\_of\_nonactive\_bouts\_0to6only
mean\_active\_bout\_0to6only
mean\_nonactive\_bout\_0to6only
6
7
1.791667
49.89286
By default, column names have a suffix added to denote that a subset of
minutes was used (here, _0to6only
). This can be disabled by setting
adjust_out_colnames
to FALSE
.
subset_12am_6am = 1 : (6/24 * 1440)
subset_6am_12pm = (6/24 * 1440 + 1) : (12/24 * 1440)
subset_12pm_6pm = (12/24 * 1440 + 1) : (18/24 * 1440)
subset_6pm_12am = (18/24 * 1440 + 1) : (24/24 * 1440)
out <- rbind(
activity_stats(acc, acc_ts, subset_minutes = subset_12am_6am, adjust_out_colnames = FALSE),
activity_stats(acc, acc_ts, subset_minutes = subset_6am_12pm, adjust_out_colnames = FALSE),
activity_stats(acc, acc_ts, subset_minutes = subset_12pm_6pm, adjust_out_colnames = FALSE),
activity_stats(acc, acc_ts, subset_minutes = subset_6pm_12am, adjust_out_colnames = FALSE))
rownames(out) <- c("12am-6am", "6am-12pm", "12pm-6pm", "6pm-12am")
out
n\_days
n\_valid\_days
wear\_time\_on\_valid\_days
tac
tlac
ltac
12am-6am
8
4
1440
65477.5
322.1523
11.08946
6am-12pm
8
4
1440
1089788.5
2139.4534
13.90149
12pm-6pm
8
4
1440
994104.8
2194.8539
13.80960
6pm-12am
8
4
1440
677277.5
1773.3781
13.42584
astp
satp
time\_spent\_active
time\_spent\_nonactive
12am-6am
0.5581395
0.0200429
10.75
349.25
6am-12pm
0.1501377
0.1540616
181.50
178.50
12pm-6pm
0.1751337
0.1864162
187.00
173.00
6pm-12am
0.2037422
0.1032325
120.25
239.75
no\_of\_active\_bouts
no\_of\_nonactive\_bouts
mean\_active\_bout
mean\_nonactive\_bout
12am-6am
6.00
7.00
1.791667
49.892857
6am-12pm
27.25
27.50
6.660551
6.490909
12pm-6pm
32.75
32.25
5.709924
5.364341
6pm-12am
24.50
24.75
4.908163
9.686869
The exclude_minutes
argument allows to specify subset of a day’s
minutes excluded for computing activity summaries.
Here, we summarize PA while excluding observations between 11:00 PM and 5:00 AM.
subset_11pm_5am <- c(
(23 * 1440/24 + 1) : 1440, ## 11:00 PM - midnight
1 : (5 * 1440/24) ## midnight - 5:00 AM
)
activity_stats(acc, acc_ts, exclude_minutes = subset_11pm_5am)
n\_days
n\_valid\_days
wear\_time\_on\_valid\_days
tac\_23to5removed
tlac\_23to5removed
ltac\_23to5removed
8
4
1440
2735749
6052.84
14.82192
astp\_23to5removed
satp\_23to5removed
time\_spent\_active\_23to5removed
time\_spent\_nonactive\_23to5removed
0.1702018
0.1395057
483.25
596.75
no\_of\_active\_bouts\_23to5removed
no\_of\_nonactive\_bouts\_23to5removed
mean\_active\_bout\_23to5removed
mean\_nonactive\_bout\_23to5removed
82.25
83.25
5.87538
7.168168
The in_bed_time
and out_bed_time
arguments allow to provide
day-specific in-bed periods to be excluded from analysis.
Here, we summarize PA excluding in-bed time estimated by ActiLife software.
The ActiLife-estimated in-bed data file is attached to the arcstats
package. The sleep data columns relevant are,
Subject Name
- subject IDs coressponding to AC data, stored in
extdata_fnames
,In Bed Time
- ActiLife-estimated start of in-bed interval for each
day of the measurement,Out Bed Time
- ActiLife-estimated end of in-bed interval. ## Read sleep details data file
SleepDetails_fname <- "BatchSleepExportDetails_2020-05-01_14-00-46.csv"
SleepDetails_fpath <- system.file("extdata", SleepDetails_fname, package = "arcstats")
SleepDetails <- as.data.frame(fread(SleepDetails_fpath))
## Filter sleep details data to keep data correcponding to current counts data file
SleepDetails_sub <-
SleepDetails %>%
filter(`Subject Name` == "ID_1") %>%
mutate(`In Bed Time` = mdy_hms(`In Bed Time`),
`Out Bed Time` = mdy_hms(`Out Bed Time`)) %>%
select(`Subject Name`, `In Bed Time`, `Out Bed Time`)
SleepDetails_sub
#> Subject Name In Bed Time Out Bed Time
#> 1 ID_1 2018-07-13 21:18:00 2018-07-14 04:50:00
#> 2 ID_1 2018-07-14 22:41:00 2018-07-15 05:40:00
#> 3 ID_1 2018-07-16 19:46:00 2018-07-17 04:32:00
#> 4 ID_1 2018-07-17 23:30:00 2018-07-18 06:32:00
#> 5 ID_1 2018-07-18 22:16:00 2018-07-19 07:17:00
#> 6 ID_1 2018-07-19 22:30:00 2018-07-20 06:40:00
Finally, we use in/out-bed time POSIXct
vectors in activity_stats
method.
in_bed_time <- SleepDetails_sub[, "In Bed Time"]
out_bed_time <- SleepDetails_sub[, "Out Bed Time"]
activity_stats(acc, acc_ts, in_bed_time = in_bed_time, out_bed_time = out_bed_time)
n\_days
n\_valid\_days
wear\_time\_on\_valid\_days
tac\_inbedremoved
tlac\_inbedremoved
ltac\_inbedremoved
8
4
1440
2746582
6062.753
14.82587
astp\_inbedremoved
satp\_inbedremoved
time\_spent\_active\_inbedremoved
time\_spent\_nonactive\_inbedremoved
0.1703551
0.1580934
485.75
529.75
no\_of\_active\_bouts\_inbedremoved
no\_of\_nonactive\_bouts\_inbedremoved
mean\_active\_bout\_inbedremoved
mean\_nonactive\_bout\_inbedremoved
82.75
83.75
5.870091
6.325373
activity_stats
methodThe primary method activity_stats
is composed of several steps
implemented in their respective functions. Below, we demonstrate how to
produce activity_stats
results step by step with these functions.
We reuse the objects:
acc
- numeric vector; minute-level activity counts data,acc_ts
- POSIXct vector; minute-level time of acc
data
collection.df <- data.frame(acc = acc, acc_ts = acc_ts)
rbind(head(df, 3), tail(df, 3))
#> acc acc_ts
#> 1 2754 2018-07-13 10:00:00
#> 2 3009 2018-07-13 10:01:00
#> 3 3524 2018-07-13 10:02:00
#> 10078 0 2018-07-20 09:57:00
#> 10079 0 2018-07-20 09:58:00
#> 10080 0 2018-07-20 09:59:00
midnight_to_midnight
00:00-00:01
on the first day of data collection, and last
observation corresponds to minute of 23:50-00:00
on the last day
of data collection.NA
.Here, collected data cover total of 7*24*1440 = 10080
minutes (from
2018-07-13 10:00:00
to 2018-07-20 09:59:00
), but spans 8*24*1440
= 11520
minutes of full midnight-to-midnight days (from
2018-07-13 00:00:00
to 2018-07-20 23:59:00
).
acc <- midnight_to_midnight(acc = acc, acc_ts = acc_ts)
## Vector length on non NA-obs, vector length after acc
c(length(acc[!is.na(acc)]), length(acc))
#> [1] 10080 11520
get_wear_flag
Function get_wear_flag
computes wear/non-wear flag (1/0
) for each
minute of activity counts data. Method implements wear/non-wear
detection algorithm proposed by Choi et al. (2011).
1
for wear and 0
for non-wear
flagged minute.NA
entry in a data input vector, then the returned
vector will have a corresponding entry set to NA
too.wear_flag <- get_wear_flag(acc)
## Proportion of wear time across the days
wear_flag_mat <- matrix(wear_flag, ncol = 1440, byrow = TRUE)
round(apply(wear_flag_mat, 1, sum, na.rm = TRUE) / 1440, 3)
#> [1] 0.583 1.000 0.874 0.679 1.000 1.000 1.000 0.338
get_valid_day_flag
Function get_valid_day_flag
computes valid/non-valid day flag (1/0
)
for each minute of activity counts data.
Here, 4 out of 8 days have more that 10% (144 minutes) of missing data.
valid_day_flag <- get_valid_day_flag(wear_flag)
## Compute number of valid days
valid_day_flag_mat <- matrix(valid_day_flag, ncol = 1440, byrow = TRUE)
apply(valid_day_flag_mat, 1, mean, na.rm = TRUE)
#> [1] 0 1 0 0 1 1 1 0
impute_missing_data
Here, all four valid days have 100% of wear time. We hence demonstrate
the impute_missing_data
method by artificially replacing 1h (4%) of a
valid day with non-wear and running impute_missing_data
function then.
Function impute_missing_data
imputes missing data from the “average
day profile”. An “average day profile” is computed as a minute-wise
average of AC across valid days.
## Copies of original objects for the purpose of demonstration
acc_cpy <- acc
wear_flag_cpy <- wear_flag
## Artificially replace 1h (4%) of a valid day with non-wear
repl_idx <- seq(from = 1441, by = 1, length.out = 60)
acc_cpy[repl_idx] <- 0
wear_flag_cpy[repl_idx] <- 0
## Impute data for minutes identified as non-wear in days identified as valid
acc_cpy_imputed <- impute_missing_data(acc_cpy, wear_flag_cpy, valid_day_flag)
## Compare mean activity count on valid days before and after imputation
c(mean(acc_cpy[which(valid_day_flag == 1)]),
mean(acc_cpy_imputed[which(valid_day_flag == 1)]))
#> [1] 1955.521 1957.186
summarize_PA
Finally, method summarize_PA
computes physical all PA characteristics.
activity_stats
, it returns a data frame with physical
activity summaries.activity_stats
, it accepts arguments to compute physical
activity summariessubset_minutes
),exclude_minutes
),in_bed_time
,
out_bed_time
). See ?summarize_PA
for more details.Here, we summarize PA with the default parameters, across all valid days.
summarize_PA(acc, acc_ts, wear_flag, valid_day_flag)
n\_days
n\_valid\_days
wear\_time\_on\_valid\_days
tac
tlac
ltac
8
4
1440
2826648
6429.838
14.8546
astp
satp
time\_spent\_active
time\_spent\_nonactive
0.1781782
0.0951621
499.5
940.5
no\_of\_active\_bouts
no\_of\_nonactive\_bouts
mean\_active\_bout
mean\_nonactive\_bout
89
89.5
5.61236
10.50838
It returns the same results as the activity_stats
function:
activity_stats(dat$vectormagnitude, ymd_hms(dat$timestamp))
n\_days
n\_valid\_days
wear\_time\_on\_valid\_days
tac
tlac
ltac
8
4
1440
2826648
6429.838
14.8546
astp
satp
time\_spent\_active
time\_spent\_nonactive
0.1781782
0.0951621
499.5
940.5
no\_of\_active\_bouts
no\_of\_nonactive\_bouts
mean\_active\_bout
mean\_nonactive\_bout
89
89.5
5.61236
10.50838
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.