process_nhanes: Process NHANES 2003-2006 Accelerometer Data

Description

Calculates a variety of physical activity variables from the time-series accelerometer data in NHANES 2003-2006. A data dictionary for the variables created is available here: https://vandomed.github.io/process_nhanes_dictionary.

Usage

process_nhanes(
  waves = 3,
  directory = getwd(),
  nci_methods = FALSE,
  brevity = 1,
  hourly_var = "cpm",
  hourly_wearmin = 0,
  hourly_normalize = FALSE,
  valid_days = 1,
  valid_wk_days = 0,
  valid_we_days = 0,
  int_cuts = c(100, 760, 2020, 5999),
  youth_mod_cuts = rep(int_cuts[3], 12),
  youth_vig_cuts = rep(int_cuts[4], 12),
  cpm_nci = FALSE,
  days_distinct = FALSE,
  nonwear_window = 60,
  nonwear_tol = 0,
  nonwear_tol_upper = 99,
  nonwear_nci = FALSE,
  weartime_minimum = 600,
  weartime_maximum = 1440,
  active_bout_length = 10,
  active_bout_tol = 0,
  mvpa_bout_tol_lower = 0,
  vig_bout_tol_lower = 0,
  active_bout_nci = FALSE,
  sed_bout_tol = 0,
  sed_bout_tol_maximum = int_cuts[2] - 1,
  artifact_thresh = 25000,
  artifact_action = 1,
  weekday_weekend = FALSE,
  return_form = "averages",
  write_csv = FALSE
)

Arguments

waves

Integer value for which wave of data to process. Choices are 1 for NHANES 2003-2004, 2 for NHANES 2005-2006, and 3 for both.

directory

Character string specifying directory in which to write .csv file, if write_csv = TRUE.

nci_methods

Logical value for whether to set all arguments so as to replicate the data processing methods used in the NCI's SAS programs. More specifically:

valid_days = 4

valid_wk_days = 0

valid_we_days = 0

int_cuts = c(100, 760, 2020, 5999)

youth_mod_cuts = c(1400, 1515, 1638, 1770, 1910, 2059, 2220, 2393, 2580, 2781, 3000, 3239)

youth_vig_cuts = c(3758, 3947, 4147, 4360, 4588, 4832, 5094, 5375, 5679, 6007, 6363, 6751)

cpm_nci = TRUE

days_distinct = TRUE

nonwear_window = 60

nonwear_tol = 2

nonwear_tol_upper = 100

nonwear_nci = TRUE

weartime_minimum = 600

weartime_maximum = 1440

active_bout_length = 10

active_bout_tol = 2

mvpa_bout_tol_lower = 0

vig_bout_tol_lower = 0

active_bout_nci = TRUE

sed_bout_tol = 0

sed_bout_tol_maximum = 759

artifact_thresh = 32767

artifact_action = 3

If TRUE, you can still specify non-default values for brevity, weekday_weekend, and return_form.

brevity

Integer value controlling the number of physical activity variables generated. Choices are 1 for basic indicators of physical activity volume, 2 for additional indicators of activity intensities, activity bouts, sedentary behavior, and peak activity, and 3 for additional hourly count averages.

hourly_var

Character string specifying what hourly activity variable to record, if brevity = 3. Choices are "counts", "cpm", "sed_min", "sed_bouted_10min", and "sed_breaks".

hourly_wearmin

Integer value specifying minimum number of wear time minutes needed during a given hour to record a value for the hourly activity variable.

hourly_normalize

Logical value for whether to normalize hourly activity by number of wear time minutes.

valid_days

Integer value specifying minimum number of valid days a participant must have to be considered valid for analysis.

valid_wk_days

Integer value specifying minimum number of valid weekdays a participant must have to be considered valid for analysis.

valid_we_days

Integer value specifying minimum number of valid weekend days a participant must have to be considered valid for analysis.
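
The three thresholds can be read as a joint inclusion rule. The following is a minimal sketch in base R, assuming that all three minimums must be met at once; the 7-day record and the threshold values are hypothetical, and this is not the package's internal code.

```r
# Hypothetical 7-day record: which days are valid, and which are weekend days
valid_day <- c(TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE)
weekend   <- c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)

# Hypothetical thresholds (not the function defaults)
valid_days    <- 4
valid_wk_days <- 3
valid_we_days <- 1

n_valid <- sum(valid_day)             # all valid days
n_wk    <- sum(valid_day & !weekend)  # valid weekdays
n_we    <- sum(valid_day & weekend)   # valid weekend days

# Assumption: a participant is kept only if every minimum is satisfied
include <- n_valid >= valid_days && n_wk >= valid_wk_days && n_we >= valid_we_days
include
```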

int_cuts

Numeric vector with four cutpoints from which five intensity ranges are derived. For example, int_cuts = c(100, 760, 2020, 5999) creates: 0-99 = intensity 1; 100-759 = intensity 2; 760-2019 = intensity 3; 2020-5998 = intensity 4; >= 5999 = intensity 5. Intensities 1-5 are typically viewed as sedentary, light, lifestyle, moderate, and vigorous, respectively.
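
The mapping from counts to intensity levels can be reproduced with base R's findInterval(). This is an illustrative sketch with hypothetical count values, not necessarily how the package classifies minutes internally.

```r
# Hypothetical minute-level counts, chosen to sit on either side of each cutpoint
counts   <- c(0, 99, 100, 759, 760, 2019, 2020, 5998, 5999, 12000)
int_cuts <- c(100, 760, 2020, 5999)

# findInterval() returns how many cutpoints are <= each count (0 to 4),
# so adding 1 gives intensity levels 1 to 5
intensity <- findInterval(counts, int_cuts) + 1

data.frame(counts, intensity)
```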

youth_mod_cuts

Integer vector of 12 count cutpoints for classifying moderate physical activity in youth, for ages 6, 7, ..., 17. To replicate the NCI's SAS programs, set youth_mod_cuts = c(1400, 1515, 1638, 1770, 1910, 2059, 2220, 2393, 2580, 2781, 3000, 3239).

youth_vig_cuts

Integer vector of 12 count cutpoints for classifying vigorous physical activity in youth, for ages 6, 7, ..., 17. To replicate the NCI's SAS programs, set youth_vig_cuts = c(3758, 3947, 4147, 4360, 4588, 4832, 5094, 5375, 5679, 6007, 6363, 6751).
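
Because both vectors are ordered by age from 6 to 17, element i corresponds to age i + 5. A small lookup sketch, assuming that ordering; cut_for_age is a hypothetical helper and not part of the package.

```r
youth_mod_cuts <- c(1400, 1515, 1638, 1770, 1910, 2059, 2220, 2393, 2580,
                    2781, 3000, 3239)
youth_vig_cuts <- c(3758, 3947, 4147, 4360, 4588, 4832, 5094, 5375, 5679,
                    6007, 6363, 6751)

# Hypothetical helper: return the cutpoint(s) for the given age(s), 6 to 17
cut_for_age <- function(age, cuts) {
  stopifnot(length(cuts) == 12, all(age >= 6 & age <= 17))
  cuts[age - 5]
}

cut_for_age(10, youth_mod_cuts)        # 1910
cut_for_age(c(6, 17), youth_vig_cuts)  # 3758 6751
```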

cpm_nci

Logical value for whether to calculate average counts per minute by dividing average daily counts by average daily wear time, as opposed to taking the average of each day's counts per minute value. Leaving this as FALSE is strongly recommended unless you wish to replicate the NCI's SAS programs.
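
The two calculations generally give different answers when wear time varies across days. A minimal sketch with hypothetical per-day totals for one participant:

```r
# Hypothetical per-day totals for one participant with three valid days
daily_counts   <- c(300000, 150000, 240000)
daily_wear_min <- c(900, 600, 800)

# cpm_nci = FALSE: average of each day's counts per minute value
cpm_mean_of_days <- mean(daily_counts / daily_wear_min)

# cpm_nci = TRUE: average daily counts divided by average daily wear time
cpm_ratio_of_means <- mean(daily_counts) / mean(daily_wear_min)

c(mean_of_days = cpm_mean_of_days, ratio_of_means = cpm_ratio_of_means)
```

The second form implicitly gives more weight to days with longer wear time, which is why the two values diverge here.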

days_distinct

Logical value for whether to treat each day of data as distinct, as opposed to analyzing the entire monitoring period as one continuous segment.

nonwear_window

Integer value specifying minimum length of a non-wear period.

nonwear_tol

Integer value specifying tolerance for non-wear algorithm, i.e. number of minutes with non-zero counts allowed during a non-wear interval.

nonwear_tol_upper

Integer value specifying maximum count value for a minute with non-zero counts during a non-wear interval.
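
To illustrate the idea behind the window, the sketch below flags non-wear time in the simplest case, nonwear_tol = 0, where a non-wear period is a run of at least nonwear_window consecutive zero-count minutes. flag_nonwear is a hypothetical helper; it ignores the tolerance arguments and is not the package's or the NCI's algorithm.

```r
# Hypothetical helper: flag minutes belonging to runs of >= window zero counts.
# Simplified: no non-zero minutes are tolerated (as with nonwear_tol = 0).
flag_nonwear <- function(counts, window = 60) {
  runs <- rle(counts == 0)
  nonwear_run <- runs$values & runs$lengths >= window
  rep(nonwear_run, times = runs$lengths)
}

# Hypothetical data: 70 zero minutes, 2 active minutes, 30 zero minutes, 1 active
counts  <- c(rep(0, 70), 500, 1200, rep(0, 30), 800)
nonwear <- flag_nonwear(counts, window = 60)

sum(nonwear)   # minutes flagged as non-wear (only the 70-minute run qualifies)
sum(!nonwear)  # minutes treated as wear time
```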

nonwear_nci

Logical value for whether to use non-wear algorithm from NCI's SAS programs.

weartime_minimum

Integer value specifying minimum number of wear time minutes for a day to be considered valid.

weartime_maximum

Integer value specifying maximum number of wear time minutes for a day to be considered valid. The default is 1440, but you may want to use a lower value (e.g. 1200) if participants were instructed to remove devices for sleeping, but often did not.
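
Taken together, weartime_minimum and weartime_maximum define a window of acceptable daily wear time. A minimal sketch with hypothetical wear times, assuming both bounds are inclusive:

```r
# Hypothetical daily wear time (minutes) for four days
daily_wear_min   <- c(480, 720, 1440, 1010)
weartime_minimum <- 600
weartime_maximum <- 1200  # lower than the default of 1440, for illustration

# Assumption: both bounds are inclusive
valid_day <- daily_wear_min >= weartime_minimum &
  daily_wear_min <= weartime_maximum
valid_day
```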

active_bout_length

Integer value specifying minimum length of an active bout.

active_bout_tol

Integer value specifying number of minutes with counts outside the required range to allow during an active bout. If non-zero and active_bout_nci = FALSE, specifying non-zero values for mvpa_bout_tol_lower and vig_bout_tol_lower is highly recommended. Otherwise, minutes immediately before and after an active bout will tend to be classified as part of the bout.

mvpa_bout_tol_lower

Integer value specifying lower cut-off for count values outside of required intensity range for an MVPA bout.

vig_bout_tol_lower

Integer value specifying lower cut-off for count values outside of required intensity range for a vigorous bout.

active_bout_nci

Logical value for whether to use algorithm from the NCI's SAS programs for classifying active bouts.

sed_bout_tol

Integer value specifying number of minutes with counts outside sedentary range to allow during a sedentary bout.

sed_bout_tol_maximum

Integer value specifying upper cut-off for count values outside sedentary range during a sedentary bout.

artifact_thresh

Integer value specifying the smallest count value that should be considered an artifact.

artifact_action

Integer value controlling method of correcting artifacts. Choices are 1 to exclude days with one or more artifacts, 2 to lump artifacts into non-wear time, 3 to replace artifacts with the average of neighboring count values, and 4 to take no action.
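
As an illustration of option 3, the sketch below replaces each artifact with the average of the nearest non-artifact counts on either side. replace_artifacts is a hypothetical helper; the package's actual handling of consecutive artifacts or artifacts at the ends of the series may differ, and the sketch assumes at least one non-artifact value exists.

```r
# Hypothetical helper: replace counts >= thresh with the mean of the nearest
# non-artifact values before and after (only one side is used at the ends)
replace_artifacts <- function(counts, thresh = 25000) {
  is_artifact <- counts >= thresh
  good <- which(!is_artifact)
  out <- counts
  for (i in which(is_artifact)) {
    before <- good[good < i]
    after  <- good[good > i]
    neighbors <- c(
      if (length(before) > 0) counts[max(before)],
      if (length(after) > 0) counts[min(after)]
    )
    out[i] <- mean(neighbors)
  }
  out
}

counts <- c(120, 340, 32767, 400, 0, 15)
replace_artifacts(counts, thresh = 32767)  # 120 340 370 400 0 15
```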

weekday_weekend

Logical value for whether to calculate averages for weekdays and weekend days separately (in addition to all valid days).

return_form

Character string controlling how variables are returned. Choices are "daily" for per-day summaries, "averages" for averages across all valid days, and "both" for a list containing both.

write_csv

Logical value for whether to write the results to a .csv file in directory.

Details

As an alternative to using this function programmatically, you can use the process_nhanes_app function to access a GUI. Just run process_nhanes_app() in R.

Value

Data frame or list of two data frames, depending on return_form.

References

Centers for Disease Control and Prevention (CDC). National Center for Health Statistics (NCHS). National Health and Nutrition Examination Survey Data. Hyattsville, MD: US Department of Health and Human Services, Centers for Disease Control and Prevention, 2003-6. https://wwwn.cdc.gov/nchs/nhanes/Default.aspx. Accessed Jan. 7, 2019.

National Cancer Institute. Risk factor monitoring and methods: SAS programs for analyzing NHANES 2003-2004 accelerometer data. Available at: http://riskfactor.cancer.gov/tools/nhanes_pam. Accessed Jan. 7, 2019.

Van Domelen, D.R. (2018) accelerometry: Functions for processing accelerometer data. R package version 3.1.2. http://CRAN.R-project.org/package=accelerometry.

Examples

# Process NHANES 2003-2006 data using default settings
nhanes1 <- process_nhanes()

# Process NHANES 2003-2004 with following non-default settings: require >= 4
# valid days, use 90- rather than 60-minute window for non-wear algorithm,
# and request averages across all days and for weekdays/weekends separately
nhanes2 <- process_nhanes(
  waves = 1,
  valid_days = 4,
  nonwear_window = 90,
  weekday_weekend = TRUE
)

# Process data according to methods used in NCI's SAS programs
youth_mod_cuts <- c(1400, 1515, 1638, 1770, 1910, 2059, 2220, 2393, 2580,
                    2781, 3000, 3239)
youth_vig_cuts <- c(3758, 3947, 4147, 4360, 4588, 4832, 5094, 5375, 5679,
                    6007, 6363, 6751)
nhanes3 <- process_nhanes(
  waves = 3,
  brevity = 2,
  valid_days = 4,
  youth_mod_cuts = youth_mod_cuts,
  youth_vig_cuts = youth_vig_cuts,
  cpm_nci = TRUE,
  days_distinct = TRUE,
  nonwear_tol = 2,
  nonwear_tol_upper = 100,
  nonwear_nci = TRUE,
  weartime_maximum = 1440,
  active_bout_tol = 2,
  active_bout_nci = TRUE,
  artifact_thresh = 32767,
  artifact_action = 3
)

# Repeat, but use nci_methods input for convenience
nhanes4 <- process_nhanes(
  waves = 3,
  brevity = 2,
  nci_methods = TRUE
)

# Results are identical
all.equal(nhanes3, nhanes4)

vandomed/nhanesaccel documentation built on Aug. 4, 2020, 5:22 p.m.