reweight_accel: Reweight NHANES accelerometry data


View source: R/process_accel.R

Description

This function re-weights accelerometry data for the NHANES 2003-2004 and 2005-2006 waves.

Usage

reweight_accel(
  data,
  return_unadjusted_wts = TRUE,
  age_bks = c(0, 1, 3, 6, 12, 16, 20, 30, 40, 50, 60, 70, 80, 85, Inf),
  right = FALSE
)

Arguments

data

Data frame with survey weights to be re-weighted. It should not contain any duplicated participants; that is, each row of this data frame should correspond to a unique value of SEQN. The data frame supplied to data must have the columns SEQN, SDDSRVYR, WTMEC2YR, and WTINT2YR.

return_unadjusted_wts

Logical value indicating whether to return the unadjusted 2-year and, if applicable, 4-year survey weights for all participants.

age_bks

Vector of ages which define the intervals used for re-weighting. This argument is passed to the cut function to create age categories which are in turn used to re-weight participants. The argument "right" determines whether these intervals will be closed on the right or the left.

right

Logical value indicating whether the age intervals defined by the "age_bks" argument should be closed on the left (right=FALSE) or the right (right=TRUE). See cut for additional details and examples. Defaults to FALSE.
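To make the interval convention concrete, the short sketch below calls base R's cut directly on a few boundary ages. The ages and the shortened breaks vector are illustrative values chosen for this example, not the package defaults.

```r
# Illustrative ages (in years) sitting on or near the break points
ages <- c(49.9, 50, 59.9, 60)
bks  <- c(0, 50, 60, Inf)   # shortened set of breaks, for illustration only

# right = FALSE: intervals are closed on the left, e.g. [50,60)
left_closed  <- cut(ages, breaks = bks, right = FALSE)

# right = TRUE: intervals are closed on the right, e.g. (50,60]
right_closed <- cut(ages, breaks = bks, right = TRUE)

data.frame(ages, left_closed, right_closed)
```

An individual aged exactly 50 falls in [50,60) when right=FALSE but in (0,50] when right=TRUE, so the choice matters for anyone whose age sits on a break point.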

Details

The reweight_accel function is designed to re-weight only the 2003-2004 and 2005-2006 waves in the context of missing data. This function calculates 2-year and 4-year adjusted and unadjusted survey weights. The re-weighting is performed using age, sex, and ethnicity strata applied to each wave separately. More specifically, individuals in the data frame supplied to the function via the "data" argument are upweighted by a factor such that, within each stratum, the sum of their weights equals the total survey weight of that stratum in the population. If data are missing completely at random within each of these strata, then the re-weighted strata are representative of the corresponding strata in the larger study.
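The upweighting idea can be sketched with base R on toy data. This is only an illustration of per-stratum re-weighting under the description above, not the package's actual implementation; the data frame, the column names id, stratum, and wt, and all numbers are made up for the example.

```r
# Toy "full" sample: one row per participant, with a stratum label and a weight
full <- data.frame(
  id      = 1:6,
  stratum = c("A", "A", "A", "B", "B", "B"),
  wt      = c(100, 200, 300, 50, 150, 200)
)

# Suppose only these participants have usable data
sub <- full[full$id %in% c(1, 3, 4), ]

# Total weight per stratum in the full sample and in the subset
tot_full <- tapply(full$wt, full$stratum, sum)
tot_sub  <- tapply(sub$wt,  sub$stratum,  sum)

# Upweighting factor per stratum, applied to each retained participant
fac        <- tot_full / tot_sub[names(tot_full)]
sub$wt_adj <- sub$wt * as.numeric(fac[as.character(sub$stratum)])

# Adjusted weights in the subset now sum to the full-sample stratum totals
tapply(sub$wt_adj, sub$stratum, sum)
```

Note that the single retained participant in stratum B receives a factor of 8, which hints at how sparse strata can produce very large adjusted weights.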

Users who intend to use the adjusted weights calculated by this function should ensure that the data they re-weight align with the re-weighting strategy, particularly with regard to age. That is, it does not make sense to re-weight all individuals aged 58-60 to be representative of all individuals aged 50-60. The age categories used in re-weighting are controlled by the "age_bks" argument. The problems caused by misaligned ages are illustrated in the examples below. Moreover, the re-weighting is done separately for the interview and examination weights. Because there is a time lag between the interview and the exam, individuals may belong to different age strata for the purposes of re-weighting the interview and examination survey weights. Therefore, users need to make sure the ages in their data align with the survey weight they intend to use.
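As a rough illustration of the interview/exam lag, the sketch below assigns age strata twice for three hypothetical participants, once using age at interview and once using age at exam. The vectors age_int_months and age_exam_months and the shortened breaks are assumptions made for this example only.

```r
# Hypothetical ages in months for three participants
age_int_months  <- c(598, 610, 715)   # age at interview
age_exam_months <- c(601, 612, 721)   # age at exam, a few months later

bks <- c(0, 50, 60, Inf)  # shortened breaks, for illustration only

# Convert months to years and assign left-closed age strata
strata_int  <- cut(age_int_months  / 12, breaks = bks, right = FALSE)
strata_exam <- cut(age_exam_months / 12, breaks = bks, right = FALSE)

# TRUE marks participants whose stratum differs between interview and exam
changed <- as.character(strata_int) != as.character(strata_exam)
data.frame(strata_int, strata_exam, changed)
```

The first and third participants cross a break point between the interview and the exam, so a subset defined on exam age can leave only a handful of people in an interview-age stratum.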

If one or more strata are sparse, the adjusted survey weights in those strata may be extreme. Users should always inspect the adjusted survey weights for outliers.
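One simple way to inspect the weights is to compare each adjusted weight with its original value. The sketch below uses a made-up data frame that borrows the column names WTINT2YR and wtint2yr_adj; the numbers and the cutoff of 5 times the median ratio are arbitrary choices for illustration, not a recommendation from the package.

```r
# Toy data frame; the numbers are made up for illustration
wts <- data.frame(
  SEQN         = 1:5,
  WTINT2YR     = c(20000, 25000, 30000, 22000, 27000),
  wtint2yr_adj = c(26000, 31000, 39000, 29000, 540000)
)

# Ratio of adjusted to original weight for each participant
wts$ratio <- wts$wtint2yr_adj / wts$WTINT2YR
summary(wts$ratio)

# Flag ratios far above the typical value; the cutoff of 5 times the median
# is an arbitrary choice made for this sketch
flagged <- wts[wts$ratio > 5 * median(wts$ratio), ]
flagged
```

Participants flagged this way usually point to a stratum that is nearly empty in the supplied data.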

Value

The function reweight_accel will return a dataframe with the same columns as the data frame supplied to the "data" argument with either 8 or 16 additional columns. If the data supplied to the reweight_accel function only comes from one NHANES wave, then only the 2-year survey weights will be returned. If there are data from both the 2003-2004 and 2005-2006 waves supplied to the reweight_accel function, then both the 2-year and 4-year survey weights will be returned. Any time an analysis is done using the combined data, the appropriate 4-year survey weight should be used.

These survey weights are described below.

If any of the additional columns are already in the data frame supplied to the data argument, they will be overwritten and a warning will be printed to the console. This may occur when a user subsets their data multiple times and re-weights at each step.

Examples

## Not run: 
library("rnhanesdata")
set.seed(1241)
## load the 2003-2004 demographic data
data("Covariate_C")

## consider just those individuals between the ages in the interval [50,80)
## at the exam portion of the study
df50 <- subset(Covariate_C, RIDAGEEX/12 >= 50 & RIDAGEEX/12 <80)

## subsample 75% of these individuals, then re-weight the data
df50_sub <- df50[sample(seq_len(nrow(df50)), size = floor(nrow(df50) * 0.75), replace = FALSE), ]
df50_rw  <- reweight_accel(df50_sub)

## check that the unadjusted 2-year weights match the WTMEC2YR variable (should be 0)
sum(df50_rw$WTMEC2YR != df50_rw$wtmec2yr_unadj)

## See that the adjusted interview weights are massively inflated.
## This is because there are individuals who are in the [40,50) stratum during the interview
## but are in the [50,60) stratum for the exam. These few individuals are upweighted to
## "represent" all individuals in [50,60) during the interview, which clearly doesn't make sense.
summary(df50_rw$wtint2yr_adj)

## Subsetting the reweighted dataset





## End(Not run)

andrew-leroux/rnhanesdata documentation built on March 6, 2020, 11:35 p.m.