View source: R/load-data.R View source: R/load-data copy.R
This function filters the study data and creates new variables from it. It takes csv files as input; each file lists the new variable names and the formulas that compute them from the variable names in the Stata file.
get.adjust(stata, mods, filts = NULL)
stata | Study (ARIC or HUNT) dataset in Stata format |
mods | List of paths to formatted csv files creating new adjustment variables. Format of csv is detailed below. Files should be titled |
filts | List containing the path (singular) to the formatted csv of the data filters to be applied to the dataset |
This function uses csv files to inject commands into the dplyr::mutate() and dplyr::filter() functions. The csv format needed to execute the desired commands properly is detailed below. The mods parameter is required and must be a list of path(s) to the csv files. The convention is to include one file with the outcome variables, including heart failure diagnosis variables and time-to-event variables, and another file with adjustment variables, such as demographics and clinical history. The filter csv should contain filtering conditions, such as excluding patients with prevalent heart failure; its format is also detailed below.
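To make the idea of injecting csv rows into dplyr::mutate() concrete, here is a minimal sketch of one way it could be done. It is not the package's actual implementation: the helper name apply_mods and the toy data are hypothetical, and the use of rlang::parse_expr() with `!!!` splicing is an assumption about a reasonable approach.

```r
library(dplyr)
library(rlang)

# Hypothetical helper (not part of the package): turn the Variable/Expression
# columns of a mods specification into named expressions and evaluate them
# inside dplyr::mutate().
apply_mods <- function(data, spec) {
  # Parse each Expression string into an R expression
  exprs <- lapply(spec$Expression, rlang::parse_expr)
  # Name each expression after the new variable it defines
  names(exprs) <- spec$Variable
  data %>%
    dplyr::mutate(!!!exprs) %>%               # create or overwrite the variables
    dplyr::select(dplyr::all_of(spec$Variable))  # keep only the listed variables
}

# Toy stand-in for the Stata data (values are made up for illustration)
toy <- tibble(id = 1:3, v5age51 = c(70, 75, 80), race = c(1, 0, 1))

# Toy stand-in for a mods csv that has already been read into a data frame
spec <- data.frame(
  Variable   = c("id", "age", "race"),
  Expression = c("id", "v5age51", "as.factor(race == 1)"),
  stringsAsFactors = FALSE
)

out <- apply_mods(toy, spec)
```

Because the expressions are evaluated in order, a later row can refer to a variable created by an earlier row.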
Tibble with data from all variables found in the mods csv files.
Files must include a header for three columns: Variable, Expression, and Label. Each subsequent row defines one new variable, its expression, and its label. Expressions are R code snippets that calculate variables. Assume the data in the tibble are attached, so columns can be referenced by bare name, the same way tidyverse functions treat their variables.
!! First row in adjustment variables must always be the study identifier for merging purposes. !!
Variable | Expression | Label |
id | id | ARIC COHORT STUDY ID |
age | v5age51 | Visit 5 Age |
bmi | v5_bmi51 | Visit 5 BMI |
race | as.factor(race == 1) | Subject Race (Black == 1) |
This table is coded in csv as:
Variable, Expression, Label
id, id, ARIC COHORT STUDY ID
age, v5age51, Visit 5 Age
bmi, v5_bmi51, Visit 5 BMI
race, as.factor(race == 1), Subject Race (Black == 1)
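As a quick check of the layout, the csv above can be read with base R. This is only a sketch: strip.white = TRUE is used here so that the spaces after the commas do not end up in the values, and whether get.adjust() trims whitespace itself is not stated in this documentation.

```r
# The csv text from above, inlined so the example is self-contained
csv_text <- "Variable, Expression, Label
id, id, ARIC COHORT STUDY ID
age, v5age51, Visit 5 Age
bmi, v5_bmi51, Visit 5 BMI
race, as.factor(race == 1), Subject Race (Black == 1)"

# Read the specification; strip.white removes the padding around each field
spec <- read.csv(text = csv_text, strip.white = TRUE, stringsAsFactors = FALSE)

# One row per new variable, with the study identifier first
print(spec$Variable)
```

Note that an expression containing a comma, such as a function call with two arguments, would need to be quoted to stay in a single csv field.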
Variable | Expression | Label |
hfdiag | !is.na(adjudhf_bwh) | Incident Heart Failure Dx |
fuptime | as.double(adjudhfdate - v5date51) | V5 HF Follow Up Time |
This table is coded in csv as:
Variable, Expression, Label
hfdiag, !is.na(adjudhf_bwh), Incident Heart Failure Dx
fuptime, as.double(adjudhfdate - v5date51), V5 HF Follow Up Time
!! The first row must be the primary outcome, and the last row must be the primary time-to-event variable. This quirk may be patched in coming versions.
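The two outcome expressions above can be tried directly on toy data to see what they produce. The values below are made up, and the sketch assumes the date columns are of class Date (as haven::read_dta() typically returns for Stata dates), in which case the difference is measured in days.

```r
library(dplyr)

# Made-up rows standing in for the study data
toy <- tibble(
  adjudhf_bwh = c(1, NA, 1),
  adjudhfdate = as.Date(c("2015-03-01", NA, "2013-06-01")),
  v5date51    = as.Date(c("2012-03-01", "2012-06-15", "2013-01-01"))
)

outcomes <- toy %>%
  mutate(
    # TRUE when an adjudicated heart failure event is recorded
    hfdiag  = !is.na(adjudhf_bwh),
    # Days between the visit 5 date and the adjudicated event date
    fuptime = as.double(adjudhfdate - v5date51)
  )
```

In this toy data the subject without an event has a missing event date, so fuptime is NA for that row.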
These files filter data according to exclusion criteria. They are most often used to remove subjects with prevalent heart failure.
Files must include headers for three columns: Variable, Operator, and Expression. These headers represent the variable to filter on, its filtering operator (==, >, >=, <, <=, etc.), and the expression it is compared against, respectively.
Variable | Operator | Expression |
v5_prevhf52 | == | FALSE |
fuptime | > | 0 |
This table is coded in csv as:
Variable, Operator, Expression
v5_prevhf52, ==, FALSE
fuptime, >, 0
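The sketch below shows one way rows like these could be turned into dplyr::filter() conditions. It is an illustration under stated assumptions, not the package's actual code: the helper name apply_filters and the toy data are hypothetical, and pasting the three columns into a condition string before parsing it is just one plausible approach.

```r
library(dplyr)
library(rlang)

# Hypothetical helper (not part of the package): build one condition per row
# as "Variable Operator Expression" and pass them all to dplyr::filter().
apply_filters <- function(data, spec) {
  conds <- paste(spec$Variable, spec$Operator, spec$Expression)
  exprs <- lapply(conds, rlang::parse_expr)
  # Multiple conditions in filter() are combined with a logical AND
  dplyr::filter(data, !!!exprs)
}

# Made-up rows standing in for the study data
toy <- tibble(
  id          = 1:4,
  v5_prevhf52 = c(FALSE, TRUE, FALSE, FALSE),
  fuptime     = c(100, 250, 0, 30)
)

# Toy stand-in for the filter csv after it has been read into a data frame
filts <- data.frame(
  Variable   = c("v5_prevhf52", "fuptime"),
  Operator   = c("==", ">"),
  Expression = c("FALSE", "0"),
  stringsAsFactors = FALSE
)

kept <- apply_filters(toy, filts)
```

Here the subject with prevalent heart failure and the subject with zero follow-up time are both dropped.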
## Not run:
fifth.visit.study <-
  get.adjust(stata = haven::read_dta('data/ARICmaster_121820.dta'),
             mods = list('data/visit-five/outcomes.csv',
                         'data/visit-five/adjusted.csv'),
             filts = list('data/visit-five/filters.csv'))
fifth.visit.echo <-
  get.adjust(stata = haven::read_dta('data/ARICmaster_121820.dta'),
             mods = list('data/visit-five/echovars.csv'))
## End(Not run)