View source: R/prepare_ukb_data.R
harmonize_ukb_data | R Documentation |
This function reads UK Biobank (UKB) files, including the main dataset (ukbxxxxx.tab) and the record-level (.txt) files from the data portal. It returns a list of 3 elements:
lst.data: event tables by episode, organized by source and by classification system
dfukb: the main dataset with the columns required for generating lst.data
vct.identifiers: a vector of participant identifiers from dfukb
harmonize_ukb_data(
f.ukbtab = NULL,
f.html = NULL,
dfDefinitions = NULL,
f.hesin = NULL,
f.hesin_diag = NULL,
f.hesin_oper = NULL,
f.death_portal = NULL,
f.death_cause_portal = NULL,
f.gp_clinical = NULL,
f.gp_scripts = NULL,
f.withdrawal_list = NULL,
allow_missing_fields = TRUE,
death_from_portal = TRUE,
add_extra_hesin_columns = F,
...
)
f.ukbtab |
Path to the main dataset (.tab) file |
f.html |
Path to the HTML file containing the metadata of the main dataset, which can be generated using the ukb utility |
dfDefinitions |
A processed and expanded definition table (data.table object), which can be generated by |
f.hesin |
Path to HESIN (master file), RECORD LEVEL DATA |
f.hesin_diag |
Path to HESIN_DIAG file containing diagnosis codes, RECORD LEVEL DATA |
f.hesin_oper |
Path to HESIN_OPER file containing operation and procedure codes, RECORD LEVEL DATA |
f.death_portal |
Path to file with DEATH table, RECORD LEVEL DATA |
f.death_cause_portal |
Path to file with DEATH_CAUSE table, RECORD LEVEL DATA |
f.gp_clinical |
Path to GP clinical event records, RECORD LEVEL DATA |
f.gp_scripts |
Path to GP prescription event records, RECORD LEVEL DATA |
f.withdrawal_list |
Path to participant withdrawal list (.csv) |
allow_missing_fields |
Logical flag specifying whether missing data fields are allowed (ignored) by the function. If FALSE, the function halts if any field is missing from the main dataset |
death_from_portal |
Logical flag specifying whether death records are read from the data portal files in addition to the main dataset. The main dataset is used if the data portal files are not present or not readable. |
add_extra_hesin_columns |
If TRUE, adds the extra columns "ins_index" and "source" |
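The behaviour described for allow_missing_fields and death_from_portal can be illustrated with a minimal sketch in base R. This is not the package's implementation: the helpers check_required_fields() and choose_death_source() are hypothetical names introduced here, the field names are placeholders, and requiring both portal files to be readable is an assumption.

```r
# Hypothetical helper: mimics the documented allow_missing_fields behaviour.
# Returns the requested fields that are available; halts on missing fields
# only when allow_missing_fields is FALSE.
check_required_fields <- function(required, available, allow_missing_fields = TRUE) {
  missing_fields <- setdiff(required, available)
  if (length(missing_fields) > 0 && !allow_missing_fields) {
    stop("Missing data field(s): ", paste(missing_fields, collapse = ", "))
  }
  intersect(required, available)
}

# Hypothetical helper: mimics the documented death_from_portal fallback.
# Assumption: both portal files must be readable, otherwise the main
# dataset is used.
choose_death_source <- function(f.death_portal = NULL,
                                f.death_cause_portal = NULL,
                                death_from_portal = TRUE) {
  is_readable <- function(path) {
    # file.access() returns 0 when the file is accessible in the given mode
    !is.null(path) && length(path) == 1 && file.access(path, mode = 4) == 0
  }
  if (death_from_portal &&
      is_readable(f.death_portal) &&
      is_readable(f.death_cause_portal)) {
    "portal"
  } else {
    "main"
  }
}

# Placeholder field names, for illustration only
check_required_fields(c("field_a", "field_b"), available = "field_a")
choose_death_source(NULL, NULL, death_from_portal = TRUE)
```

With the default allow_missing_fields = TRUE the first call silently drops the unavailable field, and the second call falls back to the main dataset because no portal files were supplied.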
main dataset as dataframe with only selected data fields
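# fukbtab, fhtml, fgp_clinical, etc. are user-defined paths to the corresponding input files
lst.harmonized.data <- harmonize_ukb_data(
  f.ukbtab = fukbtab,
  f.html = fhtml,
  f.gp_clinical = fgp_clinical,
  f.gp_scripts = fgp_scripts,
  f.hesin = fhesin,
  f.hesin_diag = fhesin_diag,
  f.hesin_oper = fhesin_oper,
  f.death_portal = fdeath_portal,
  f.death_cause_portal = fdeath_cause_portal
)
# Overview of the elements of the returned list
summary(lst.harmonized.data)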
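According to the description, the result is a list with the elements lst.data, dfukb and vct.identifiers. The sketch below shows how such a list might be inspected. It uses a mock object rather than real UK Biobank data, so its contents (an empty lst.data and a single hypothetical identifier column) are assumptions made only for illustration.

```r
# Mock object standing in for the return value of harmonize_ukb_data();
# only the three element names come from the documentation.
mock.harmonized.data <- list(
  lst.data = list(),                               # event tables (internal structure not shown here)
  dfukb = data.frame(identifier = c(1L, 2L, 3L)),  # hypothetical column name
  vct.identifiers = c(1L, 2L, 3L)
)

# Element names of the returned list
names(mock.harmonized.data)

# Compact overview of the top-level structure
str(mock.harmonized.data, max.level = 1)

# Number of participants covered by the identifier vector
length(mock.harmonized.data$vct.identifiers)
```

On a real result, the same calls apply to the object returned by harmonize_ukb_data().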