View source: R/clinical_events.R
tidy_clinical_events | R Documentation |
Data in a UK Biobank main dataset are stored in wide format, i.e. one row of data per UK Biobank participant (identified by 'eid'). Clinical events may be ascertained from numerous sources (e.g. self-reported medical conditions, linked hospital records), with coded events and their associated dates recorded across multiple columns. This function tidies these data into a standardised long format table.
tidy_clinical_events(
  ukb_main,
  ukb_data_dict = get_ukb_data_dict(),
  ukb_codings = get_ukb_codings(),
  clinical_events_sources = c("primary_death_icd10", "secondary_death_icd10",
    "self_report_medication", "self_report_non_cancer", "self_report_non_cancer_icd10",
    "self_report_cancer", "self_report_operation", "cancer_register_icd9",
    "cancer_register_icd10", "summary_hes_icd9", "summary_hes_icd10",
    "summary_hes_opcs3", "summary_hes_opcs4"),
  strict = TRUE,
  .details_only = FALSE
)
ukb_main |
A UK Biobank main dataset. |
ukb_data_dict |
The UKB data dictionary (available online at the UK Biobank data showcase). This should be a data frame where all columns are of type
|
ukb_codings |
The UKB codings file (available online at the UK Biobank data showcase). This should be a data frame where all columns are of type
|
clinical_events_sources |
A character vector of clinical events sources to tidy. By default, all available options are included. |
strict |
If |
.details_only |
If |
A named list of data frames is returned, with the names corresponding to the data sources specified by clinical_events_sources. Each data frame has the following columns:
eid - participant identifier.
source - the FieldID (prefixed by 'f') where clinical codes were extracted from. See clinical_events_sources for further details.
index - the corresponding instance and array (e.g. '0-1' means instance 0 and array 1).
code - clinical code. The type of clinical coding system used depends on source.
date - associated date. Note that in cases where participants self-reported a medical condition but recorded the date as either 'Date uncertain or unknown' or 'Preferred not to answer' (see data coding 13), the date is set to NA.
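Because self-reported events with an uncertain or withheld date come back with date set to NA, downstream code usually has to decide how to treat them. The following is a minimal sketch in base R, using a small made-up data frame with the columns listed above rather than real output; the values, and the assumption that date is stored as a Date, are illustrative only.

```r
# Hypothetical stand-in for one element of the returned list
# (e.g. clinical_events$self_report_non_cancer); all values are made up.
events <- data.frame(
  eid = c(1, 1, 2),
  source = c("f20002", "f20002", "f20002"),
  index = c("0-0", "0-1", "0-0"),
  code = c("1065", "1220", "1065"),
  date = as.Date(c("2005-06-01", NA, "2011-02-14")),
  stringsAsFactors = FALSE
)

# Separate events with a usable date from those where the date is NA
dated_events <- events[!is.na(events$date), ]
undated_events <- events[is.na(events$date), ]

nrow(dated_events)
nrow(undated_events)
```

Keeping the undated rows in a separate object, rather than dropping them, preserves the fact that the event was reported even though it cannot be placed in time.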
A named list of clinical events data frames. Results may be combined into a single data frame using bind_rows.
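As a rough illustration of combining the results, the sketch below builds a toy named list with the same shape as the return value and stacks it with dplyr::bind_rows(). The list contents are made up, and the .id column name clinical_events_source is a hypothetical choice for this example, not something the function produces.

```r
library(dplyr)

# Toy stand-in for the output of tidy_clinical_events(); all values are made up.
clinical_events <- list(
  summary_hes_icd10 = data.frame(
    eid = c(1, 2),
    source = c("f41270", "f41270"),
    index = c("0-0", "0-1"),
    code = c("I10", "E119"),
    date = c("2010-01-01", "2012-05-03"),
    stringsAsFactors = FALSE
  ),
  self_report_non_cancer = data.frame(
    eid = 1,
    source = "f20002",
    index = "0-0",
    code = "1065",
    date = NA_character_,
    stringsAsFactors = FALSE
  )
)

# Stack all sources into one long data frame; .id stores each list name
# in a new column so the originating source stays identifiable.
combined <- bind_rows(clinical_events, .id = "clinical_events_source")

names(combined)
nrow(combined)
```

Recording the list name through .id is optional, since the source column already carries the FieldID, but it keeps the human-readable source label alongside each row.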
Other clinical events: clinical_events_sources(), example_clinical_codes(), extract_phenotypes(), make_clinical_events_db()
# dummy UKB main dataset and metadata
dummy_ukb_main <- get_ukb_dummy("dummy_ukb_main.tsv")
dummy_ukb_data_dict <- get_ukb_dummy("dummy_Data_Dictionary_Showcase.tsv")
dummy_ukb_codings <- get_ukb_dummy("dummy_Codings.tsv")
# tidy clinical events in a UK Biobank main dataset
clinical_events <- tidy_clinical_events(
  ukb_main = dummy_ukb_main,
  ukb_data_dict = dummy_ukb_data_dict,
  ukb_codings = dummy_ukb_codings
)
# returns a named list of data frames, one for each `clinical_events_source`
names(clinical_events)
clinical_events$summary_hes_icd10
# use .details_only = TRUE to return details of required Field IDs for
# specific clinical_events sources
tidy_clinical_events(.details_only = TRUE)