derive_ethnic_background_simplified: Derive simplified ethnic background
In rmgpanw/ukbwranglr: Exploring UKB Data

Description

Simplifies ethnic background in a UK Biobank main dataset to the main categories for Field ID 21000.

Usage

derive_ethnic_background_simplified(
  ukb_main,
  ukb_data_dict = get_ukb_data_dict(),
  ethnicity_levels = c("White", "Mixed", "Asian or Asian British",
    "Black or Black British", "Chinese", "Other ethnic group"),
  .drop = FALSE,
  .details_only = FALSE
)

Arguments

`ukb_main`	A UK Biobank main dataset.
`ukb_data_dict`	The UKB data dictionary (available online at the UK Biobank data showcase). This should be a data frame where all columns are of type `character`.
`ethnicity_levels`	The factor level order for the appended `ethnic_background_simplified` column. By default, the baseline level is set to "White" ethnicity.
`.drop`	If `TRUE`, remove the required input columns from the result.
`.details_only`	If `TRUE`, return a list containing details of required input variables (Field IDs) and derived variables (new column name, label and values/value labels).

Details

The categories "Do not know" and "Prefer not to answer" are converted to `NA`. A new column called `ethnic_background_simplified` of type factor is appended to the input data frame. By default, "White" ethnicity is set as the baseline level because it is the largest category. Levels can be specified explicitly using the `ethnicity_levels` argument.
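The behaviour described above can be sketched in base R. This is an illustration only, not the package's implementation: the `detailed` values and the hand-written `lookup` vector are assumptions made for the example, and only a few detailed categories are included.

```r
# Illustrative input: a handful of detailed ethnic background labels,
# including the two non-response categories.
detailed <- c("British", "Indian", "Caribbean", "Chinese",
              "Do not know", "Prefer not to answer")

# Hypothetical lookup from detailed label to main category
# (not the package's internal mapping).
lookup <- c(
  "British"   = "White",
  "Indian"    = "Asian or Asian British",
  "Caribbean" = "Black or Black British",
  "Chinese"   = "Chinese"
)

# Default level order from the Usage section; the first level is the baseline.
ethnicity_levels <- c("White", "Mixed", "Asian or Asian British",
                      "Black or Black British", "Chinese", "Other ethnic group")

# Labels absent from the lookup (the two non-response categories) become NA.
simplified <- factor(unname(lookup[detailed]), levels = ethnicity_levels)

levels(simplified)[1]               # baseline level
table(simplified, useNA = "ifany")  # counts per main category, plus NA
```

Because the baseline is simply the first factor level, passing a reordered vector as `ethnicity_levels` changes the reference category without changing the values themselves.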

Value

A data frame with a column called `ethnic_background_simplified` (type factor).

Examples

library(magrittr)
# dummy UKB data and data dictionary
dummy_ukb_data_dict <- get_ukb_dummy("dummy_Data_Dictionary_Showcase.tsv")
dummy_ukb_codings <- get_ukb_dummy("dummy_Codings.tsv")

dummy_ukb_main <- read_ukb(
  path = get_ukb_dummy("dummy_ukb_main.tsv", path_only = TRUE),
  ukb_data_dict = dummy_ukb_data_dict,
  ukb_codings = dummy_ukb_codings
)

# derive ethnic background
derive_ethnic_background_simplified(
  ukb_main = dummy_ukb_main,
  ukb_data_dict = dummy_ukb_data_dict
) %>%
  dplyr::select(tidyselect::contains("ethnic"))
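
A further sketch, showing how a non-default baseline might be requested through `ethnicity_levels`. It assumes the ukbwranglr package is installed and that a reordering of the six default levels is accepted; the `custom_levels` and `result` names are introduced here for illustration only.

```r
# Same six categories as the default, reordered so that a different
# category comes first (and therefore becomes the baseline level).
custom_levels <- c("Asian or Asian British", "White", "Mixed",
                   "Black or Black British", "Chinese", "Other ethnic group")

# Run only if ukbwranglr is available (assumption: installed locally).
if (requireNamespace("ukbwranglr", quietly = TRUE)) {
  library(ukbwranglr)

  # dummy UKB data and data dictionary, as in the example above
  dummy_ukb_data_dict <- get_ukb_dummy("dummy_Data_Dictionary_Showcase.tsv")
  dummy_ukb_codings <- get_ukb_dummy("dummy_Codings.tsv")

  dummy_ukb_main <- read_ukb(
    path = get_ukb_dummy("dummy_ukb_main.tsv", path_only = TRUE),
    ukb_data_dict = dummy_ukb_data_dict,
    ukb_codings = dummy_ukb_codings
  )

  result <- derive_ethnic_background_simplified(
    ukb_main = dummy_ukb_main,
    ukb_data_dict = dummy_ukb_data_dict,
    ethnicity_levels = custom_levels
  )

  # Inspect the level order of the derived column
  print(levels(result$ethnic_background_simplified))
}
```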

rmgpanw/ukbwranglr documentation built on April 30, 2024, 7:47 a.m.