derive_ethnic_background_simplified: Derive simplified ethnic background

View source: R/derived_variables.R

derive_ethnic_background_simplified R Documentation

Derive simplified ethnic background

Description

Simplifies ethnic background in a UK Biobank main dataset to the main categories for Field ID 21000.

Usage

derive_ethnic_background_simplified(
  ukb_main,
  ukb_data_dict = get_ukb_data_dict(),
  ethnicity_levels = c("White", "Mixed", "Asian or Asian British",
    "Black or Black British", "Chinese", "Other ethnic group"),
  .drop = FALSE,
  .details_only = FALSE
)

Arguments

ukb_main

A UK Biobank main dataset.

ukb_data_dict

The UKB data dictionary (available online at the UK Biobank data showcase). This should be a data frame where all columns are of type character.

ethnicity_levels

The factor level order for the appended ethnic_background_simplified column. By default, the baseline level is set to "White" ethnicity.

.drop

If TRUE, remove the required input columns from the result.

.details_only

If TRUE, return a list containing details of required input variables (Field IDs) and derived variables (new column name, label and values/value labels).
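
The effect of the level order passed to ethnicity_levels can be illustrated with base R alone. The following is a minimal sketch, not the package's implementation: the vector example_values is made up for illustration, and only the default level names are taken from the Usage section above. The first level of a factor is the one that modelling functions such as lm() treat as the reference category under the default treatment contrasts.

```r
# Default level order, copied from the Usage section
ethnicity_levels <- c("White", "Mixed", "Asian or Asian British",
  "Black or Black British", "Chinese", "Other ethnic group")

# Hypothetical values, for illustration only
example_values <- c("Chinese", "White", "Mixed", "White")

# With the default order, "White" is the first (baseline) level
x_default <- factor(example_values, levels = ethnicity_levels)
levels(x_default)[1]

# Moving another category to the front changes the baseline
# without changing the underlying values
custom_levels <- c("Chinese", setdiff(ethnicity_levels, "Chinese"))
x_custom <- factor(example_values, levels = custom_levels)
levels(x_custom)[1]
```

Reordering levels only changes which category acts as the reference; the recorded value for each participant stays the same.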

Details

Categories "Do not know" and "Prefer not to answer" are converted to NA. A new column called ethnic_background_simplified of type factor is appended to the input data frame. By default, "White" is the baseline (first) level because it is the largest category. A different level order can be specified with the ethnicity_levels argument.
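
The simplification described above can be sketched in base R. This is an illustrative approximation, not the code used by the package: the lookup vector simplify_lookup is a hypothetical, deliberately incomplete mapping from a few detailed Field ID 21000 labels to their main categories, and raw_labels is made-up input.

```r
# Hypothetical, incomplete lookup: detailed label -> main category
simplify_lookup <- c(
  "White" = "White",
  "British" = "White",
  "Irish" = "White",
  "Indian" = "Asian or Asian British",
  "Caribbean" = "Black or Black British",
  "Chinese" = "Chinese",
  "Other ethnic group" = "Other ethnic group"
)

# Default level order, copied from the Usage section
ethnicity_levels <- c("White", "Mixed", "Asian or Asian British",
  "Black or Black British", "Chinese", "Other ethnic group")

# Made-up input labels, including the two non-response categories
raw_labels <- c("British", "Indian", "Do not know", "Chinese",
  "Prefer not to answer", "Caribbean")

# Labels missing from the lookup (here, the two non-response
# categories) are returned as NA by the named-vector lookup
simplified <- factor(
  unname(simplify_lookup[raw_labels]),
  levels = ethnicity_levels
)

# One row per input value, with the derived factor column appended
data.frame(
  ethnic_background = raw_labels,
  ethnic_background_simplified = simplified
)
```

Note that in this sketch any label absent from the lookup becomes NA, so a real implementation would need a complete mapping to avoid silently discarding valid categories.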

Value

A data frame with a column called ethnic_background_simplified (type factor).

Examples

library(magrittr)
# dummy UKB data and data dictionary
dummy_ukb_data_dict <- get_ukb_dummy("dummy_Data_Dictionary_Showcase.tsv")
dummy_ukb_codings <- get_ukb_dummy("dummy_Codings.tsv")

dummy_ukb_main <- read_ukb(
  path = get_ukb_dummy("dummy_ukb_main.tsv", path_only = TRUE),
  ukb_data_dict = dummy_ukb_data_dict,
  ukb_codings = dummy_ukb_codings
)

# derive ethnic background
derive_ethnic_background_simplified(
  ukb_main = dummy_ukb_main,
  ukb_data_dict = dummy_ukb_data_dict
) %>%
  dplyr::select(tidyselect::contains("ethnic"))

rmgpanw/ukbwranglr documentation built on April 30, 2024, 7:47 a.m.