measurement_lookup: General measurement lookups

View source: R/measurements.R

measurement_lookup    R Documentation

General measurement lookups

Description

Perform lookups for various measurements. Because many measurements are repeated across instances, and sometimes within instances, two special arguments (combine_instances and combine_array) are used to specify how multiple measurements should be aggregated.

The measurement_lookup() function is the general function for lookups. The measurement_lookup_with_alt() function is similar, but also looks for an alternate value if the primary value is NA. This is common when manual measurements (of, for example, blood pressure) are made after an automated method fails.
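The fallback described above can be pictured with a minimal base-R sketch. The vectors below are made-up toy values, and the code only illustrates "use the alternate when the primary is NA"; it is not the package's actual implementation.

```r
# Toy systolic readings for four participants (hypothetical values).
automated <- c(120, NA, 135, NA)   # primary: automated reading
manual    <- c(NA, 118, 140, NA)   # alternate: manual reading

# Keep the primary value unless it is NA; otherwise fall back to the alternate.
with_alt <- ifelse(is.na(automated), manual, automated)

print(with_alt)
```

Note that the third participant keeps the automated value even though a manual one exists, and the fourth stays NA because neither reading is available.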

Several convenience physio_*() functions are provided to facilitate lookup of commonly used measurements (e.g., blood pressure, BMI, etc.).

Usage

measurement_lookup(
  data,
  measurement_col,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = c("last", "first", "min", "max", "mean"),
  combine_array = c("last", "first", "min", "max", "mean")
)

measurement_lookup_with_alt(
  data,
  measurement_col,
  measurement_col_alt,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = c("last", "first", "min", "max", "mean"),
  combine_array = c("last", "first", "min", "max", "mean")
)

physio_systolicBP(
  data,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = "last",
  combine_array = "mean",
  measurement_col = f.4080.0.0.Systolic_blood_pressure_automated_reading,
  measurement_col_alt = f.93.0.0.Systolic_blood_pressure_manual_reading
)

physio_diastolicBP(
  data,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = "last",
  combine_array = "mean",
  measurement_col = f.4079.0.0.Diastolic_blood_pressure_automated_reading,
  measurement_col_alt = f.94.0.0.Diastolic_blood_pressure_manual_reading
)

physio_height_cm(
  data,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = "last",
  combine_array = "mean",
  measurement_col = f.50.0.0.Standing_height
)

physio_weight_kg(
  data,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = "last",
  combine_array = "mean",
  measurement_col = f.21002.0.0.Weight
)

physio_bmi(
  data,
  measurement_col = f.21001.0.0.Body_mass_index_BMI,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = "last",
  combine_array = "mean"
)

physio_bsa(
  data,
  after_instance = default_after_inst(),
  up_to_instance = default_up_to_inst(),
  combine_instances = "last",
  combine_array = "mean",
  height_col = f.50.0.0.Standing_height,
  weight_col = f.21002.0.0.Weight
)

Arguments

data

The primary data frame.

This data frame includes all necessary columns required to perform look-ups (e.g. ICD10 code columns, medication code columns, age, sex, etc.).

measurement_col

Template column name for general measurements.
Example = f.4080.0.0.Systolic_blood_pressure_automated_reading.

after_instance

An integer specifying an instance number, or the name of the column (using <data-masking> rules) containing instance numbers to use as the minimum instance (non-inclusive). Used to include all data after but NOT including the specified instance number. Defaults to default_after_inst(), which is typically -1 (i.e., include instances 0 and later).

up_to_instance

An integer specifying an instance number, or the name of the column (using <data-masking> rules) containing instance numbers to use as the maximum instance (inclusive). Used to include all data up to and including the specified instance number. Defaults to default_up_to_inst(), which as of this writing returns 3.
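The two bounds together define a half-open window: after_instance is exclusive and up_to_instance is inclusive. A minimal sketch of that rule, using a hypothetical helper in_window() that is not part of the package, with the default values quoted above (-1 and 3):

```r
# Hypothetical helper: TRUE for instances inside the window
# (after_instance, up_to_instance], i.e. exclusive lower, inclusive upper bound.
in_window <- function(instance, after_instance = -1, up_to_instance = 3) {
  instance > after_instance & instance <= up_to_instance
}

instances <- 0:3

# Defaults: every instance from 0 through 3 is included.
all_instances <- instances[in_window(instances)]

# after_instance = 0, up_to_instance = 2: instance 0 is excluded, 2 is included.
middle <- instances[in_window(instances, after_instance = 0, up_to_instance = 2)]

# "Before instance 2" is expressed as up_to_instance = 1.
before_two <- instances[in_window(instances, up_to_instance = 1)]

print(list(all_instances, middle, before_two))
```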

combine_instances

The method used by instance_combiner() or other Reduce-like functions when combining results of a lookup (e.g. a medication lookup, icd10 lookup, biomarker lookup) across multiple instances.

For example, when looking up whether a participant is on a medication, the result may differ depending on the instance number. In such a case, one would apply the "any" method, so that the results for instances 0, 1, and 2 are Reduce()-ed with the Boolean or operator and the participant is flagged if the medication is recorded at any of those instances. In the case of numeric lookups (e.g. biomarkers that are recorded at multiple instances), one might instead use the "mean" method to average results across instances.

Can be one of:

  • "any" - use Boolean or (note: requires that lookup results are logical)

  • "min" - use the minimum non-NA value

  • "max" - use the maximum non-NA value

  • "first" - use the first/earliest non-NA value

  • "last" - use the last/latest non-NA value

  • "mean" - use the mean of non-NA values

Functions that accept combine_instances may restrict the choice of options (e.g. it doesn't make sense to apply "any" to numeric data).
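To make the options concrete, here is a minimal base-R sketch of the six methods applied to a single participant's values. The function combine_values() is a hypothetical stand-in written for illustration; it is not instance_combiner() and makes no claim about how the package implements these methods.

```r
# Hypothetical stand-in for the combining methods listed above.
# NA values are dropped first; if nothing remains, the result is NA.
combine_values <- function(x, method = c("last", "first", "min", "max", "mean", "any")) {
  method <- match.arg(method)
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA)
  switch(method,
    first = x[1],          # earliest non-NA value
    last  = x[length(x)],  # latest non-NA value
    min   = min(x),
    max   = max(x),
    mean  = mean(x),
    any   = any(x)         # only meaningful for logical input
  )
}

# One participant's readings, ordered by instance (toy values).
readings <- c(NA, 128, 122, NA, 131)
methods  <- c("first", "last", "min", "max", "mean")
combined <- sapply(methods, function(m) combine_values(readings, m))
print(combined)

# Logical lookup results (e.g. "on medication?") combined with "any".
on_medication <- combine_values(c(FALSE, NA, TRUE), "any")
print(on_medication)
```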

combine_array

If a measurement field has multiple array values (e.g. blood pressure recordings are made in duplicate), specify how these values should be combined. See combine_instances for details and options.
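The sketch below illustrates how combine_array = "mean" and combine_instances = "last" might interact on a toy data frame. It assumes that array values are combined within each instance first and that the per-instance results are then combined across instances; that ordering, the shortened column names, and the helper mean_or_na() are assumptions made for illustration.

```r
# Toy data: systolic BP measured in duplicate (array 0 and 1) at instances 0 and 1.
# Column names are shortened versions of the f.<field>.<instance>.<array> pattern.
bp <- data.frame(
  f.4080.0.0 = c(120, 130, NA),
  f.4080.0.1 = c(124, NA,  NA),
  f.4080.1.0 = c(NA,  140, NA),
  f.4080.1.1 = c(NA,  144, NA)
)

# Row-wise mean of non-NA values; rows with no values become NA instead of NaN.
mean_or_na <- function(m) {
  out <- rowMeans(m, na.rm = TRUE)
  out[is.nan(out)] <- NA
  out
}

# Step 1 (combine_array = "mean"): average the duplicate readings per instance.
inst0 <- mean_or_na(as.matrix(bp[, c("f.4080.0.0", "f.4080.0.1")]))
inst1 <- mean_or_na(as.matrix(bp[, c("f.4080.1.0", "f.4080.1.1")]))

# Step 2 (combine_instances = "last"): prefer the later instance when it is not NA.
combined <- unname(ifelse(is.na(inst1), inst0, inst1))
print(combined)
```

The first participant has data only at instance 0, the second has data at both and takes the instance-1 average, and the third has no readings at all.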

measurement_col_alt

Alternate template column name, used specifically for manual readings of measurements when automated methods return NA.

height_col

Template column name for height.
Default = f.50.0.0.Standing_height.

weight_col

Template column name for weight.
Default = f.21002.0.0.Weight.

Instancing

The UK Biobank records visits as separate "instances." As of this writing, there are 4 instances labeled 0 through 3. At each instance, various information can be recorded or re-recorded. For example, blood pressure is typically recorded at most in-person evaluations. Therefore, there may be 4 separate columns for blood pressure recordings (actually, there could be more because the blood pressure may be recorded multiple times at each instance). Almost all of the functions in this package will utilize instance numbers to specify from which time points data should be retrieved. For example, we may want to know the state of ICD10 diagnoses before instance 2. In this case, we would specify up_to_instance = 1 (search up to instance 1, inclusive) in functions that take this as an argument.

Arguments like up_to_instance and after_instance can take a constant instance number as their value. But they can also take the name of a column that contains an instance number, so that different instance limits can be used for each participant. For example, some participants undergo MRI at instance 2, and others at instance 3. If we want to know the state of a diagnosis up to and including the time of MRI, we would want to assign up_to_instance to the name of the column that specifies which instance the MRI occurred at. This column typically has to be generated by the user and attached to the data frame beforehand.
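A per-participant limit can be sketched in base R as follows. The column mri_instance, the shortened BMI column names, and the helper last_up_to() are all hypothetical; the code mimics the idea of "last non-NA value up to and including each participant's own instance limit" and does not call the package.

```r
# Toy data: one BMI value per instance (0-3) and a user-built column giving
# the instance at which each participant had their MRI.
ukb <- data.frame(
  mri_instance = c(2, 3, 2),
  f.21001.0.0  = c(24.1, 30.2, NA),
  f.21001.1.0  = c(NA,   29.8, 27.0),
  f.21001.2.0  = c(25.0, NA,   27.5),
  f.21001.3.0  = c(26.3, 31.0, 28.1)
)

bmi_cols        <- paste0("f.21001.", 0:3, ".0")
instance_of_col <- 0:3  # instance number represented by each BMI column

# Last non-NA value among instances <= limit (inclusive upper bound).
last_up_to <- function(values, limit) {
  keep <- instance_of_col <= limit & !is.na(values)
  if (!any(keep)) return(NA_real_)
  values[[max(which(keep))]]
}

bmi_matrix <- as.matrix(ukb[bmi_cols])
ukb$bmi_at_mri <- vapply(
  seq_len(nrow(ukb)),
  function(i) last_up_to(bmi_matrix[i, ], ukb$mri_instance[i]),
  numeric(1)
)
print(ukb$bmi_at_mri)
```

Participants 1 and 3 ignore their instance-3 values because their MRI took place at instance 2, while participant 2 is allowed to use instance 3.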

Column names

Most functions in this package take column names (*_col) as optional arguments (otherwise default column names are used). Each name is used as a template to find all other columns with the same field number but different instance and array numbers. These functions will automatically find all matching instances (and arrays within each instance) within the specified parameters. Internally, a set of expand_instance_*() helper functions, which themselves rely on the column_expansion_helper() function, perform the work of searching for matching columns.
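The template idea can be illustrated with a small regular-expression sketch. It assumes column names follow the f.<field>.<instance>.<array>.<description> pattern seen in the defaults above; the list of column names is invented, and the matching logic is only an approximation of what the expand_instance_*() helpers might do, not their actual code.

```r
# Template column and a made-up set of column names from a data frame.
template <- "f.4080.0.0.Systolic_blood_pressure_automated_reading"
all_cols <- c(
  "f.eid",
  "f.4080.0.0.Systolic_blood_pressure_automated_reading",
  "f.4080.0.1.Systolic_blood_pressure_automated_reading",
  "f.4080.1.0.Systolic_blood_pressure_automated_reading",
  "f.4079.0.0.Diastolic_blood_pressure_automated_reading",
  "f.40800.0.0.Unrelated_field"
)

# Extract the field number from the template ("4080").
field <- sub("^f\\.([0-9]+)\\..*$", "\\1", template)

# Match the same field with any instance and array number.
pattern <- paste0("^f\\.", field, "\\.([0-9]+)\\.([0-9]+)(\\.|$)")
matches <- grep(pattern, all_cols, value = TRUE)

# Pull the instance and array numbers out of each matching name.
parts    <- regmatches(matches, regexec(pattern, matches))
instance <- as.integer(vapply(parts, `[`, character(1), 2))
array_ix <- as.integer(vapply(parts, `[`, character(1), 3))

print(data.frame(column = matches, instance = instance, array = array_ix))
```

Anchoring the field number between literal dots keeps f.40800 from being mistaken for f.4080.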

See Also

biomarker_lookup()


adamleejohnson/R-ukbiobank documentation built on April 25, 2022, 2:11 a.m.