instance_combiner: Instance combiner helper function

View source: R/instance_array_helper.R

instance_combiner    R Documentation

Instance combiner helper function

Description

Helper that applies a lookup function to a range of instances and combines the results.

Usage

instance_combiner(
  data,
  lookup_by_instance_fn,
  combine_instances = c("any", "max", "min", "first", "last", "mean"),
  up_to_instance = default_up_to_inst(),
  after_instance = default_after_inst()
)

Arguments

data

The primary data frame.

This data frame includes all columns required to perform look-ups (e.g. ICD10 code columns, medication code columns, age, sex, etc.).

lookup_by_instance_fn

Function that takes a target instance as its only argument, and returns a vector of data.
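As an illustration only, a lookup function of this shape might look like the sketch below. The data frame, the column names (statin_i0, statin_i1), and the function name lookup_statin are hypothetical and are not part of the package; the point is just that the function receives one instance number and returns one value per row of the data.

```r
# Hypothetical example data: one row per participant, one column per instance.
example_data <- data.frame(
  statin_i0 = c(FALSE, TRUE),
  statin_i1 = c(TRUE, TRUE)
)

# A lookup function takes the target instance as its only argument
# and returns a vector with one element per row of the data.
lookup_statin <- function(instance) {
  example_data[[paste0("statin_i", instance)]]
}

lookup_statin(0)
lookup_statin(1)
```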

combine_instances

The method used by instance_combiner() or other Reduce-like functions when combining the results of a lookup (e.g. a medication lookup, ICD10 lookup, biomarker lookup) across multiple instances.

For example, whether a participant is on a medication may differ from one instance to the next. In that case, one would apply the "any" method, so that the results for instances 0, 1, and 2 are Reduce()-ed with the logical OR operator: the combined result is TRUE if the lookup is TRUE at any included instance. For numeric lookups (e.g. biomarkers recorded at multiple instances), one might instead use the "mean" method to average results across instances.

Can be one of:

  • "any" - use Boolean or (note: requires that lookup results are logical)

  • "min" - use the minimum non-NA value

  • "max" - use the maximum non-NA value

  • "first" - use the first/earliest non-NA value

  • "last" - use the last/latest non-NA value

  • "mean" - use the mean of non-NA values

Functions that accept a combine_instances argument may restrict the choice of options (e.g. it doesn't make sense to apply "any" to numeric data).
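The combining methods listed above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the per-instance result vectors and all variable names are hypothetical, and the handling of NA values inside instance_combiner() may differ.

```r
# Hypothetical logical lookup results, one vector per instance,
# one element per participant.
on_medication <- list(
  c(FALSE, FALSE, TRUE),   # instance 0
  c(TRUE,  FALSE, FALSE),  # instance 1
  c(FALSE, FALSE, FALSE)   # instance 2
)

# "any": element-wise logical OR across instances.
combined_any <- Reduce(`|`, on_medication)

# Hypothetical numeric lookup results (e.g. a biomarker) with missing values.
biomarker <- list(
  c(5, NA, 7),  # instance 0
  c(6, 4, NA),  # instance 1
  c(NA, 8, 9)   # instance 2
)

# Arrange as a matrix: rows are participants, columns are instances.
m <- do.call(cbind, biomarker)

combined_mean  <- rowMeans(m, na.rm = TRUE)                          # "mean"
combined_max   <- apply(m, 1, max, na.rm = TRUE)                     # "max"
combined_min   <- apply(m, 1, min, na.rm = TRUE)                     # "min"
combined_first <- apply(m, 1, function(r) r[!is.na(r)][1])           # "first"
combined_last  <- apply(m, 1, function(r) rev(r[!is.na(r)])[1])      # "last"
```

With this toy data every participant has at least one non-NA value; a row that is entirely NA would need separate handling (for example, max() with na.rm = TRUE returns -Inf with a warning on an empty input).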

up_to_instance

An integer specifying an instance number, or the name of the column (using <data-masking> rules) containing instance numbers to use as the maximum instance (inclusive). Used to include all data up to and including the specified instance number. Defaults to default_up_to_inst(), which as of this writing returns 3.

after_instance

An integer specifying an instance number, or the name of the column (using <data-masking> rules) containing instance numbers to use as the minimum instance (non-inclusive). Used to include all data after but NOT including the specified instance number. Defaults to default_after_inst(), which is typically -1 (i.e., include instances 0 and later).
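To make the inclusive upper bound and exclusive lower bound concrete, the window of instances can be sketched as below. The helper select_instances is hypothetical and covers only the case where both bounds are single integers; the column-name (data-masking) form, where the bounds vary by row, is not shown. The default values 3 and -1 are taken from the argument descriptions above.

```r
# Hypothetical helper: keep the instances i with after_instance < i <= up_to_instance.
select_instances <- function(all_instances, up_to_instance = 3, after_instance = -1) {
  all_instances[all_instances > after_instance & all_instances <= up_to_instance]
}

# With the defaults described above, instances 0 through 3 are included.
select_instances(0:3)

# Excluding instance 0 and stopping at instance 2 leaves instances 1 and 2.
select_instances(0:3, up_to_instance = 2, after_instance = 0)
```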


adamleejohnson/R-ukbiobank documentation built on April 25, 2022, 2:11 a.m.