episodes: Group dated events into episodes.

Description

Dated events (records) within a certain duration of an index event are assigned to a unique group. Each group has a unique ID and is described as an "episode". Episodes can be "fixed" or "rolling" ("recurring"). Each episode has a "Case" and/or "Recurrent" record, while all other records within the group are "Duplicates" of either the "Case" or the "Recurrent" event.

Usage

episodes(
  date,
  case_length = Inf,
  episode_type = "fixed",
  recurrence_length = case_length,
  episode_unit = "days",
  strata = NULL,
  sn = NULL,
  episodes_max = Inf,
  rolls_max = Inf,
  case_overlap_methods = 8,
  recurrence_overlap_methods = case_overlap_methods,
  skip_if_b4_lengths = FALSE,
  data_source = NULL,
  data_links = "ANY",
  custom_sort = NULL,
  skip_order = Inf,
  reference_event = "last_record",
  case_for_recurrence = FALSE,
  from_last = FALSE,
  group_stats = c("case_nm", "wind", "epid_interval"),
  display = "none",
  case_sub_criteria = NULL,
  recurrence_sub_criteria = case_sub_criteria,
  case_length_total = 1,
  recurrence_length_total = case_length_total,
  skip_unique_strata = TRUE,
  splits_by_strata = 1,
  batched = "semi"
)

links_wf_episodes(
  date,
  case_length = Inf,
  episode_type = "fixed",
  strata = NULL,
  sn = NULL,
  display = "none"
)

episodes_af_shift(
  date,
  case_length = Inf,
  sn = NULL,
  strata = NULL,
  group_stats = FALSE,
  episode_type = "fixed",
  data_source = NULL,
  episode_unit = "days",
  data_links = "ANY",
  display = "none"
)

Arguments

date

[date|datetime|integer|number_line]. Record date or period.

case_length

[integer|number_line]. Duration from an index event distinguishing one "Case" from another.

episode_type

[character]. Options are "fixed" (default) or "rolling". See Details.

recurrence_length

[integer|number_line]. Duration from an index event distinguishing a "Recurrent" event from its "Case" or prior "Recurrent" event.

episode_unit

[character]. Unit of time for case_length and recurrence_length. Options are "seconds", "minutes", "hours", "days" (default), "weeks", "months" or "years". See diyar::episode_unit.

strata

[atomic]. Subsets of the dataset. Episodes are created separately for each strata.

sn

[integer]. Unique record ID.

episodes_max

[integer]. Maximum number of episodes permitted within each strata.

rolls_max

[integer]. Maximum number of times an index event can recur. Only used if episode_type is "rolling".

case_overlap_methods

[character|integer]. Specific ways a period (record) must overlap with a "Case" event. See overlaps.

recurrence_overlap_methods

[character|integer]. Specific ways a period (record) must overlap with a "Recurrent" event. See overlaps.

skip_if_b4_lengths

[logical]. If TRUE, events before a lagged case_length or recurrence_length are skipped. The default is FALSE.

data_source

[character]. Source ID for each record. If provided, a list of all sources in each episode is returned. See epid_dataset slot.

data_links

[list|character]. The data_sources required in each epid. An episode without records from these data_sources will be unlinked. See Details.

custom_sort

[atomic]. Preferential order for selecting index events. See custom_sort.

skip_order

[integer]. End episode tracking in a strata when an index event's custom_sort order is greater than the supplied skip_order.

reference_event

[character]. Specifies which of the records are used as index events. Options are "last_record" (default), "last_event", "first_record" or "first_event".

case_for_recurrence

[logical]. If TRUE, a case_length is applied to both "Case" and "Recurrent" events. If FALSE (default), a case_length is applied to only "Case" events.

from_last

[logical]. Track episodes beginning from the earliest to the most recent record (FALSE) or vice versa (TRUE).

group_stats

[character]. A selection of group metrics to return for each episode. Most are added to slots of the epid object. Options are NULL or any combination of "case_nm", "wind" and "epid_interval".

display

[character]. Display progress updates and/or generate a linkage report for the analysis. Options are "none" (default), "progress", "stats", "none_with_report", "progress_with_report" or "stats_with_report".

case_sub_criteria

[sub_criteria]. Additional nested match criteria for events in a case_length.

recurrence_sub_criteria

[sub_criteria]. Additional nested match criteria for events in a recurrence_length.

case_length_total

[integer|number_line]. Minimum number of matched case_lengths required for an episode.

recurrence_length_total

[integer|number_line]. Minimum number of matched recurrence_lengths required for an episode.

skip_unique_strata

[logical]. If TRUE (default), a strata with a single event is skipped.

splits_by_strata

[integer]. Split analysis into n parts. This typically lowers max memory usage but increases run time.

batched

[character]. Create and compare records in batches. Options are "yes", "no" and "semi" (default). Typically, the "semi" option will have a higher max memory usage and shorter run time, while "no" will have a lower max memory usage but longer run time.

Details

episodes() iteratively links dated records (events) that are within a set duration of each other. Every record is linked to a unique group (episode; epid object). These episodes represent occurrences of interest as specified by the function's arguments and defined by a case definition.

Two main types of episodes are possible:

  • "fixed" - An episode where all events are within a fixed duration of an index event.

  • "rolling" - An episode where all events are within a recurring duration of an index event.
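
The difference between the two types can be illustrated with a small base-R sketch. This is a simplified illustration and not the implementation used by episodes(): the function name sketch_episodes is hypothetical, dates are assumed to be sorted and free of NA, durations are in days, and the rolling window is assumed to restart from the last linked record (as with reference_event = "last_record").

```r
# Simplified sketch of "fixed" vs "rolling" episodes (hypothetical helper,
# not part of diyar). `dates` must be sorted and contain no NA.
sketch_episodes <- function(dates, case_length,
                            recurrence_length = case_length,
                            episode_type = "fixed") {
  n <- length(dates)
  epid <- integer(n)
  i <- 1L
  current <- 0L
  while (i <= n) {
    current <- current + 1L
    # Window opened by the index event (the "Case")
    window_end <- dates[i] + case_length
    j <- i
    repeat {
      # Link every later record that falls inside the current window
      while (j < n && dates[j + 1L] <= window_end) j <- j + 1L
      if (episode_type != "rolling") break
      # Rolling: reopen the window from the last linked record
      new_end <- dates[j] + recurrence_length
      if (j == n || dates[j + 1L] > new_end) break
      window_end <- new_end
    }
    epid[i:j] <- current
    i <- j + 1L
  }
  epid
}

dates <- as.Date("2020-01-01") + c(0, 5, 14, 20, 29, 60)

# Fixed: each window is anchored on its index event only
fixed_ids <- sketch_episodes(dates, case_length = 15)
fixed_ids    # 1 1 1 2 2 3

# Rolling: the window keeps extending while records recur within 10 days
rolling_ids <- sketch_episodes(dates, case_length = 15,
                               recurrence_length = 10,
                               episode_type = "rolling")
rolling_ids  # 1 1 1 1 1 2
```

In the fixed case, the record on day 20 starts a new episode because it is more than 15 days from the first index event. In the rolling case, it is within 10 days of the record on day 14 and therefore stays in the first episode.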

Every record in each episode is categorised as one of the following:

  • "Case" - Index event of the episode (without nested match criteria).

  • "Case_CR" - Index event of the episode (with nested match criteria).

  • "Duplicate_C" - Duplicate of the index event.

  • "Recurrent" - Recurrence of the index event (without nested match criteria).

  • "Recurrent_CR" - Recurrence of the index event (with nested match criteria).

  • "Duplicate_R" - Duplicate of the recurrent event.

  • "Skipped" - Skipped records.

If data_links is supplied, each element of the list should be named "l" (links) or "g" (groups). Unnamed elements are assumed to be "l".

  • If named "l", groups without records from every listed data_source will be unlinked.

  • If named "g", groups without records from any listed data_source will be unlinked.
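
The two rules above can be sketched in base R as follows. This is an illustration of one reading of the rules, not the code used by episodes(): the helper meets_data_links and its arguments are hypothetical, "l" is taken to mean that all listed sources must be present in a group, and "g" that at least one must be.

```r
# Hypothetical helper (not part of diyar): flags, for every record, whether
# its group satisfies a data_links rule.
#   type = "l": the group must contain records from all listed sources
#   type = "g": the group must contain records from at least one listed source
meets_data_links <- function(epid, data_source, sources, type = c("l", "g")) {
  type <- match.arg(type)
  ok <- tapply(data_source, epid, function(x) {
    found <- sources %in% x
    if (type == "l") all(found) else any(found)
  })
  # Map the per-group result back onto each record
  unname(as.vector(ok[as.character(epid)]))
}

epid <- c(1, 1, 2, 2, 3)
data_source <- c("A", "B", "A", "A", "C")

keep_l <- meets_data_links(epid, data_source, sources = c("A", "B"), type = "l")
keep_l  # TRUE TRUE FALSE FALSE FALSE

keep_g <- meets_data_links(epid, data_source, sources = c("A", "B"), type = "g")
keep_g  # TRUE TRUE TRUE TRUE FALSE
```

Records flagged FALSE belong to groups that would be unlinked under the corresponding rule.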

All records with a missing (NA) strata or date are skipped.

Wrapper functions or alternative implementations of episodes() for specific use cases or benefits:

  • episodes_wf_splits() - Identical records are excluded from the main analysis.

  • episodes_af_shift() - A mostly vectorised approach.

  • links_wf_episodes() - The same functionality achieved with links.

See vignette("episodes") for further details.

Value

epid; list

See Also

episodes_wf_splits; custom_sort; sub_criteria; epid_length; epid_window; partitions; links; overlaps

Examples

library(diyar)
data(infections)
data(hospital_admissions)

# One 16-day (15-day difference) fixed episode per type of infection
episodes(date = infections$date,
         strata = infections$infection,
         case_length = 15,
         episodes_max = 1,
         episode_type = "fixed")

# Multiple 16-day episodes with an 11-day recurrence period
episodes(date = infections$date,
         strata = NULL,
         case_length = 15,
         episodes_max = Inf,
         episode_type = "rolling",
         recurrence_length = 10)

# Overlapping periods of hospital stays
dfr <- hospital_admissions[2:3]

dfr$admin_period <-
  number_line(dfr$admin_dt, dfr$discharge_dt)

dfr$ep <-
  episodes(date = dfr$admin_period,
           strata = NULL,
           case_length = index_window(dfr$admin_period),
           case_overlap_methods = "inbetween")

dfr
as.data.frame(dfr$ep)


diyar documentation built on Nov. 13, 2023, 1:08 a.m.