compile_mh_outcome: Construct unified data as a TidySet


View source: R/compile_mh_outcome-function.R

Description

This function compiles the medical history table and the outcome table into a TidySet (i.e., an ExpressionSet). An example of the output can be loaded with data(medhistdata).

Usage

compile_mh_outcome(mh_table, outcome)

Arguments

mh_table

Target population data, a data frame with one row per visit and standardized columns (see Details below). This is the output of extract_medical_history().

outcome

Subject list, a data frame with one row per unique subject and the columns subject_id, latest_date, and outcome. The outcome column is a two-level factor (non-event and event) whose first level is non-event.
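As a minimal sketch of the shape described for outcome, the data frame below is built by hand in base R. The subject identifiers, dates, and the level labels "non-event" and "event" are hypothetical values chosen for illustration; in practice this table comes from extract_outcome(), and its exact labels may differ.

```r
# Hypothetical subject list: one row per unique subject (illustrative values only)
outcome_example <- data.frame(
  subject_id = c("s1", "s2", "s3"),
  latest_date = as.Date(c("2016-05-01", "2016-06-12", "2016-07-30")),
  # Non-event must be the first level; the labels are an assumption
  outcome = factor(
    c("event", "non-event", "non-event"),
    levels = c("non-event", "event")
  ),
  stringsAsFactors = FALSE
)

# Check the two properties described above
first_level <- levels(outcome_example$outcome)[1]          # "non-event"
unique_subjects <- !anyDuplicated(outcome_example$subject_id)
```

Setting levels explicitly avoids relying on alphabetical ordering, which would otherwise place "event" before "non-event".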

Details

The target population data consist of visit_id, subject_id, healthcare_id, admission_date, and db_start_date, in addition to the columns for medical histories. The columns visit_id, subject_id, and healthcare_id are character identifiers of unique visits, subjects, and healthcare providers, respectively. The columns admission_date and db_start_date are dates: the subject's admission date for a visit and the start of the database recording period, respectively. The remaining columns are named by ICD-10 codes for either diagnoses or procedures, regardless of the number of digits, or k-mer. Each code is spread into its own column, which holds the number of days from the latest admission_date on which the code was encountered to the admission_date of each visit. If the code was never encountered, the value is NA, denoting censored data, because the code may have been encountered before db_start_date. If the latest encounter date and the admission date of the visit are the same, the value is 0.
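The encoding of a code column can be illustrated with a small base R sketch. This is not the package's implementation: the helper days_since_latest(), the toy visits, and the single recording date for code O14 are hypothetical, and the sketch assumes that an encounter on the visit's own admission date counts (giving 0).

```r
# Hypothetical visits of one subject (illustrative values only)
visits <- data.frame(
  visit_id = c("v1", "v2", "v3"),
  subject_id = c("s1", "s1", "s1"),
  admission_date = as.Date(c("2015-02-01", "2015-03-01", "2015-04-15")),
  stringsAsFactors = FALSE
)

# Hypothetical dates on which code O14 was recorded for this subject
code_dates <- as.Date("2015-03-01")

# Hypothetical helper: days from the latest encounter of a code
# (on or before each visit) to that visit's admission date
days_since_latest <- function(admission_date, code_dates) {
  vapply(seq_along(admission_date), function(i) {
    past <- code_dates[code_dates <= admission_date[i]]
    if (length(past) == 0) {
      NA_real_  # never encountered so far: censored
    } else {
      as.numeric(difftime(admission_date[i], max(past), units = "days"))
    }
  }, numeric(1))
}

visits$O14 <- days_since_latest(visits$admission_date, code_dates)
# visits$O14 is NA, 0, 45
```

The first visit precedes any record of the code and is therefore NA, the second visit is the encounter itself (0 days), and the third visit falls 45 days after the latest encounter.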

Value

A TidySet (i.e., an ExpressionSet) containing the visits of subjects in the medical history and outcome datasets.

Examples

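## Load the packages used below (dplyr provides %>%, select, left_join, etc.)
library(dplyr)
library(medhist)

## Create input example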
data(visit_cap)
data(visit_ffs)
data(visit_drg)
data(diagnosis)

population=
  list(visit_cap,visit_ffs,visit_drg) %>%
  lapply(select,visit_id,subject_id,healthcare_id,admission_date) %>%
  do.call(rbind,.) %>%
  left_join(diagnosis,by='visit_id') %>%
  filter(!code_type%in%c('Admission diagnosis')) %>%
  select(-code_type) %>%
  mutate(db_start_date=as.Date('2015-01-01')) %>%
  .[!duplicated(.),]

## Extract outcome of subjects and sample some of them
outcome=
  extract_outcome(population,'O1[4-5]',min,-1,'Z3[3-7]',max,0) %>%
  group_by(outcome) %>%
  slice(sample(seq(n()),ceiling(n()*0.0125),replace=FALSE)) %>%
  ungroup()

## Filter medical history before the date of either event or non-event
input=
  outcome %>%
  right_join(population,by='subject_id') %>%
  select(visit_id, everything()) %>%
  filter(admission_date<latest_date) %>%
  select(-outcome,-latest_date)

## Extract medical history of subjects per healthcare provider
mh_table=extract_medical_history(input)

## Construct unified data as a TidySet
medhistdata=compile_mh_outcome(mh_table,outcome)

herdiantrisufriyana/medhist documentation built on June 24, 2021, 3:41 a.m.