ukbcase: ukbcase


View source: R/ukbcase.R

Description

ukbcase: find cases in UK Biobank

Usage

ukbcase(icd10 = NULL, icd9 = NULL, oper4 = NULL, histology = NULL,
  hesin = NULL, hesin_diag10 = NULL, hesin_oper4 = NULL,
  deathdata = NULL, cancerdata = NULL, icd10main = T, icd10sec = T,
  icd9main = T, oper4main = T, oper4sec = T, death = T, cancer = T)

Arguments

icd10

String array. International Classification of Diseases, 10th revision (ICD-10) codes. For example, myocardial infarction is defined by the ICD-10 codes I21-I24, I25, written as c('I211','I212',...).

icd9

String array. International Classification of Diseases, 9th revision (ICD-9) codes. For example, the ICD-9 code for myocardial infarction is 410.

oper4

String array. Operative procedure (OPCS) codes, which can also be used to infer specific diseases. For example, the oper4 codes K40-K46, K49, K50 and K75 imply the onset of MI, written as c('K40','K41',...).

histology

String array. Histology codes that can be used to identify a specific cancer. Meaningful only when cancerdata is supplied and cancer = TRUE.

hesin

Main hospital record dataset.

hesin_diag10

Hospital record sub-dataset containing the diag10 records other than the main diagnosis.

hesin_oper4

Hospital record sub-dataset containing the oper4 records other than the main OPCS records.

deathdata

Death dataset in long format. It must record the eid, the icd10 code and the time, in columns named ('eid','case_of_death','date_of_death').

cancerdata

Cancer dataset in long format. It must record the eid, the cancer type and the time. The cancer type can be recorded by the icd10, icd9 or histology code of the cancer. The corresponding column names should be ('eid','date','icd10','icd9','histology').

icd10main

logical; if TRUE, the main icd10 is considered.

icd10sec

logical; if TRUE, the secondary icd10 is considered.

icd9main

logical; if TRUE, the icd9 is considered.

oper4main

logical; if TRUE, the main oper4 is considered.

oper4sec

logical; if TRUE, the secondary oper4 is considered.

death

logical; if TRUE, the death dataset is considered.

cancer

logical; if TRUE, the cancer dataset is considered.
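
The deathdata and cancerdata arguments expect long-format data frames with specific column names. The following is a minimal sketch of what such inputs might look like. Only the column names come from the argument descriptions above; the values are made-up toy records, and storing dates as 'YYYY-MM-DD' strings is an assumption that mirrors the epistart column in the Examples section.

```r
## Hypothetical toy data: long format means one row per record,
## so the same eid may appear on several rows.
deathdata <- data.frame(
  eid = c(1, 1, 2),
  case_of_death = c('I211', 'I500', 'C340'),   # icd10 codes
  date_of_death = c('2017-03-02', '2017-03-02', '2016-11-20'),
  stringsAsFactors = FALSE
)

## The cancer type may be given by icd10, icd9 or histology;
## unused coding columns are left as NA in this sketch.
cancerdata <- data.frame(
  eid = c(2, 3, 3),
  date = c('2015-05-10', '2012-01-15', '2014-08-30'),
  icd10 = c('C340', NA, 'C509'),
  icd9 = c(NA, '1749', NA),
  histology = c('8070', '8500', '8500'),
  stringsAsFactors = FALSE
)

str(deathdata)
str(cancerdata)
```

Data frames shaped like these would then be passed as ukbcase(..., deathdata = deathdata, cancerdata = cancerdata), with death and cancer left at their default of TRUE.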

Author(s)

yixuan ye

ukbcase is a tool for identifying patients in UK Biobank given the definition (ICD or OPCS codes) of a disease phenotype. It combines 3 data sources: hospital in-patient episode records (hesin), death records, and cancer records. It first identifies the IDs of all matching patients, and then compares the dates of all their records to identify the earliest onset date for each patient.

For more details about UK Biobank data sources related to health outcomes, see: http://biobank.ctsu.ox.ac.uk/crystal/label.cgi?id=100091
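
The idea of taking the earliest date across sources can be sketched in base R. This is an illustration of the principle only, not the package's actual implementation. The three small data frames and their column names (eid, date, source) are hypothetical and stand for records that have already been matched against the disease definition.

```r
## Hypothetical matched records from the three sources.
hes <- data.frame(eid = c(1, 1, 2),
                  date = c('2018-06-14', '2018-01-01', '2017-09-30'),
                  source = 'hesin', stringsAsFactors = FALSE)
dth <- data.frame(eid = c(2, 3),
                  date = c('2019-02-01', '2016-04-12'),
                  source = 'death', stringsAsFactors = FALSE)
can <- data.frame(eid = 3, date = '2015-07-08',
                  source = 'cancer', stringsAsFactors = FALSE)

## Stack all records and convert the dates so they sort chronologically.
all_records <- rbind(hes, dth, can)
all_records$date <- as.Date(all_records$date)

## Sort by patient and date, then keep the first (earliest) row per patient.
all_records <- all_records[order(all_records$eid, all_records$date), ]
onset <- all_records[!duplicated(all_records$eid), ]
rownames(onset) <- NULL
onset
```

Keeping the source column shows which dataset supplied the earliest record for each patient.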

Examples

library(UKBCaseFinder)  # package providing ukbcase() (repository yeyixuan/UKBCaseFinder)
set.seed(1)             # make the simulated data reproducible

## simulate hesin, hesin_diag10, hesin_oper4 dataset
hesin <- data.frame(eid = 1:100, record_id = 1:100,
  epistart = sample(c('2018-06-14', '2018-01-01'), 100, replace = TRUE),
  diag_icd10 = sample(c('I211', 'I212'), 100, replace = TRUE),
  diag_icd9 = sample(c('410', '411'), 100, replace = TRUE),
  oper4 = sample(c('K400', 'K401'), 100, replace = TRUE))
## secondary records: 80 rows drawn from the main table, with new codes
hesin_diag10 <- hesin[sample(1:100, 80, replace = TRUE), c('eid', 'record_id', 'epistart')]
hesin_diag10$diag_icd10 <- sample(c('I211', 'I212', 'I213', 'I214'), 80, replace = TRUE)
hesin_oper4 <- hesin[sample(1:100, 80, replace = TRUE), c('eid', 'record_id', 'epistart')]
hesin_oper4$oper4 <- sample(c('K400', 'K401', 'K402', 'K403'), 80, replace = TRUE)

## select cases by definition "icd10=I211 or oper4=K400"
icd10 <- 'I211'
oper4 <- 'K400'
case <- ukbcase(hesin = hesin, hesin_diag10 = hesin_diag10, hesin_oper4 = hesin_oper4,
  icd10 = icd10, oper4 = oper4)
summary(case)
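
Disease definitions are often stated as ranges of three-character categories (such as I21-I24, I25), while the icd10 argument takes an explicit vector of codes. The sketch below shows one way to list the codes in a dataset that fall under such a definition, so they can be passed to ukbcase. This page does not state whether ukbcase matches codes exactly or by prefix, so the prefix matching here is an assumption, and the diag_icd10 vector is hypothetical.

```r
## Three-character ICD-10 categories I21 to I25.
icd10_prefix <- paste0('I', 21:25)

## Hypothetical diagnosis codes, in the style of hesin$diag_icd10.
diag_icd10 <- c('I211', 'I212', 'I259', 'J440', 'I109')

## Keep the codes whose first three characters are in the definition.
is_match <- substr(diag_icd10, 1, 3) %in% icd10_prefix
icd10 <- unique(diag_icd10[is_match])
icd10
```

The resulting vector lists every full code present in the data that belongs to the definition, which avoids relying on any particular matching rule.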

yeyixuan/UKBCaseFinder documentation built on May 21, 2019, 9:39 a.m.