extractEntity: Extract and decode data from a CPRD GOLD format dataset

View source: R/extractEntity.R

Description

Decodes the additional clinical details (the data1, data2, ... columns) in raw CPRD data using lookup tables. CALIBER users should use the CALIBERlookups package, which supplies these tables; alternatively, the lookups can be passed directly as arguments.

Usage

extractEntity(data, enttype, CALIBER_ENTITY = NULL,
    CALIBER_LOOKUPS = NULL, ...)

Arguments

data

data.table or FFDF data frame containing the CPRD GOLD format data.

enttype

entity type(s) to extract, either a single integer or a vector. If extracting multiple entity types, the data specification for each entity type must be identical.

CALIBER_ENTITY

a table with columns enttype, data_fields, data1, data1_lkup, .... This argument can be omitted if the CALIBERlookups package is installed.

CALIBER_LOOKUPS

a table with columns lookup, category, description. This argument can be omitted if the CALIBERlookups package is installed.

...

other arguments to pass to YYYYMMDDtoDate for extracting dates from YYYYMMDD format.
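
The requirement on enttype (that all entity types extracted together share an identical data specification) can be checked before calling extractEntity. The following is a minimal sketch only: same_data_spec is a hypothetical helper, not part of CALIBERdatamanage, and the entity type numbers and field definitions are invented for illustration.

```r
library(data.table)

# Hypothetical helper (not part of CALIBERdatamanage): returns TRUE if all
# requested entity types have the same data_fields, dataN and dataN_lkup values.
same_data_spec <- function(entity, enttypes) {
  spec_cols <- grep("^data(_fields|[0-9]+(_lkup)?)$", names(entity), value = TRUE)
  spec <- entity[enttype %in% enttypes, spec_cols, with = FALSE]
  nrow(unique(spec)) == 1L
}

# Invented entity definitions: 9001 and 9002 share a specification, 9003 does not
ENTITY_EXAMPLE <- data.table(
  enttype     = c(9001L, 9002L, 9003L),
  data_fields = c(1L, 1L, 2L),
  data1       = c("Status", "Status", "Value"),
  data1_lkup  = c("XYZ", "XYZ", NA_character_),
  data2       = c(NA_character_, NA_character_, "Unit"),
  data2_lkup  = c(NA_character_, NA_character_, "UNI")
)

ok_together  <- same_data_spec(ENTITY_EXAMPLE, c(9001L, 9002L))
bad_together <- same_data_spec(ENTITY_EXAMPLE, c(9001L, 9003L))
```

Under these assumptions, entity types 9001 and 9002 could be passed together as a vector, whereas 9003 would need a separate call.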

Details

If the CALIBERlookups package is not installed, CALIBER_ENTITY and CALIBER_LOOKUPS must be supplied.
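
As an illustration, whether the two lookup arguments are needed can be determined at run time. This sketch uses only the base R function requireNamespace; the variable name have_lookups is arbitrary.

```r
# requireNamespace() returns FALSE (without an error) if the package is absent
have_lookups <- requireNamespace("CALIBERlookups", quietly = TRUE)

if (have_lookups) {
  message("CALIBERlookups found: CALIBER_ENTITY and CALIBER_LOOKUPS can be omitted")
} else {
  message("CALIBERlookups not found: supply CALIBER_ENTITY and CALIBER_LOOKUPS")
}
```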

CALIBER_ENTITY states what each of the data entries contains for a particular entity type, and is a data.table with columns:

enttype

integer vector, key column, link to test or clinical table

description

a character vector

filetype

a character vector, ‘Clinical’ or ‘Test’

category

a character vector

data_fields

number of data fields used

data1

definition of data1

data1_lkup

lookup table for data1

data2

definition of data2

data2_lkup

lookup table for data2

data3

definition of data3

data3_lkup

lookup table for data3

data4

definition of data4

data4_lkup

lookup table for data4

data5

definition of data5

data5_lkup

lookup table for data5

data6

definition of data6

data6_lkup

lookup table for data6

data7

definition of data7

data7_lkup

lookup table for data7

data8

definition of data8

data8_lkup

lookup table for data8

If there are fewer than 8 data fields, the additional data columns are not required.
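
A table of this shape could be built by hand as in the sketch below. All values are invented for illustration (the entity type numbers, descriptions and lookup names are not real CPRD definitions), and the use of NA for unused fields is an assumption. Only data1 and data2 columns are included because neither example entity uses more than two data fields.

```r
library(data.table)

# Invented entity definitions, for illustration only
ENTITY_EXAMPLE <- data.table(
  enttype     = c(9001L, 9002L),
  description = c("Example measurement", "Example status"),
  filetype    = c("Clinical", "Clinical"),
  category    = c("Examples", "Examples"),
  data_fields = c(2L, 1L),
  data1       = c("Value", "Status"),
  data1_lkup  = c(NA_character_, "XYZ"),
  data2       = c("Unit", NA_character_),   # NA for unused fields is an assumption
  data2_lkup  = c("UNI", NA_character_)
)

# enttype is the key column
setkey(ENTITY_EXAMPLE, enttype)

# Keyed subset: the definition row for a single entity type
one_entity <- ENTITY_EXAMPLE[.(9002L)]
```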

CALIBER_LOOKUPS contains the interpretation of each of the lookup categories, and is a data.table with columns:

lookup

character vector, first key column

category

integer vector, second key column

description

character vector
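
The sketch below shows how a table with this structure can decode a coded column by joining on the two key columns. It illustrates the idea only and is not necessarily how extractEntity is implemented; the lookup name "XYZ", its categories and the patient data are invented.

```r
library(data.table)

# Invented lookup table, for illustration only
LOOKUPS_EXAMPLE <- data.table(
  lookup      = c("XYZ", "XYZ", "XYZ"),
  category    = c(0L, 1L, 2L),
  description = c("Data Not Entered", "Yes", "No")
)
setkey(LOOKUPS_EXAMPLE, lookup, category)

# Raw coded values as they might appear in a data1 column
raw <- data.table(patid = 1:4, data1 = c(1L, 2L, 0L, 1L))

# Join on (lookup name, category code); results follow the order of raw
codes <- data.table(lookup = "XYZ", category = raw$data1)
raw[, data1_decoded := LOOKUPS_EXAMPLE[codes, on = c("lookup", "category"), description]]
```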

Value

extractEntity returns a data.table with interpreted entity data. The exact columns depend on the entity type.

extractMedcodes returns a data.table or ffdf depending on the format of the original data. The new column named varname is a factor with levels given by the category labels (shortnames) in the codelist.

Author(s)

Anoop Shah

See Also

addCodelistToCohort, addToCohort, extractCodes

Examples

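library(CALIBERdatamanage)
library(data.table)  # for data.table() and as.data.table()
library(ff)          # for as.ffdf()

# Load the test data in both supported formats (data.table and ffdf)
data(test_data)
TESTDT <- data.table(test_data)
convertDates(TESTDT)
TESTFFDF <- as.ffdf(TESTDT)

# Entity definitions, supplied directly instead of via CALIBERlookups
data(test_entity)
ENTITY <- as.data.table(test_entity)

# Lookup table; lookup and description must be character vectors
data(test_lookups)
LOOKUPS <- as.data.table(test_lookups)
LOOKUPS[, description := as.character(description)]
LOOKUPS[, lookup := as.character(lookup)]

# Extract entity type 1 from each data format
extractEntity(TESTFFDF, 1, ENTITY, LOOKUPS)
extractEntity(TESTDT, 1, ENTITY, LOOKUPS)

# Entity type 4
extractEntity(TESTFFDF, 4, ENTITY, LOOKUPS)
extractEntity(TESTDT, 4, ENTITY, LOOKUPS)

# Entity type 5
extractEntity(TESTFFDF, 5, ENTITY, LOOKUPS)
extractEntity(TESTDT, 5, ENTITY, LOOKUPS)

# Entity type 151
extractEntity(TESTFFDF, 151, ENTITY, LOOKUPS)
extractEntity(TESTDT, 151, ENTITY, LOOKUPS)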

CALIBERdatamanage documentation built on Nov. 23, 2021, 3 p.m.