addCategoryToCohort: Creates one row per patient from repeated categorical data


View source: R/addCategoryToCohort.R

Description

Adds a column named varname to the cohort data.table, holding one category value per patient taken from a table of records with columns anonpatid, category and eventdate. If a patient has more than one record, the category is chosen according to an order of priority. Alternatively, a binary outcome is generated, indicating whether the patient has any records fulfilling the criteria.

Usage

addCategoryToCohort(cohort, varname, data, old_varname = 'category',
    categories, binary = FALSE,
    limit_years = c(-Inf, 0),  idcolname = attr(cohort, 'idcolname'),
    datecolname = 'eventdate', indexcolname = 'indexdate',
    overwrite = TRUE, description = NULL, limit_days = NULL)

Arguments

cohort

a cohort object

varname

new variable name

data

ffdf or data.table containing patient identifier, eventdate and old_varname.

old_varname

the column name containing the categories of interest

categories

vector of categories to use, in priority order (highest priority first). For each patient, all records within limit_years of the index date are searched for the highest priority category first, then the next highest etc.

If the result is binary, the order of categories does not matter, and the function returns TRUE / FALSE according to whether the patient has any records of the categories of interest within limit_years of the index date.

binary

whether to lump all categories together to make a binary variable (TRUE / FALSE). If there are no records for a patient the result is FALSE, not missing.

limit_years

earliest and latest year relative to index date. Default is c(-Inf, 0), which searches all events prior to or on the index date.

idcolname

name of the patient identifier column in data

datecolname

name of the event date column in data

indexcolname

name of the index date column in the cohort dataset

overwrite

whether to overwrite this variable if it exists

description

description for the new variable. Defaults to the function call which generated this variable.

limit_days

a vector of length 2 for the time limits, which over-rules limit_years if both are supplied. A year is considered to be 365.25 days.
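
The interaction between limit_years and limit_days can be sketched as below. This is an illustration only: window_in_days is a hypothetical helper, not a function in the package, and it assumes only what the argument descriptions state (limit_days over-rules limit_years, and one year is 365.25 days).

```r
# Hypothetical helper, not part of CALIBERdatamanage: resolve the search
# window in days, letting limit_days take precedence over limit_years.
window_in_days <- function(limit_years = c(-Inf, 0), limit_days = NULL) {
    if (!is.null(limit_days)) {
        stopifnot(length(limit_days) == 2)
        return(limit_days)
    }
    stopifnot(length(limit_years) == 2)
    # A year is taken to be 365.25 days, as documented for limit_days
    limit_years * 365.25
}

# Default window: any time up to and including the index date
window_in_days()

# The year before the index date, expressed in days
window_in_days(limit_years = c(-1, 0))

# Both supplied: limit_days is used and limit_years is ignored
window_in_days(limit_years = c(-1, 0), limit_days = c(-30, 0))
```

How the package handles fractional days internally is not documented here, so the sketch leaves them unrounded.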

Details

This is a convenience function which calls addToCohort. If binary = TRUE, the output is then converted to a TRUE / FALSE variable, where lack of an entry is converted to FALSE instead of a missing value.

The function selects events with the relevant categories and entity types. The subset of relevant events is then used in a call to addToCohort, to select a category if it occurs within the defined time window relative to each patient's index date. The final variable is added to the cohort data.table.
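
The selection logic described above can be approximated in plain data.table code. The following is a minimal sketch under stated assumptions, not the package's implementation: it does not call addToCohort, the object names (cohort_dt, events, relevant, chosen, result) are made up for illustration, and the time window is assumed to be inclusive at both ends.

```r
library(data.table)

# Toy data mirroring the Examples section
cohort_dt <- data.table(anonpatid = 1:3,
    indexdate = as.IDate(c("2012-01-03", "2012-01-02", "2010-01-09")))
events <- data.table(anonpatid = c(2L, 2L, 3L, 3L, 4L, 4L, 4L),
    eventdate = as.IDate(c("2000-01-01", "2012-01-03", "2011-01-01",
    "2011-01-01", "2012-01-05", "2013-01-01", "2011-01-01")),
    category = c(1, 2, 1, 3, 2, 3, 4))

categories <- 1:2          # highest priority first
limit_days <- c(-Inf, 0)   # any time up to and including the index date

# 1. Keep events in the categories of interest and attach index dates
#    (the inner join drops patients who are not in the cohort)
relevant <- merge(events[category %in% categories], cohort_dt,
    by = "anonpatid")

# 2. Keep events inside the time window relative to the index date
relevant[, offset := as.integer(eventdate) - as.integer(indexdate)]
relevant <- relevant[offset >= limit_days[1] & offset <= limit_days[2]]

# 3. For each patient, pick the category ranked highest in `categories`
relevant[, priority := match(category, categories)]
chosen <- relevant[order(priority), .(newvar = category[1]),
    by = anonpatid]

# 4. Left join onto the cohort; patients without a match get NA
result <- merge(cohort_dt, chosen, by = "anonpatid", all.x = TRUE)

# Binary variant: TRUE if any qualifying record exists, otherwise FALSE
result[, newvar_binary := !is.na(newvar)]
print(result)
```

In this sketch only patient 2 has a qualifying record (category 1, dated before the index date), so the other two patients get a missing category and FALSE in the binary column.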

Value

Cohort with an extra column. If cohort is a data.table, it is also modified by reference.
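
Modification by reference is standard data.table behaviour, and it explains why a data.table cohort does not need the result assigned back. A small sketch, using a hypothetical add_flag function rather than anything from the package:

```r
library(data.table)

# Hypothetical function that adds a column by reference using `:=`
add_flag <- function(dt) {
    dt[, flag := TRUE]
    invisible(dt)
}

dt <- data.table(anonpatid = 1:3)
add_flag(dt)       # no assignment needed
names(dt)          # now includes "flag"

# To keep the original untouched, work on an explicit copy
dt2 <- data.table(anonpatid = 1:3)
dt2_copy <- copy(dt2)
add_flag(dt2_copy)
names(dt2)         # unchanged
```

An ffdf cohort has no such reference semantics, so the returned object has to be assigned back with <-.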

See Also

addToCohort, addCodelistToCohort

Examples

# Load the package that provides cohort() and addCategoryToCohort()
library(CALIBERdatamanage)

COHORT <- cohort(data.table(anonpatid = 1:3,
    indexdate = as.IDate(c("2012-1-3", "2012-1-2", "2010-1-9"))))
print(COHORT)

# New data
newdata <- data.table(anonpatid = c(2, 2, 3, 3, 4, 4, 4), 
    medcode = c(1, 2, 2, 3, 1, 2, NA), eventdate = as.IDate(c("2000-1-1", 
    "2012-1-3", "2011-1-1", "2011-1-1", "2012-1-5", "2013-1-1", 
    "2011-1-1")), category = c(1, 2, 1, 3, 2, 3, 4))

# Using data.table, categories 1 or 2 (1 takes priority)
addCategoryToCohort(COHORT, varname = "newvar", data = newdata,
    categories = 1:2)
print(COHORT)
removeColumns(COHORT, 'newvar')

# Using ffdf
newffdf <- as.ffdf(newdata)
FFDFCOHORT <- as.ffdf(COHORT)

# Category 1 only
addCategoryToCohort(COHORT, varname = "V_1", data = newffdf,
    categories = 1)

# Category 2 or 1
addCategoryToCohort(COHORT, varname = "V_12", data = newffdf,
    categories = 2:1)

# Binary
addCategoryToCohort(COHORT, varname = "V_binary", data = newffdf,
    categories = 1:2, binary = TRUE)

# Category 2 or 1, no time limits
addCategoryToCohort(COHORT, varname = "V_12anytime", data = newffdf,
    categories = 2:1, limit_years = c(-Inf, Inf))
print(COHORT)

# Using FFDF cohort; need to reassign the result to the 
# cohort object using <-
# Category 1 only
FFDFCOHORT <- addCategoryToCohort(FFDFCOHORT, varname = "V_1",
    data = newffdf, categories = 1)

# Category 2 or 1
FFDFCOHORT <- addCategoryToCohort(FFDFCOHORT, varname = "V_12",
    data = newffdf, categories = 2:1)
print(FFDFCOHORT)

CALIBERdatamanage documentation built on May 31, 2017, 3:41 a.m.