selectCases: Data integration for case selection

View source: R/selectCases.R

selectCases R Documentation

Data integration for case selection

Description

This query function selects the cases that match user-defined conditions so that they can be used in subsequent analyses.

Usage

selectCases(
  dxDataFile,
  idColName,
  icdColName,
  dateColName,
  icdVerColName = NULL,
  icd10usingDate = NULL,
  groupDataType = CCS,
  customGroupingTable,
  isDescription = TRUE,
  caseCondition,
  caseCount = 2,
  periodRange = c(30, 365),
  caseName = "Selected"
)

Arguments

dxDataFile

A data frame of clinical diagnostic data with at least 3 columns: ID, ICD, and Date. Values in the date column should be formatted as YYYY/MM/DD or YYYY-MM-DD.

idColName

Name of the ID column in dxDataFile. Supply the column name without quotation marks.

icdColName

Name of the ICD column in dxDataFile. Supply the column name without quotation marks.

dateColName

Name of the date column in dxDataFile. The column should either be of an R date class or be a string holding the date as YYYY/MM/DD or YYYY-MM-DD. Supply the column name without quotation marks.
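As a small illustration of the two accepted date layouts, the base R sketch below converts both to the Date class before the data is handed to a function such as selectCases. The vectors are made-up sample values. This only shows how base R reads the two layouts and is not a description of how selectCases parses dates internally.

```r
# Made-up sample dates in the two layouts named above.
slashDates <- as.Date(c("2015/10/01", "2016/01/15"))
dashDates  <- as.Date(c("2015-10-01", "2016-01-15"))

# Both layouts are read by as.Date() without an explicit format string,
# provided that a single vector does not mix the two layouts.
sameDates <- identical(slashDates, dashDates)
print(sameDates)
```

Converting the column up front also makes malformed entries visible early, since as.Date() fails or returns NA for strings it cannot read.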

icdVerColName

(Optional) Name of the column in dxDataFile that records the ICD version (ICD-9 or ICD-10), if such a column exists. Values in this column should be the integers 9L or 10L, indicating which ICD version is used in each row. See examples below to get more information.

icd10usingDate

The date from which ICD-10 codes were used in the dxDataFile dataset. The date format should be YYYY/MM/DD or YYYY-MM-DD. Required if icdVerColName is NULL.
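The relationship between icd10usingDate and a per-row version column can be sketched in base R as follows. The data frame, the column name Version, and the rule that rows dated on or after the cutoff count as ICD-10 are assumptions made for illustration. The package may treat the boundary date differently.

```r
# Hypothetical diagnosis records (not the package's sampleDxFile).
dx <- data.frame(
  ID   = c("A", "A", "B"),
  ICD  = c("5849", "N179", "4019"),
  Date = as.Date(c("2014-03-01", "2016-05-10", "2015-09-30")),
  stringsAsFactors = FALSE
)

# Assumed cutoff: rows dated on or after this day are treated as ICD-10.
cutoff <- as.Date("2015/10/01")

# Integer version flag per row, in the 9L / 10L form that the
# icdVerColName argument expects.
dx$Version <- ifelse(dx$Date >= cutoff, 10L, 9L)
print(dx)
```

With a column like this in place, icdVerColName could be supplied instead of icd10usingDate, which is useful when the switch to ICD-10 did not happen on a single date.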

groupDataType

The grouping method to use. The following methods can be chosen: CCS (ccs), CCSR (ccsr), multiple-level CCS (ccslvl1, ccslvl2, ccslvl3, ccslvl4), PheWAS (PheWAS), comorbidities (ahrq, charlson, elix), and precise or fuzzy customized methods (customIcdGroup, customGrepIcdGroup). Supply one of the values above without quotation marks. The default is CCS (ccs). To select cases by ungrouped ICD codes, use the method ICD (ICD).

customGroupingTable

User-defined grouping categories. icdDxToCustom needs a dataset with two columns named "Group" and "ICD": the "Group" column defines one or more disease categories, and the "ICD" column lists the ICD codes that belong to each category. icdDxToCustomGrep needs a dataset with two columns named "Group" and "grepIcd": "Group" defines one or more disease categories, and "grepIcd" holds character strings containing regular expressions that match the disease-related ICD codes.
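The two table layouts described above might be built as in the base R sketch below. The category name and ICD codes are illustrative examples, and the grepl() call only demonstrates how a regular expression in "grepIcd" can match codes. It is not a claim about how the package applies the pattern internally.

```r
# Precise table: one row per (category, exact ICD code) pair.
preciseTable <- data.frame(
  Group = c("Chronic kidney disease", "Chronic kidney disease"),
  ICD   = c("N181", "5853"),
  stringsAsFactors = FALSE
)

# Fuzzy table: one regular expression per category.
fuzzyTable <- data.frame(
  Group   = "Chronic kidney disease",
  grepIcd = "^585|^N18",
  stringsAsFactors = FALSE
)

# Check which illustrative codes the pattern would match.
codes   <- c("5853", "N181", "4019")
matched <- grepl(fuzzyTable$grepIcd[1], codes)
print(data.frame(code = codes, matched = matched))
```

The fuzzy form keeps the table short when a category covers a whole code prefix, while the precise form avoids accidental matches on unrelated codes.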

isDescription

Logical. If TRUE, the category description of the classification method is used in the group column. If FALSE, the category name is used. The default is TRUE (standard category description).

caseCondition

The disease or diseases to be selected. The condition can be a specific ICD code, a CCS category description, etc. A string containing a regular expression is also supported.
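To show the difference between a literal condition and a regular expression, the base R sketch below matches a few illustrative category labels with grepl(). The labels are sample strings chosen for the example, and using grepl() here is an assumption about how such a condition behaves, not a statement about the package's implementation.

```r
# Illustrative category labels (sample strings only).
groupLabels <- c(
  "Diseases of the urinary system",
  "Diseases of the heart",
  "Hypertension"
)

# A literal description selects a single category.
exactHit <- grepl("Diseases of the urinary system", groupLabels)

# Alternation selects several categories at once.
regexHit <- grepl("urinary|heart", groupLabels)

# An anchored pattern matches only at the start of the label.
anchoredHit <- grepl("^Hyper", groupLabels)

print(data.frame(groupLabels, exactHit, regexHit, anchoredHit))
```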

caseCount

Minimum number of diagnoses required for a patient to be selected. If caseCount = 2, only patients who were diagnosed at least twice are selected. The default is 2, as shown in Usage.

caseName

Value used to mark the selected cases. It is written to the labeling column named selectedCase. The default is "Selected".

periodRange

The duration of interest for case selection, given as a vector holding the lower and upper bounds in days. The default is 30 to 365 days (c(30, 365)).
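One plausible reading of how caseCount and periodRange combine is sketched below in base R. The rule used here is an assumption for illustration and may differ from the package's actual logic: a patient qualifies when they have at least caseCount matching diagnoses and the number of days between the first and last matching diagnosis lies within periodRange, bounds included. The data frame holds made-up records.

```r
# Made-up diagnoses that already match the case condition.
matchedDx <- data.frame(
  ID   = c("A", "A", "B", "B", "C"),
  Date = as.Date(c("2015-01-01", "2015-03-15",
                   "2015-01-01", "2015-01-10",
                   "2015-02-01")),
  stringsAsFactors = FALSE
)

caseCount   <- 2
periodRange <- c(30, 365)

# Summarise each patient: number of diagnoses and days from first to last.
perPatient <- do.call(rbind, lapply(split(matchedDx, matchedDx$ID), function(d) {
  data.frame(
    ID    = d$ID[1],
    count = nrow(d),
    span  = as.numeric(max(d$Date) - min(d$Date)),
    stringsAsFactors = FALSE
  )
}))

# Assumed rule: enough diagnoses AND a span inside the period range.
perPatient$qualifies <- perPatient$count >= caseCount &
  perPatient$span >= periodRange[1] &
  perPatient$span <= periodRange[2]

print(perPatient)
```

Under this reading, patient A (two diagnoses 73 days apart) qualifies, patient B (two diagnoses 9 days apart) does not, and patient C (a single diagnosis) does not.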

Details

Users can select cases by diagnostic category, such as CCS category or ICD code. The function also provides options to set the minimum number of diagnoses within a specific duration. The output dataset can be passed to 'groupedDataLongToWide' to create wide-format tables for statistical analysis.

Value

A new data.table, based on the standard classification dataset, with an additional column, selectedCase, in which each cell is labeled as selected or not. If a patient was diagnosed with the specified disease but does not satisfy the selection condition, the selectedCase cell is labeled with a star (*).

See Also

Other data integration functions: splitDataByDate, getEligiblePeriod, getConditionEra

Examples

# sample file for example

head(sampleDxFile)

# select cases with "Diseases of the urinary system" by level 2 of CCS classification

selectCases(dxDataFile = sampleDxFile,
            ID, ICD, Date,
            icdVerColName = NULL,
            groupDataType = ccslvl2,
            icd10usingDate = "2015/10/01",
            caseCondition = "Diseases of the urinary system",
            caseCount = 1)

DHLab-CGU/emr documentation built on Sept. 2, 2023, 9:16 p.m.