
This package is an R/Shiny application for exploring cohorts. Cohorts can be loaded from local files or extracted from clinical data warehouses (CDW) such as i2b2 or Dr.Warehouse. The application interactively performs multiple phenome-wide scans on different types of data: diagnosis codes (ICD10), biological test results (numerical), and UMLS concepts (possibly extracted from free-text clinical notes). When connected to a CDW, it starts from a cohort of 'cases' and runs the analyses against either a manually selected 'controls' cohort or a control cohort drawn at random from the CDW (matching a predefined number of controls to each case).

The tool can be tested with preloaded data on this website: https://aneuraz.shinyapps.io/phewas_app_demo/

## Setup

### Installation

The package is not yet on CRAN, so install it from GitHub:

```r
devtools::install_github('aneuraz/multiWAS')
```

### For CDW connection

#### Prerequisite: DWHtools2

We assume here that you have already configured the tools needed to connect to the database. The package can handle Oracle and Postgres back-ends.

```r
# Install devtools first if it is not already available
if (!require(devtools)) install.packages('devtools')
devtools::install_github('aneuraz/DWHtools2')
```

#### Config file

To use the package in CDW mode, first create a text file containing the connection settings for the database. An example for Oracle:

```
driverClass="oracle.jdbc.OracleDriver"
classPath="<INSTANTCLIENT_DIRECTORY>/ojdbc6.jar"
connectPath="jdbc:oracle:thin:@<URL>:<PORT>/<DBNAME>"
dbuser="<USER>"
dbpass="<PWD>"
username="DWHUSER"
backend="drwh_oracle"
```
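
Before launching the application, it can help to confirm that the config file defines every key shown in the Oracle example. The sketch below is not part of multiWAS or DWHtools2: `required_keys`, `read_config_keys()` and `missing_config_keys()` are hypothetical names, and treating all seven keys as mandatory is an assumption based only on that example.

```r
# Hypothetical pre-flight check, not part of multiWAS or DWHtools2.
# Assumption: every key from the Oracle example is required.
required_keys <- c("driverClass", "classPath", "connectPath",
                   "dbuser", "dbpass", "username", "backend")

read_config_keys <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  # Keep only non-empty lines that look like key="value"
  lines <- lines[nzchar(lines) & grepl("=", lines, fixed = TRUE)]
  # The key is everything before the first "="
  trimws(sub("=.*$", "", lines))
}

missing_config_keys <- function(path) {
  setdiff(required_keys, read_config_keys(path))
}

# Demonstration on a temporary file holding placeholder values only
demo_path <- tempfile(fileext = ".txt")
writeLines(c('driverClass="oracle.jdbc.OracleDriver"',
             'dbuser="<USER>"'), demo_path)
missing_keys <- missing_config_keys(demo_path)
print(missing_keys)
```

An empty result means all seven keys are present; it says nothing about whether the values themselves are valid.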

## Data format for local mode (loading data from CSV files):

Data must be in CSV files (comma-separated). You need one file per modality (e.g. UMLS concepts, ICD codes, lab test results) plus one file for patient information.

### Patient data
Columns for patient data:

- PATIENT_NUM: unique identifier for the patient
- BIRTH_YEAR: year of birth
- SEX: "M" for male, "F" for female
- group: "cases" or "control"
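
As an illustration of the layout above, the following sketch writes a small patient file with base R and reads it back. The patient values and the file name `patients.csv` are made up for the example; only the column names come from the list above.

```r
# Made-up example patients; column names follow the list above
patients <- data.frame(
  PATIENT_NUM = c(1, 2, 3, 4),
  BIRTH_YEAR  = c(1980, 1975, 1992, 1968),
  SEX         = c("M", "F", "F", "M"),
  group       = c("cases", "cases", "control", "control"),
  stringsAsFactors = FALSE
)

# Write a comma-separated file without row names
patients_path <- file.path(tempdir(), "patients.csv")
write.csv(patients, patients_path, row.names = FALSE)

# Read it back to confirm the header and values survived the round trip.
# Reading SEX as character avoids R interpreting a column that contains
# only "F" as logical FALSE.
patients_check <- read.csv(patients_path,
                           colClasses = c(SEX = "character"),
                           stringsAsFactors = FALSE)
print(names(patients_check))
```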

### UMLS concepts data:

Columns for UMLS data:

- PATIENT_NUM: unique identifier for the patient
- CONCEPT_CERTITUDE: polarity of the concept: 1 if the concept is affirmed, -1 if it is negated
- CODE: CUI of the concept
- CODE_LABEL: label of the concept
- PARENT_LABEL: label of a parent of the concept in the UMLS (if multiple parents exist in the UMLS, please choose one)
- group: "cases" or "control"
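
A quick check of a UMLS table against the column list above might look like the sketch below. `validate_umls()` is a hypothetical helper, not a multiWAS function, and the rows (including the parent labels) are illustrative values rather than data checked against the UMLS.

```r
# Illustrative rows only; the parent labels are placeholders
umls <- data.frame(
  PATIENT_NUM       = c(1, 1, 3),
  CONCEPT_CERTITUDE = c(1, -1, 1),
  CODE              = c("C0011849", "C0020538", "C0011849"),
  CODE_LABEL        = c("Diabetes Mellitus", "Hypertensive disease",
                        "Diabetes Mellitus"),
  PARENT_LABEL      = c("<PARENT_A>", "<PARENT_B>", "<PARENT_A>"),
  group             = c("cases", "cases", "control"),
  stringsAsFactors  = FALSE
)

# Hypothetical helper: returns a character vector of problems (empty if none)
validate_umls <- function(df) {
  required <- c("PATIENT_NUM", "CONCEPT_CERTITUDE", "CODE",
                "CODE_LABEL", "PARENT_LABEL", "group")
  problems <- character(0)
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    problems <- c(problems, paste("missing column:", missing_cols))
  }
  if ("CONCEPT_CERTITUDE" %in% names(df) &&
      !all(df$CONCEPT_CERTITUDE %in% c(1, -1))) {
    problems <- c(problems, "CONCEPT_CERTITUDE must be 1 or -1")
  }
  if ("group" %in% names(df) &&
      !all(df$group %in% c("cases", "control"))) {
    problems <- c(problems, 'group must be "cases" or "control"')
  }
  problems
}

print(validate_umls(umls))
```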

### ICD codes: 

- PATIENT_NUM: unique identifier for the patient
- CODE: ICD code
- CODE_LABEL: ICD code label
- PARENT_LABEL: label of the parent of the code
- PARENT_CODE: code of the parent
- group: "cases" or "control"
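
If your source data has no parent columns, one possible way to fill them is sketched below. It assumes that the parent of an ICD-10 code is its three-character category, which is a common convention but is not stated by this README, so adapt it to the hierarchy you actually use. The lookup table `category_labels` is hypothetical, and the labels are illustrative.

```r
# Illustrative ICD rows without parent information
icd <- data.frame(
  PATIENT_NUM = c(1, 2),
  CODE        = c("E11.9", "J45.9"),
  CODE_LABEL  = c("Type 2 diabetes mellitus without complications",
                  "Asthma, unspecified"),
  group       = c("cases", "control"),
  stringsAsFactors = FALSE
)

# Assumption: the parent is the three-character ICD-10 category
icd$PARENT_CODE <- substr(icd$CODE, 1, 3)

# Hypothetical lookup table from category code to category label
category_labels <- c(E11 = "Type 2 diabetes mellitus", J45 = "Asthma")
icd$PARENT_LABEL <- unname(category_labels[icd$PARENT_CODE])

# Reorder the columns to match the list above
icd <- icd[, c("PATIENT_NUM", "CODE", "CODE_LABEL",
               "PARENT_LABEL", "PARENT_CODE", "group")]
print(icd)
```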

### Biological test results:

- PATIENT_NUM: unique identifier for the patient
- CODE: code of the lab test
- ENCOUNTER_NUM: unique identifier for the encounter (visit) of the patient
- CODE_LABEL: label of the lab test
- PARENT_LABEL: label of the parent (a category of tests for example)
- INF: 1/0. 1 if the result was below the normal range
- SUP: 1/0. 1 if the result was above the normal range
- group: "cases" or "control"
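
The INF and SUP flags can be derived from numeric results when a normal range is known for each test. The sketch below assumes hypothetical `value`, `lower` and `upper` columns in your source data; the test code, results and bounds are made up and are not clinical reference values.

```r
# Made-up lab results with a hypothetical normal range per row
labs <- data.frame(
  PATIENT_NUM   = c(1, 1, 2),
  CODE          = c("GLU", "GLU", "GLU"),
  ENCOUNTER_NUM = c(10, 11, 20),
  CODE_LABEL    = "Glucose",
  PARENT_LABEL  = "Biochemistry",
  value         = c(3.1, 5.0, 9.4),
  lower         = 3.9,
  upper         = 6.1,
  group         = c("cases", "cases", "control"),
  stringsAsFactors = FALSE
)

# INF is 1 when the result is below the range, SUP is 1 when it is above
labs$INF <- as.integer(labs$value < labs$lower)
labs$SUP <- as.integer(labs$value > labs$upper)

# Keep only the columns listed above, in the same order
labs_out <- labs[, c("PATIENT_NUM", "CODE", "ENCOUNTER_NUM", "CODE_LABEL",
                     "PARENT_LABEL", "INF", "SUP", "group")]
print(labs_out)
```

A result exactly on a bound is treated as within range here; change the comparisons if your laboratory defines the range differently.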

## Usage

### Launch the application

#### Local mode (loading data from csv files): 

To launch the application, just run the following command: 

```r
library(multiWAS)
call_phewas(.local = TRUE)
```

#### CDW mode (extracting data from a CDW):

To launch the application, just run the following command:

```r
library(multiWAS)
call_phewas(.config_file = '<PATH_TO_CONFIG_FILE>')
```

### Run an analysis

#### Select the patients

#### Analysis parameters

A few parameters are available.

### Explore the results

#### Interactive Manhattan plot

#### Interactive dumbbell plot

#### Interactive table

### Export the results

You can export the report of the results in HTML using the Report (.html) button.

You can also export the results data in RData format using the Result data (.RData) button.



aneuraz/multiWAS documentation built on May 14, 2019, 2:37 p.m.