knitr::opts_chunk$set(collapse = T, comment = "" ) options(tibble.print_min = 4L, tibble.print_max = 4L)
GDSC raw data is available as a download from http://www.cancerrxgene.org/downloads. The dose response curve fitting uses the non-linear mixed effects model described in Vis, D.J. et al. Pharmacogenomics 2016, 17(7):691-700 (https://www.ncbi.nlm.nih.gov/pubmed/27180993). To fit the data with the model use the R package gdscIC50 (https://github.com/CancerRxGene/gdscIC50).
GDSC drug screening uses 1536 well plates and 384 well plates.
Each plate is plated with a single cell line.
The plate layout, i.e., what happens in a given well position - is it a control, is it a drug treatment? - is called a drug set and is given a unique DRUGSET_ID
.
The screening plate tags within the drugset describe the treatment for a particular well. They allow wells to be grouped together by treatment type, e.g., drugged wells, control wells.
The tags are used as part of the lab procedure to instruct the liquid handling robotics on how to array drug treatments in a particular plate (the drugset); and as part of downstream analysis (QC, dose-response fitting, etc.)
There can be more than one tag per plate position. Thus in raw data files (csv) there may be more than one row per plate well position, e.g., L12-D1-S + DMSO
.
Single drug treatments are referred to as library drugs - L1
,L2
etc. Each library is used in a titration to elicit the dose response characteristics for that compound with a particular cell line.
Some GDSC data includes combination drug treatments. These are usually a titration of a library compound in combination with a fixed dose anchor compound - L1 + A1
etc. Combination treatment data are not currently available from www.cancerrxgene.org
For more details on the tags see below.
GDSC raw data is distributed as a csv file which can then be loaded as a data frame. The gdsc_example
dataset contains the minimum columns for the GDSC raw data to work wth the gdscIC50 package. Other GDSC data sets may contain additional columns. Not all well positions per plate are represented in public data sets because some drug treatments are part of private collaborations.
library(gdscIC50) data("gdsc_example") gdsc_example[99:100,]
Each row in the raw data represents a single well of a plate. However, there may be more than one row per well if there is more than one tag for that position in the drug set, e.g. this will happen if a well receives a combnination of treatments.
raw_data_description <- data.frame( Column_name = names(gdsc_example), Description = c( "Project name for the dataset", "Unique barcode for screening assay plate", "Unique id for the scan of the plate by the plate reader - fluorescence measurement data. A plate might be scanned more than once but only one `SCAN_ID` will pass internal QC. Therefore there is a one to one correspondence between `BARCODE` and `SCAN_ID` in the published data.", "Date that the plate was seeded with cell line.", "Date the experiment finished and measurement was taken (scanning).", "Unique GDSC identifier for the cell line expansion seeded on the plate. Each time a cell line is expanded from frozen stocks it is assigned a new `CELL_ID`.", "Unique GDSC identifier for the cell line seeded on the plate. A particular cell line will have a single `MASTER_CELL_ID` but can have multiple `CELL_ID`.", "Identifier of the cell line in the COSMIC database if available. There is a one to one correspondence between `MASTER_CELL_ID` and `COSMIC_ID`.", "Name of the plated cell line. Again this will have a one to one correspondence with `MASTER_CELL_ID`.", "Number of cells seeded per well of screening plate. This number is the same for all wells on a plate.", "The set of drugs used to treat the plate and the associated plate layout.", "End point assay type used to assess cell viability, e.g., `Glo` is *Promega CellTiter-Glo*.", "Duration of the assay in days from cell line drug treatment to end point measurement.", "Plate well position numbered row-wise. 1536 well plates have 48 columns and 384 well plates have 24.", "Label to identify well treatment - see description below. It is possible to have more than one tag per well `POSITION` such that in the raw data files (csv) there may be more than one row per plate well position, e.g., `L12-D1-S + DMSO`.", "Unique identifier for the drug used for treatment. In the absence of a drug treatment, e.g., a negative control this field will be `NA`.", "Micromolar concentration of the drug id used for treatment. As with `DRUG_ID` this field can be `NA`.", "Fluorescence measurement at the end of the assay. The fluorescence is a result of `ASSAY` and is an indicator of cell viability.") )
knitr::kable(raw_data_description, align = c('l','l'))
Drug treated wells will have entries in the DRUG_ID
and CONC
fields.
Experimental drug treatments are referred to as library drugs. Positive controls and reference compounds are also drug treatments.
In combination drug treatments there will be an anchor drug in addtion to a library drug.
-S
indicates a single treatment, -C
indicates a combination treatment.
Each library or anchor drug number (Lx
, Ax
) will correspond to a particular drug (DRUG_ID
). However, it is possible that the same DRUG_ID
will have been assigned to different library or anchor numbers, e.g., to distinguish replicate treatments.
The relative concentrations of library drug treatments are indicated by the dose (...-Dx-...
) such that dose D1
is the maximum concentration and all subsequent doses are dilutions thereof.
Examples of the tags currently in use is given below.
drug_treated_tags <- data.frame( `TAG` = c("L1-D1-S", "L2-D5-S", "A1-C", "A1-S", "R1-D1-S"), Description = c( "Library drug 1 at dose 1 (maximum concentration) as single agent treatment", "Library drug 2 alone (combination treatment) at dose 5 (the minimum in a 5 point titration)", "Anchor drug 1 in a combination", "Anchor drug 1 alone", "Reference compound used for comparison between screens") )
pander::pander(drug_treated_tags, justify = 'll')
control_tags <- data.frame( `TAG` = c("NC-0", "NC-1", "PC-1", "PC1-D1-S", "UN-USED", "B", "DMSO", "SC"), Description = c( "Negative control (no treatment)", "Negative control (treatment with DMSO)", "Positive control. No titration of this positive control in the drug set", "Positive control as part of a titration.", "Excluded from analysis (no cells). Usually wells at the plate edge.", "Blank (no drug, no cells, just media)", "Usually used with a drug treatment tag at the same position to indicate back-filling to a required volume.", "Cell seeding control with DMSO. A multiple of the cell seeding density used for the rest of the plate.") )
pander::pander(control_tags, justify = 'll')
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.