knitr::opts_chunk$set(collapse = TRUE,
                      comment = "")
options(tibble.print_min = 4L, tibble.print_max = 4L)

GDSC raw data is available as a download from http://www.cancerrxgene.org/downloads. The dose-response curve fitting uses the non-linear mixed effects model described in Vis, D.J. et al. Pharmacogenomics 2016, 17(7):691-700 (https://www.ncbi.nlm.nih.gov/pubmed/27180993). To fit the data with the model, use the R package gdscIC50 (https://github.com/CancerRxGene/gdscIC50).

Experimental design

For more details on the tags, see the section on the TAG column below.

GDSC raw data format

GDSC raw data is distributed as a CSV file, which can be loaded as a data frame. The gdsc_example dataset contains the minimum set of columns that GDSC raw data needs in order to work with the gdscIC50 package. Other GDSC data sets may contain additional columns. Not all well positions per plate are represented in public data sets, because some drug treatments are part of private collaborations.

library(gdscIC50)
data("gdsc_example")
gdsc_example[99:100,]
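When starting from a downloaded file rather than the packaged example, the loading step might look like the sketch below. It writes a tiny CSV to a temporary file and reads it back with base R's `read.csv`. The values are invented for illustration, only a few columns are included, and the column names `CONC` and `INTENSITY` are assumptions; a real GDSC file would be read the same way from its actual path.

```r
# Hypothetical miniature of a GDSC raw data file (all values are made up).
toy_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "BARCODE,POSITION,TAG,DRUG_ID,CONC,INTENSITY",
  "1001,1,B,,,120",
  "1001,2,NC-0,,,50000",
  "1001,3,L1-D1-S,1510,10,8000"
), toy_csv)

# Keep TAG as character. Blank fields in numeric columns are read as NA,
# which matches the NA convention for untreated wells.
toy_raw <- read.csv(toy_csv, stringsAsFactors = FALSE)
str(toy_raw)
```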

GDSC raw data definitions

Each row in the raw data represents a single well of a plate. However, there may be more than one row per well if there is more than one tag for that position in the drug set; for example, this happens when a well receives a combination of treatments.
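One way to see which wells carry more than one tag is to count rows per plate and well position. The following is a minimal sketch on a made-up data frame, not part of gdscIC50; it assumes only the `BARCODE`, `POSITION` and `TAG` columns.

```r
# Toy data: position 5 carries two tags (a drug plus a DMSO back-fill).
toy_raw <- data.frame(
  BARCODE  = c(1001, 1001, 1001, 1001),
  POSITION = c(4, 5, 5, 6),
  TAG      = c("NC-1", "L12-D1-S", "DMSO", "L12-D2-S"),
  stringsAsFactors = FALSE
)

# Count rows per (BARCODE, POSITION) and keep wells with more than one row.
rows_per_well <- aggregate(TAG ~ BARCODE + POSITION, data = toy_raw,
                           FUN = length)
names(rows_per_well)[3] <- "N_ROWS"
multi_tag_wells <- rows_per_well[rows_per_well$N_ROWS > 1, ]
multi_tag_wells
```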

raw_data_description <- data.frame(
  Column_name = names(gdsc_example),
  Description = c(
    "Project name for the dataset.",
    "Unique barcode for the screening assay plate.",
    "Unique id for the scan of the plate by the plate reader - fluorescence measurement data. A plate might be scanned more than once but only one `SCAN_ID` will pass internal QC. Therefore there is a one to one correspondence between `BARCODE` and `SCAN_ID` in the published data.",
    "Date that the plate was seeded with cell line.",
    "Date the experiment finished and measurement was taken (scanning).",
    "Unique GDSC identifier for the cell line expansion seeded on the plate. Each time a cell line is expanded from frozen stocks it is assigned a new `CELL_ID`.",
    "Unique GDSC identifier for the cell line seeded on the plate. A particular cell line will have a single `MASTER_CELL_ID` but can have multiple `CELL_ID`.",
    "Identifier of the cell line in the COSMIC database if available. There is a one to one correspondence between `MASTER_CELL_ID` and `COSMIC_ID`.",
    "Name of the plated cell line. Again this will have a one to one correspondence with `MASTER_CELL_ID`.",
    "Number of cells seeded per well of screening plate. This number is the same for all wells on a plate.",
    "The set of drugs used to treat the plate and the associated plate layout.",
    "End point assay type used to assess cell viability, e.g., `Glo` is *Promega CellTiter-Glo*.",
    "Duration of the assay in days from cell line drug treatment to end point measurement.",
    "Plate well position numbered row-wise. 1536-well plates have 48 columns and 384-well plates have 24.",
    "Label to identify well treatment - see description below. It is possible to have more than one tag per well `POSITION` such that in the raw data files (csv) there may be more than one row per plate well position, e.g., `L12-D1-S + DMSO`.",
    "Unique identifier for the drug used for treatment. In the absence of a drug treatment, e.g., a negative control, this field will be `NA`.",
    "Micromolar concentration of the drug used for treatment. As with `DRUG_ID` this field can be `NA`.",
    "Fluorescence measurement at the end of the assay. The fluorescence is a result of `ASSAY` and is an indicator of cell viability.")
)
knitr::kable(raw_data_description, align = c('l','l'))
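Because `POSITION` is numbered row-wise, it can be converted to a plate row and column with integer division. The helper below is a hypothetical sketch, not a gdscIC50 function, and it assumes that numbering starts at 1 in the first well of the first row, which the description above does not state explicitly.

```r
# Convert a row-wise well POSITION into plate row and column indices.
# Assumption: POSITION 1 is the first well of the first row.
position_to_row_col <- function(position, n_cols = 24L) {
  idx <- as.integer(position) - 1L
  data.frame(
    POSITION = position,
    ROW = idx %/% n_cols + 1L,
    COL = idx %% n_cols + 1L
  )
}

position_to_row_col(c(1, 24, 25, 384))                   # 384-well plate
position_to_row_col(c(1, 48, 49, 1536), n_cols = 48L)    # 1536-well plate
```

With 24 columns, position 384 maps to row 16, column 24; with 48 columns, position 1536 maps to row 32, column 48.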

The TAG column

Examples of the tags currently in use are given below.

Drug treated wells:

drug_treated_tags <- data.frame(
  `TAG` = c("L1-D1-S", "L2-D5-S", "A1-C", "A1-S", "R1-D1-S"),
  Description = c(
    "Library drug 1 at dose 1 (maximum concentration) as single agent treatment",
    "Library drug 2 alone (combination treatment) at dose 5 (the minimum in a 5 point titration)",
    "Anchor drug 1 in a combination",
    "Anchor drug 1 alone",
    "Reference compound used for comparison between screens")
)
pander::pander(drug_treated_tags, justify = 'll')
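The drug treatment tags in the table follow a regular pattern: a role letter and drug number, an optional dose, and a final `S` or `C`. As an illustration only, the hypothetical helper below splits such tags into their parts. It is not part of gdscIC50, and it handles just the `L`, `A` and `R` forms shown above; any other tag gets `NA` for the role.

```r
# Split drug treatment tags such as "L1-D1-S" or "A1-C" into components.
parse_drug_tag <- function(tags) {
  parts <- strsplit(tags, "-", fixed = TRUE)
  first <- vapply(parts, function(p) p[1], character(1))
  last  <- vapply(parts, function(p) p[length(p)], character(1))
  # The dose is present only in three-part tags (e.g. "L1-D1-S").
  dose <- vapply(parts, function(p) {
    if (length(p) == 3) as.integer(sub("^D", "", p[2])) else NA_integer_
  }, integer(1))
  roles <- c(L = "library", A = "anchor", R = "reference")
  data.frame(
    TAG         = tags,
    ROLE        = unname(roles[substr(first, 1, 1)]),
    DRUG_NUMBER = as.integer(substring(first, 2)),
    DOSE        = dose,
    TREATMENT   = ifelse(last == "S", "single", "combination"),
    stringsAsFactors = FALSE
  )
}

parse_drug_tag(c("L1-D1-S", "L2-D5-S", "A1-C", "A1-S", "R1-D1-S"))
```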

Control wells:

control_tags <- data.frame(
  `TAG` = c("NC-0", "NC-1", "PC-1", "PC1-D1-S", "UN-USED", "B", "DMSO", "SC"),
  Description = c(
    "Negative control (no treatment)",
    "Negative control (treatment with DMSO)",
    "Positive control. No titration of this positive control in the drug set",
    "Positive control as part of a titration.",
    "Excluded from analysis (no cells). Usually wells at the plate edge.",
    "Blank (no drug, no cells, just media)",
    "Usually used with a drug treatment tag at the same position to indicate back-filling to a required volume.",
    "Cell seeding control with DMSO. A multiple of the cell seeding density used for the rest of the plate.")
)
pander::pander(control_tags, justify = 'll')
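When exploring a plate, it can help to group wells by the kind of tag they carry. The sketch below is a hypothetical classifier built only from the tags listed above; treating every unrecognised tag as a drug treatment is an assumption made for brevity, and real drug sets may use tags not covered here.

```r
# Map each TAG to a coarse well type using the tags documented above.
classify_tag <- function(tags) {
  out <- rep("drug treatment", length(tags))   # assumed default
  out[grepl("^NC-", tags)] <- "negative control"
  out[grepl("^PC", tags)]  <- "positive control"
  out[tags == "B"]         <- "blank"
  out[tags == "UN-USED"]   <- "unused"
  out[tags == "DMSO"]      <- "back-fill"
  out[tags == "SC"]        <- "seeding control"
  out
}

toy_tags <- c("NC-0", "L1-D1-S", "B", "DMSO", "PC1-D1-S", "UN-USED", "SC", "PC-1")
table(classify_tag(toy_tags))
```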


CancerRxGene/gdscIC50 documentation built on Oct. 6, 2022, 2:40 a.m.