BiocStyle::markdown()
Sys.setenv("CCLEDB_PATH"="~/BigData/CellLineData/CellLineData.db")
suppressPackageStartupMessages({
library(ccleR6)
})
A large indexed SQLite database has been created by Phil Chapman to represent CCLE, Achilles, and other integrative data sources relevant to cancer biology.
This document describes some approaches to user interface design. We indicate how to
For this code to work, the environment variable CCLEDB_PATH must be defined and give the path to the SQLite file, for example:
Sys.setenv("CCLEDB_PATH"="path/to/db")
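It may help to fail early when the variable is unset or points at a file that does not exist. The following is a minimal sketch in base R. The helper name check_ccledb_path is hypothetical and is not part of ccleR6, and the demonstration uses a separate, made-up variable name (CCLEDB_PATH_DEMO) so that a real setting is not overwritten.

```r
# Hypothetical helper (not part of ccleR6): validate the path variable before use.
check_ccledb_path <- function(var = "CCLEDB_PATH") {
  path <- Sys.getenv(var, unset = "")
  if (!nzchar(path)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  path <- path.expand(path)  # resolve a leading "~"
  if (!file.exists(path)) {
    stop("No file found at ", path, call. = FALSE)
  }
  path
}

# Demonstration: a temporary file stands in for the SQLite database.
tmp <- tempfile(fileext = ".db")
file.create(tmp)
Sys.setenv(CCLEDB_PATH_DEMO = tmp)
db_path <- check_ccledb_path("CCLEDB_PATH_DEMO")
```

Calling check_ccledb_path() with no argument would check CCLEDB_PATH itself.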
I believe that an object that carries more than the bare database view provided by dplyr will come in handy. I therefore defined an R6 reference class and lightly populated it with some identifier vectors.
ccle = ccledb$new(.ccleSrc)
ccle
This can be serialized, but the database connection needs to be refreshed on load.
The "guide vectors" are created as hints on vocabularies in use.
"BRAF" %in% ccle$cngenes # copy number gene list
We have defined a number of filter functions to simplify common subsetting operations.
ccle$src %>% filter_compound("Irinotecan")
ccle$src %>% filter_organ("breast")
ccle$src %>% filter_Histology_patt("%wing%")
microbenchmark?
I extended the reference class itself to contain a query set that can be applied across different data sets.
ccle$query_symbols <- c('PTEN', 'BRAF', 'NRAS', 'KRAS', 'SMARCA4', 'PIK3CA')
ccle$query_cell_lines <- (ccle$src %>% filter_organ("skin") %>% as.data.frame(n=-1))$CCLE_name
ccle$query_compounds <- c('PLX4720', 'AZD6244', 'Lapatinib')
Data can then be retrieved by passing the R6 object to a retrieval function:
affy_data <- get_affy(ccle)
head(affy_data)
hybcap_data <- get_hybcap(ccle)
head(hybcap_data)
resp_data <- get_response(ccle)
head(resp_data)
Importantly, since the data structure is standardised, we can now combine disparate data types into a common data frame:
combined_data <- bind_rows(affy_data,hybcap_data,resp_data)
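Row-binding of this kind only works cleanly when every retrieval function returns the same columns. The toy sketch below illustrates the idea in base R. The column names (CCLE_name, ID, Type, value), the cell line labels, and all values are invented for illustration and are assumptions, not the actual schema returned by the package.

```r
# Toy stand-ins for the retrieval outputs; columns and values are invented.
affy_toy <- data.frame(CCLE_name = c("LINE_A", "LINE_B"), ID = "BRAF",
                       Type = "affy", value = c(7.1, 8.3),
                       stringsAsFactors = FALSE)
hybcap_toy <- data.frame(CCLE_name = c("LINE_A", "LINE_B"), ID = "BRAF",
                         Type = "hybcap", value = c(1, 0),
                         stringsAsFactors = FALSE)
resp_toy <- data.frame(CCLE_name = c("LINE_A", "LINE_B"), ID = "AZD6244",
                       Type = "resp", value = c(0.4, 6.2),
                       stringsAsFactors = FALSE)

# Identical column names let the frames stack without any reshaping.
stopifnot(identical(names(affy_toy), names(hybcap_toy)),
          identical(names(affy_toy), names(resp_toy)))
combined_toy <- rbind(affy_toy, hybcap_toy, resp_toy)

# Count rows contributed by each data type.
table(combined_toy$Type)
```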
The get_data functions can be combined to give an output suitable for modeling using the make_df function:
my_df <- make_df(ccle)
my_df[1:5,1:10]
summary(lm(AZD6244_resp ~ as.factor(BRAF_hybcap), data=my_df))
summary(lm(AZD6244_resp ~ as.factor(BRAF_hybcap) + as.factor(NRAS_hybcap), data=my_df))
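The column names in the model formulas suggest a wide layout with one row per cell line and one column per feature. How make_df builds this is not shown here, so the following is only a sketch of one way to get from a long table to such a layout, using base R reshape(). The long-format column names and all values are invented for illustration.

```r
# Toy long-format data; all names and values are invented for illustration.
long_toy <- data.frame(
  CCLE_name = rep(c("LINE_A", "LINE_B", "LINE_C", "LINE_D"), times = 2),
  feature   = rep(c("BRAF_hybcap", "AZD6244_resp"), each = 4),
  value     = c(1, 0, 1, 0, 0.4, 6.2, 0.9, 5.1),
  stringsAsFactors = FALSE
)

# Pivot to one row per cell line and one column per feature.
wide_toy <- reshape(long_toy, idvar = "CCLE_name", timevar = "feature",
                    direction = "wide")
# reshape() prefixes the new columns with "value."; strip that prefix.
names(wide_toy) <- sub("^value\\.", "", names(wide_toy))

# The wide frame can be passed to lm() with a formula of the same shape.
fit_toy <- lm(AZD6244_resp ~ as.factor(BRAF_hybcap), data = wide_toy)
coef(fit_toy)
```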
Finally, a convenient heatmap plot can be generated using the make_heatmap function to visualise drug response vs genetic features.
make_heatmap(ccle, compound='AZD6244')
Two displays from the Barretina paper (Figures 4a [lower dotplot with sensitivity against line] and 4c) can be approached interactively using the multiWidget function.
Briefly, multiWidget(ccle) will generate a browser window with two panes, concatenated on one page. The first one allows selection of a gene for which hybrid capture mutation information has been obtained, and a compound. The IC50s for cell lines with hybrid capture data are ordered and displayed. The second panel allows selection of a gene whose expression will be summarized across primary tumor sites.
library(grid)
library(png)
im = readPNG("images/shinyDemo.png")
grid.raster(im)
The vocabularies used for cell lines and tumor anatomy are complicated and need harmonization and streamlining.
Full dose-response information is available and should be exposed.
Interfaces for filters and joins need to be specified and deployed for common use cases.
Additional interactivity, such as tooltips over points that describe mutation profiles, should be added.