library(CollateralVulnerability2016) library(dplyr)
work_dir <- '~/BigData/CollateralVulnerability2016/paad/' download_dir <- '~/BigData/RTCGA_downloads/' mydb <- 'paad_output.db' mydb_path <- paste0(work_dir, mydb) my_con <- setupSQLite ( mydb_path )
In this case we want all human genes using the most recent ENSEMBL annotation:
all_genes <- getAllHumanGenes(my_con) head(all_genes)
Gene data also written to SQLite database and can be viewed with dplyr:
src_sqlite(my_con@dbname) %>% tbl('human_genes')
Want to import the TCGA RNAseq and mutation data so that it is ready to use for further analyses
rnaseq_dat <- getTCGARNAseqData(my_con, cancerTypes='PAAD', releaseDate = '2015-11-01', sampletag=c('01A', '06A')) dim(rnaseq_dat) mutation_dat <- getTCGAMutationData(my_con, cancerTypes='PAAD', releaseDate = '2015-11-01', sampletag=c('01A', '06A')) dim(mutation_dat) DBI::dbListTables(my_con)
This is a wrapper around the BISEP function from the BISEP package. For testing can set gene_n to a lower number, setting a high number ensures that all genes are included in the analysis. Some filtering is done to exclude non-variant or non-expressed genes: either by variance or by number of genes with an NA. Can optionally log2 transform the data or even use a non-standard input table. The analysis is parallelised for speed but this has only been tested on Mac OSX and Linux. See ?doBISEPAnalysis
for more details.
bisep_output <- doBISEPAnalysis(my_con, genes_n=100000, table_name = "tcga_rnaseq_data", log2_transform=TRUE, gene_var_th=0.2, gene_naprop_th=0.3)
At this step we can optionally do an initial filtering of the bisep results use the plotBISEPOutput
function to visualise this. We then use these genes for the next steps. Here we are visualising based on the pi_value (proportion of samples in the low expressors group) and p-value for the bimodal distribution test - see BISEP documentation for more info. However, we take all genes forward for the later steps so that we have data for all genes for all analyses.
plotBISEPOutput(my_con, pi_value_th=0.2, bisep_pval_th = 0.1) bisep_results <- src_sqlite(my_con@dbname) %>% dplyr::tbl('bisep_results') %>% # dplyr::filter(bisep_pval < 0.1 & pi_value <0.2) %>% dplyr::collect() %>% dplyr::arrange(gene_name) bisep_results
doRNAseqPlot(my_con, 'ENSG00000176024')
human_paralog_res <- countHumanParalogs(my_con, bisep_results$gene_id)
``` {r eval = FALSE} flymine_res <- doFlyMineAnalysis(my_con, bisep_results$gene_id)
## Step 9: Do the WormMine analysis ``` {r eval = FALSE} wormmine_res <- doWormMineAnalysis(my_con, bisep_results$gene_id)
``` {r eval = FALSE} mut_res <- doMutationAnalysis(my_con, bisep_results$gene_id)
## Step 11: Combine the results ``` {r eval=FALSE} combo_res <- combineResults(my_con, bisep_results$gene_id)
``` {r eval=FALSE} filtered_results <- src_sqlite(my_con@dbname) %>% dplyr::tbl('combined_results') %>% dplyr::filter(bisep_pval <= 0.1, pi_value <= 0.2, count_paralogs > 0, count_paralogs <= 5, (lethal_pct_fly >= 20 | lethal_pct_worm >= 20) ) %>% dplyr::collect()
## Step 13: Show results in shiny app ``` {r eval=FALSE} shinyVisApp(my_con)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.